Creating a Column from a Specific Value Contained in a Row

Question

I have a data frame that is formatted like this:

details	col_1	col2	col3
ex1 2019 test	1	1	1
ex1 2020 review	2	2	2
example2 2021 survey	3	3	3
row3 2019 data	4	4	4

I want to append a new column called "Year" to this data frame that holds the year value found in each row's "details" string. I want it to look like this:

details	col_1	col2	col3	Year
ex1 2019 test	1	1	1	2019
ex1 2020 review	2	2	2	2020
example2 2021 survey	3	3	3	2021
row3 2019 data	4	4	4	2019

The "details" values are deliberately unstandardized to reflect my actual data. Thanks in advance for the help!

constantstranger · Accepted Answer · 2022-07-26 21:19:45Z

This will work:

df['Year'] = df.details.str.extract(r'\b(\d{4})\b').astype(int)

Output:

                details  col_1  col2  col3  Year
0         ex1 2019 test      1     1     1  2019
1       ex1 2020 review      2     2     2  2020
2  example2 2021 survey      3     3     3  2021
3        row3 2019 data      4     4     4  2019
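
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and uses the same regex. Two details are my own additions rather than part of the original one-liner: `expand=False` (so `str.extract` returns a Series rather than a one-column DataFrame) and the nullable `Int64` dtype (an assumption that some real rows may contain no four-digit year, in which case a plain `astype(int)` would raise).

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "details": ["ex1 2019 test", "ex1 2020 review",
                "example2 2021 survey", "row3 2019 data"],
    "col_1": [1, 2, 3, 4],
    "col2": [1, 2, 3, 4],
    "col3": [1, 2, 3, 4],
})

# \b(\d{4})\b captures the first standalone run of exactly four digits,
# so the "1" in "ex1" and the "2" in "example2" are ignored.
years = df["details"].str.extract(r"\b(\d{4})\b", expand=False)

# Convert to numbers, then to the nullable integer dtype so that rows
# without a match become <NA> instead of raising an error.
df["Year"] = pd.to_numeric(years).astype("Int64")

print(df)
```

If every row is guaranteed to contain a year, the plain `astype(int)` from the one-liner above is enough.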

edited Jul 26, 2022 at 21:19

answered Jul 26, 2022 at 21:13


2 Comments

constantstranger Over a year ago

@Pranav Hosangadi good idea - updated.

constantstranger Over a year ago

Do you need any more help with your question?

Ricardo · Accepted Answer · 2022-07-26 21:15:03Z

from dateutil.parser import parse
df['Year'] = df.apply(lambda row: parse(row.details, fuzzy=True).year, axis=1)

answered Jul 26, 2022 at 21:15

