My dataset has price and sales information for different years. The problem is that each year has its own column for price and its own column for sales. For example, the CSV looks like this:

Items  Price in 2018  Price in 2019  Price in 2020  Sales in 2018  Sales in 2019  Sales in 2020
A      100            120            135            5000           6000           6500
B      110            130            150            2000           4000           4500
C      150            110            175            1000           3000           3000

I want to reshape it into something like this:

Items  Year  Price  Sales
A      2018  100    5000
A      2019  120    6000
A      2020  135    6500
B      2018  110    2000
B      2019  130    4000
B      2020  150    4500
C      2018  150    1000
C      2019  110    3000
C      2020  175    3000

I used the melt function from pandas like this: df.melt(id_vars=['Items'], var_name="Year", value_name="Price")

But I'm struggling to get separate columns for Price and Sales, because melt puts the Price and Sales values into a single column. Thanks.

1 Answer

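Let us try pandas wide_to_long. It treats 'Price' and 'Sales' as stub names and reads the year from the part of each column name that follows the ' in ' separator: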

pd.wide_to_long(df, i='Items', j='year', 
                stubnames=['Price', 'Sales'], 
                suffix=r'\d+', sep=' in ').sort_index()

            Price  Sales
Items year              
A     2018    100   5000
      2019    120   6000
      2020    135   6500
B     2018    110   2000
      2019    130   4000
      2020    150   4500
C     2018    150   1000
      2019    110   3000
      2020    175   3000
7 Comments

What if I have a few more columns, for example Location and Country? How should I use this function so that they are carried along unchanged, the same way Items is? Thanks.
@Suchi Add the corresponding columns to the stubnames list, e.g. stubnames=['Price', 'Sales', 'Country', 'Location']
Would it work if I use Location and Country in stubnames? Each of them is a single column, like Items, so would I need to change the Location and Country columns?
I'm getting this error from wide_to_long: raise ValueError("stubname can't be identical to a column name")
If you don't want to specify them as stubnames, you can specify them as id variables. Please check: pd.wide_to_long(df, i=['Items', 'Country', 'Location'], j='year', stubnames=['Price', 'Sales'], suffix=r'\d+', sep=' in ').sort_index()
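
A small sketch of the id-variable approach from the last comment. The Country and Location values below are made up for illustration, since the question does not show them. The error quoted above is raised because a stub name may not equal an existing column name, so single columns such as Country belong in i rather than in stubnames.

```python
import pandas as pd

# Country and Location values are hypothetical; the question does not list them.
df = pd.DataFrame({
    "Items": ["A", "B"],
    "Country": ["US", "IN"],
    "Location": ["NY", "Delhi"],
    "Price in 2018": [100, 110],
    "Price in 2019": [120, 130],
    "Sales in 2018": [5000, 2000],
    "Sales in 2019": [6000, 4000],
})

# Columns that stay the same for every year go into `i`, not into `stubnames`.
id_cols = ["Items", "Country", "Location"]

long_df = (
    pd.wide_to_long(
        df,
        i=id_cols,
        j="Year",
        stubnames=["Price", "Sales"],
        sep=" in ",
        suffix=r"\d+",
    )
    .sort_index()
    .reset_index()  # flatten the (Items, Country, Location, Year) index
)

print(long_df)
```

Note that the combination of columns passed as i has to identify each row uniquely, otherwise wide_to_long raises a ValueError.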