
In the following dataset, I need to convert the "description" values for each group in the "name" column (for example, inventory1, inventory2 and inventory3) into separate columns (namely description1 and description2). If I use either pivot_table or groupby, the description values become column headers instead of values under a column. How can I generate the desired output? Thanks

import pandas as pd

df1 = {
    'item': ['item1', 'item2', 'item3', 'item4', 'item5', 'item6'],
    'name': ['inventory1', 'inventory1', 'inventory2', 'inventory2', 'inventory3', 'inventory3'],
    'code': [1, 1, 2, 2, 3, 3],
    'description': [
        'sales number decrease compared to last month',
        'Sales number decreased',
        'sales number increased',
        'Sales number increased, need to keep kpi',
        'no sales this month',
        'item out of stock',
    ],
}

df1 = pd.DataFrame(df1)

Desired output as below: [image: desired output table]

  • Your example is not clear. You talk about "the same inventory item", but how to choose the value for the "item" column? For example, in the first row: why "item1" and not "item2"? Also, will you always have exactly two rows for the same inventory? Commented Mar 20, 2022 at 19:54
  • I modified the question slightly. I need to convert the descriptions for each value in the "name" column into two columns. Also, I will not always have 2 rows for each inventory (under the "name" column); sometimes I have 3 or 4, but I didn't want to make the question too confusing. If there's a way to do this (automatically convert different numbers of rows into the corresponding number of columns), it'd be super helpful. Commented Mar 20, 2022 at 19:59

1 Answer

You can actually use pd.concat:

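# df is the DataFrame built in the question (named df1 there)
new_df = pd.concat([
        (
            # One row per name, keeping the first item and code for each
            df.drop_duplicates('name')
            .drop('description', axis=1)
            .reset_index(drop=True)
        ),
        (
            # One list of descriptions per name, expanded into columns;
            # shorter lists are padded with NaN. groupby sorts by name, so
            # pass sort=False if names do not first appear in sorted order.
            pd.DataFrame([pd.Series(l) for l in df.groupby('name')['description'].agg(list).tolist()])
            .add_prefix('description')
        ),
    ],
    axis=1)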

Output:

>>> new_df
    item        name  code                                  description0                              description1
0  item1  inventory1     1  sales number decrease compared to last month                    Sales number decreased
1  item3  inventory2     2                        sales number increased  Sales number increased, need to keep kpi
2  item5  inventory3     3                           no sales this month                         item out of stock

One-liner version of the above, in case you want it:

pd.concat([df.drop_duplicates('name').drop('description', axis=1).reset_index(drop=True), pd.DataFrame([pd.Series(l) for l in df.groupby('name')['description'].agg(list).tolist()]).add_prefix('description')], axis=1)
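
If the number of rows per name varies (3 or 4 instead of 2, as mentioned in the comments), one possible alternative is to number the descriptions within each group with groupby(...).cumcount() and then pivot on that number. This is only a sketch, not part of the original answer: the names position, wide, first_rows and result are made up for illustration, and the data is the sample from the question.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'item': ['item1', 'item2', 'item3', 'item4', 'item5', 'item6'],
    'name': ['inventory1', 'inventory1', 'inventory2',
             'inventory2', 'inventory3', 'inventory3'],
    'code': [1, 1, 2, 2, 3, 3],
    'description': [
        'sales number decrease compared to last month',
        'Sales number decreased',
        'sales number increased',
        'Sales number increased, need to keep kpi',
        'no sales this month',
        'item out of stock',
    ],
})

# Number each description within its name group: 0, 1, 2, ...
position = df.groupby('name').cumcount()

# Pivot on that number so each position becomes its own column;
# groups with fewer rows get NaN in the extra columns.
wide = (
    df.assign(position=position)
    .pivot(index='name', columns='position', values='description')
    .add_prefix('description')
    .rename_axis(columns=None)
    .reset_index()
)

# Keep the first item and code for each name, then attach the wide columns.
first_rows = df.drop_duplicates('name').drop(columns='description')
result = first_rows.merge(wide, on='name')

print(result)
```

Because the two pieces are joined with merge on 'name' rather than by row position, this version does not depend on the order in which names appear in the data.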