I have a list of records like this, which I loaded from an .xlsx file:

import pandas as pd
travel_df = pd.read_excel('./item.xlsx')
data = travel_df.to_dict('records')

The data looks like this:

data = [
    {
        'cat': 'A',
        'subCat': 'a1',
    },
    {
        'cat': 'A',
        'subCat': 'a2',
    },
    {
        'cat': 'B',
        'subCat': 'b1',
    },
    {
        'cat': 'B',
        'subCat': 'b2',
    },
    {
        'cat': 'B',
        'subCat': 'b3',
    },
]

I want to write this to a CSV file with one column per cat, laid out like this. What is the best and fastest way to do that?

A     B
--------
a1    b1
a2    b2
      b3

1 Answer

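You can do this with the DataFrame() constructor, the pivot() method, and the apply() method. pivot() spreads subCat into one column per cat, leaving NaN wherever a row belongs to the other category; apply() then sorts each column so the NaN values move to the bottom: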

newdf = pd.DataFrame(data).pivot(columns='cat', values='subCat').apply(lambda x: sorted(x, key=pd.isna))

Finally, drop the rows that contain only NaN:

newdf = newdf[~newdf.isna().all(axis=1)]

Output of newdf:

cat   A     B
0     a1    b1
1     a2    b2
2     NaN   b3
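For comparison, here is a minimal sketch of a different technique that skips the sort-and-filter step: number the rows within each cat using groupby().cumcount(), then pivot on that counter. This is not the method shown above, just an alternative; the names df, row and wide are made up for the example.

```python
import pandas as pd

data = [
    {'cat': 'A', 'subCat': 'a1'},
    {'cat': 'A', 'subCat': 'a2'},
    {'cat': 'B', 'subCat': 'b1'},
    {'cat': 'B', 'subCat': 'b2'},
    {'cat': 'B', 'subCat': 'b3'},
]

df = pd.DataFrame(data)

# Position of each record within its own category: 0, 1, ... per cat.
# 'row' is a helper column invented for this sketch.
df['row'] = df.groupby('cat').cumcount()

# One output row per position, one column per category.
# Categories with fewer records are padded with NaN at the bottom.
wide = df.pivot(index='row', columns='cat', values='subCat')

print(wide)
```

Because the counter already lines up the first, second, third record of every category, no all-NaN rows are created in the first place.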

To save newdf to a CSV file, call its to_csv() method.
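A minimal sketch of the to_csv() call, assuming newdf looks like the output shown above (it is rebuilt by hand here so the snippet runs on its own). The file name output.csv in the comment is only a placeholder.

```python
import pandas as pd

# Stand-in for the pivoted frame from the answer, rebuilt by hand.
newdf = pd.DataFrame({
    'A': ['a1', 'a2', float('nan')],
    'B': ['b1', 'b2', 'b3'],
})

# With no path argument, to_csv() returns the CSV text as a string.
# index=False leaves out the 0, 1, 2 row labels; NaN is written as an
# empty field by default.
csv_text = newdf.to_csv(index=False)
print(csv_text)

# To write a file instead, pass a path (placeholder name):
# newdf.to_csv('output.csv', index=False)
```

If the empty cell should show some marker instead, to_csv() also accepts an na_rep argument.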


2 Comments

How can I avoid duplicate records?
Just use the drop_duplicates() method, i.e.: newdf = newdf.drop_duplicates()
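Note that drop_duplicates() on the pivoted frame only removes rows of the wide table that repeat as a whole. If "duplicate records" means repeated cat/subCat pairs in the source data (an assumption about the question), a sketch like the following drops them before pivoting. The sample data with repeats is made up for the example.

```python
import pandas as pd

# Made-up input containing repeated records.
data = [
    {'cat': 'A', 'subCat': 'a1'},
    {'cat': 'A', 'subCat': 'a1'},  # duplicate
    {'cat': 'A', 'subCat': 'a2'},
    {'cat': 'B', 'subCat': 'b1'},
    {'cat': 'B', 'subCat': 'b1'},  # duplicate
    {'cat': 'B', 'subCat': 'b2'},
]

# Remove repeated (cat, subCat) pairs while the data is still in long form.
records = pd.DataFrame(data).drop_duplicates(ignore_index=True)

# Same pipeline as in the answer: pivot, push NaN down, drop all-NaN rows.
newdf = (
    records
    .pivot(columns='cat', values='subCat')
    .apply(lambda x: sorted(x, key=pd.isna))
)
newdf = newdf[~newdf.isna().all(axis=1)]

print(newdf)
```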
