Load Json Nested to Dataframe Pandas

Question

I have a JSON file, file.json, that looks like this:

{"info":[
    {"name":"LYDIANA","address":[{"home":"San Francisco"},{"work":"Carolina"}],"emails":[],"phones":[{"work":"1234567"},{"home":"4323455"}]},
    {"name":"John Doe","address":[{"home":"Laguna"},{"work":"Ivory"}],"emails":[{"email":"[email protected]"},{"email":"[email protected]"}],"phones":[{"work":"5435435"},{"work":"8678678"}]}
]}

How can I create a pandas DataFrame like this?

name        address                                 phones
LYDIANA     home: San Francisco | work: Carolina    1234567
LYDIANA     home: San Francisco | work: Carolina    4323455
John Doe    home: Laguna | work: Ivory              5435435
John Doe    home: Laguna | work: Ivory              8678678

It doesn't make much sense to merge the home and work addresses into a single string (and repeat it) while keeping the phones on separate lines without any indication of whether each one is a work or home phone.
– buran, Commented Nov 1, 2020 at 9:16
@buran It's possible; you can see the answer in this thread, my friend.
– Hendra, Commented Nov 1, 2020 at 9:22
I did not say it's not possible, just that it does not make sense. You lose important information about the phone and at the same time keep redundant, duplicated information about the address.
– buran, Commented Nov 1, 2020 at 9:46
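
The concern raised in these comments can be addressed by keeping the phone label as its own column. The following is only a minimal sketch, not part of the original thread: the helper name `phones_with_type` and the column names `phone_type` and `phone` are hypothetical, and it assumes every entry under "address" and "phones" is a small dict like the ones in the question.

```python
import json

import pandas as pd

# Sample data copied from the question (the "emails" values are kept exactly as posted).
raw = '''
{"info":[
    {"name":"LYDIANA","address":[{"home":"San Francisco"},{"work":"Carolina"}],"emails":[],"phones":[{"work":"1234567"},{"home":"4323455"}]},
    {"name":"John Doe","address":[{"home":"Laguna"},{"work":"Ivory"}],"emails":[{"email":"[email protected]"},{"email":"[email protected]"}],"phones":[{"work":"5435435"},{"work":"8678678"}]}
]}
'''


def phones_with_type(data):
    """Hypothetical helper: one row per phone, keeping the work/home label."""
    rows = []
    for person in data["info"]:
        # Same address formatting as requested in the question.
        address = " | ".join(
            f"{kind}: {place}"
            for entry in person["address"]
            for kind, place in entry.items()
        )
        for entry in person["phones"]:
            for kind, number in entry.items():
                rows.append({
                    "name": person["name"],
                    "address": address,
                    "phone_type": kind,  # assumed column name
                    "phone": number,     # assumed column name
                })
    return pd.DataFrame(rows, columns=["name", "address", "phone_type", "phone"])


df_typed = phones_with_type(json.loads(raw))
print(df_typed)
```

The address string is still repeated on every row, as the question asks; only the phone label is added.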

Andrej Kesely · Accepted Answer · 2020-11-01 09:11:03Z

import pandas as pd

dct = {"info":[
    {"name":"LYDIANA","address":[{"home":"San Francisco"},{"work":"Carolina"}],"emails":[],"phones":[{"work":"1234567"},{"home":"4323455"}]},
    {"name":"John Doe","address":[{"home":"Laguna"},{"work":"Ivory"}],"emails":[{"email":"[email protected]"},{"email":"[email protected]"}],"phones":[{"work":"5435435"},{"work":"8678678"}]}
]}

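# Build one record per person; 'phones' stays a list so it can be exploded below.
all_data = []
for row in dct['info']:
    all_data.append({
            'name': row['name'],
            # Flatten the list of single-key dicts into "home: ... | work: ..."
            'address': ' | '.join('{}: {}'.format(k, v) for a in row['address'] for k, v in a.items()),
            # Keep only the numbers; the work/home label is dropped here
            'phones': [v for p in row['phones'] for v in p.values()]
        })

# explode() turns each element of the 'phones' list into its own row,
# repeating the other columns (and the original index).
df = pd.DataFrame(all_data).explode('phones')
print(df)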

Prints:

       name                               address   phones
0   LYDIANA  home: San Francisco | work: Carolina  1234567
0   LYDIANA  home: San Francisco | work: Carolina  4323455
1  John Doe            home: Laguna | work: Ivory  5435435
1  John Doe            home: Laguna | work: Ivory  8678678
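
The question starts from a file rather than a dict literal, so the same transformation can be applied after reading the file with `json.load`. The following is a hedged sketch under that assumption: the helper name `load_info` is hypothetical, and the temporary copy of file.json exists only so the example runs on its own.

```python
import json
import os
import tempfile

import pandas as pd


def load_info(path):
    """Hypothetical helper: read the JSON file and build one row per phone."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

    rows = []
    for row in data["info"]:
        rows.append({
            "name": row["name"],
            "address": " | ".join(
                "{}: {}".format(k, v) for a in row["address"] for k, v in a.items()
            ),
            "phones": [v for p in row["phones"] for v in p.values()],
        })
    # reset_index gives a fresh 0..n-1 index instead of the repeated one.
    return pd.DataFrame(rows).explode("phones").reset_index(drop=True)


# Demo only: write a cut-down copy of the question's data to a temporary file.json.
sample = {"info": [
    {"name": "LYDIANA",
     "address": [{"home": "San Francisco"}, {"work": "Carolina"}],
     "emails": [],
     "phones": [{"work": "1234567"}, {"home": "4323455"}]},
    {"name": "John Doe",
     "address": [{"home": "Laguna"}, {"work": "Ivory"}],
     "emails": [],  # email entries omitted in this demo copy
     "phones": [{"work": "5435435"}, {"work": "8678678"}]},
]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    df_from_file = load_info(path)

print(df_from_file)
```

With a real file, the call would simply be `load_info("file.json")`.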
