I have a JSON file, file.json, with the following contents:

{"info":[
    {"name":"LYDIANA","address":[{"home":"San Francisco"},{"work":"Carolina"}],"emails":[],"phones":[{"work":"1234567"},{"home":"4323455"}]},
    {"name":"John Doe","address":[{"home":"Laguna"},{"work":"Ivory"}],"emails":[{"email":"[email protected]"},{"email":"[email protected]"}],"phones":[{"work":"5435435"},{"work":"8678678"}]}
]}

How can I create a pandas DataFrame like this from it?

name        address                                 phones
LYDIANA     home: San Francisco | work: Carolina    1234567
LYDIANA     home: San Francisco | work: Carolina    4323455
John Doe    home: Laguna | work: Ivory              5435435
John Doe    home: Laguna | work: Ivory              8678678
  • What about emails? Commented Nov 1, 2020 at 9:10
  • It doesn't make much sense to merge the home and work addresses into a single string (and repeat it) while keeping the phones on separate lines without any indication of whether each one is a work or home phone. Commented Nov 1, 2020 at 9:16
  • @IoaTzimas I don't want to use the emails in the DataFrame. Commented Nov 1, 2020 at 9:20
  • @buran It's possible; you can see the answer in this thread, my friend. Commented Nov 1, 2020 at 9:22
  • I did not say it's not possible, just that it does not make sense. You lose important information about the phones and at the same time keep redundant, duplicated information about the address. Commented Nov 1, 2020 at 9:46
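
To illustrate the point raised in the comments, here is a minimal sketch of a layout that keeps the phone type next to each number. It is not what the question asks for, only one possible alternative. The column names phone_type and phone are made up for this example, and the emails are left out because the asker does not want them.

```python
import pandas as pd

# Sample data copied from the question (emails omitted on purpose).
data = {"info": [
    {"name": "LYDIANA",
     "address": [{"home": "San Francisco"}, {"work": "Carolina"}],
     "phones": [{"work": "1234567"}, {"home": "4323455"}]},
    {"name": "John Doe",
     "address": [{"home": "Laguna"}, {"work": "Ivory"}],
     "phones": [{"work": "5435435"}, {"work": "8678678"}]},
]}

# One row per phone entry; the dict key ("home"/"work") becomes its own column.
# "phone_type" and "phone" are hypothetical column names chosen for this sketch.
rows = [
    {"name": person["name"], "phone_type": kind, "phone": number}
    for person in data["info"]
    for entry in person["phones"]
    for kind, number in entry.items()
]

phones = pd.DataFrame(rows)
print(phones)
```

With this shape the address could be kept in a separate frame keyed by name, so it is not repeated on every phone row.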

1 Answer

import pandas as pd

dct = {"info":[
    {"name":"LYDIANA","address":[{"home":"San Francisco"},{"work":"Carolina"}],"emails":[],"phones":[{"work":"1234567"},{"home":"4323455"}]},
    {"name":"John Doe","address":[{"home":"Laguna"},{"work":"Ivory"}],"emails":[{"email":"[email protected]"},{"email":"[email protected]"}],"phones":[{"work":"5435435"},{"work":"8678678"}]}
]}

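# Build one record per person; 'phones' stays a list for now
all_data = []
for row in dct['info']:
    all_data.append({
        'name': row['name'],
        # Flatten the list of one-key dicts into "home: X | work: Y"
        'address': ' | '.join('{}: {}'.format(k, v) for a in row['address'] for k, v in a.items()),
        # Keep only the numbers; the home/work label is dropped
        'phones': [v for p in row['phones'] for v in p.values()]
    })

# explode() turns each element of the 'phones' list into its own row
df = pd.DataFrame(all_data).explode('phones')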
print(df)

Prints:

       name                               address   phones
0   LYDIANA  home: San Francisco | work: Carolina  1234567
0   LYDIANA  home: San Francisco | work: Carolina  4323455
1  John Doe            home: Laguna | work: Ivory  5435435
1  John Doe            home: Laguna | work: Ivory  8678678
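
The snippet above hardcodes the data as a dict, while the question starts from file.json. A minimal sketch of the same transformation reading from disk could look like the following. The function name load_contacts is hypothetical, and the file is assumed to be UTF-8 and to have the same structure as the sample in the question.

```python
import json

import pandas as pd


def load_contacts(path):
    """Read the JSON file and return one row per phone number.

    `load_contacts` is a hypothetical helper name; the file is assumed to
    contain a top-level "info" list shaped like the sample in the question.
    """
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)

    rows = []
    for person in data["info"]:
        rows.append({
            "name": person["name"],
            # Join the one-key dicts into "home: X | work: Y"
            "address": " | ".join(
                f"{kind}: {place}"
                for entry in person["address"]
                for kind, place in entry.items()
            ),
            # Keep only the numbers, as in the answer above
            "phones": [num for entry in person["phones"] for num in entry.values()],
        })

    # One row per phone; reset_index gives a clean 0..n-1 index
    return pd.DataFrame(rows).explode("phones").reset_index(drop=True)
```

It would be called as df = load_contacts("file.json"). Unlike the answer's output, the index here is renumbered rather than repeated per person.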