JSON To Pandas Dataframe with incomplete JSON Properties

Question

Let's say I have JSON like this:

[
    {
        "name": "user1",
        "age": 12,
        "category": "young"
    },
    {
        "name": "user2",
        "category": "old"
    },
    {
        "name": "user3",
        "age": 23
    }
]

As we can see, user1 has the complete set of properties (name, age, category), while user2 has only name and category, and user3 has only name and age. How can I convert this to a DataFrame with the following expected result?

id	name	age	category
1	user1	12	young
2	user2	null	old
3	user3	23	null

That is, any missing property should be left as null.

Note that each user's JSON properties can appear in a different order. For example, user4 might have properties in the order name, age, category, while user5 might have them in the order age, name, category.

jezrael · Accepted Answer · 2022-11-11 06:59:39Z

If you load the JSON into a list of dictionaries, pandas fills in missing values (NaN) for any keys a record lacks:

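import json
import pandas as pd

# Load the JSON array into a list of dictionaries
with open('file.json') as f:
    data = json.load(f)

# Keys missing from a record become NaN in the DataFrame
df = pd.DataFrame(data)
# Add a 1-based id column as the first column
df.insert(0, 'id', range(1, len(df) + 1))
print(df)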
   id   name   age category
0   1  user1  12.0    young
1   2  user2   NaN      old
2   3  user3  23.0      NaN
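
The same approach can be tried without a file. The sketch below is only an illustration: the inline `raw` string is a hypothetical stand-in for file.json, with keys deliberately given in a different order per record to show that columns are matched by key name, not position. The last step is optional and not part of the approach above: it assumes you would rather keep age as integers, so it converts the column to pandas' nullable `Int64` dtype, which stores missing values as `<NA>` instead of forcing the column to float.

```python
import json

import pandas as pd

# Hypothetical inline payload standing in for file.json.
# Keys appear in a different order in each record on purpose.
raw = """
[
    {"name": "user1", "age": 12, "category": "young"},
    {"category": "old", "name": "user2"},
    {"age": 23, "name": "user3"}
]
"""

# Parse the JSON text into a list of dictionaries
data = json.loads(raw)

# Columns are aligned by key; keys absent from a record become NaN
df = pd.DataFrame(data)

# Add a 1-based id column as the first column
df.insert(0, "id", range(1, len(df) + 1))

# Optional (an assumption about what you want): keep age as integers
# by using the nullable Int64 dtype instead of float64
df["age"] = df["age"].astype("Int64")

print(df)
```

If a float column with NaN is acceptable, the `astype("Int64")` line can simply be dropped.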
