I have JSON data that represents a tree structure. Each node has four attributes: name, id, child, and parent id (pid). A leaf node has only three: id, pid, and name.
{'child': [{'id': '', 'child': [{'id': '', 'child': [{'name': '', 'id': '', 'pid': ''}], 'name': '', 'pid': ''}], 'name': '', 'pid': ''}], 'name': '', 'pid': '', 'id': ''}
I want to convert it to a dataframe with three columns like:
id, pid, name
1 .., ..., ....
2 .., ..., ....
It should contain the data from every level of the tree in those three attributes (id, pid, name).
I have tried pandas.read_json with the default parameters, but it does not seem to descend into the nested levels. The output just looks like this:
id, pid, name, child
1 .., ..., ...., {'id':'','pid': '','name': '', 'child':[{...}]}
2 .., ..., ...., {'id':'','pid': '','name': '', 'child':[{...}]}
Is there an easy way to solve this problem, with or without pandas?
Try the json_normalize() function or, depending on the complexity of your data, have a look at the flatten library (blog post).
json_normalize() did not work for me (maybe I set the wrong parameter), and flatten just returns too many columns.
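One option that avoids both json_normalize() and flatten is to walk the tree yourself and collect only the three attributes you want. The following is a minimal sketch, not a definitive solution: the function name flatten_tree and the sample values are hypothetical, and it assumes every node is a dict whose optional 'child' key holds a list of nodes, as described in the question.

```python
def flatten_tree(node, rows=None):
    """Collect id, pid and name from a node and all of its descendants."""
    if rows is None:
        rows = []
    # Keep only the three wanted attributes; 'child' is deliberately dropped.
    rows.append({
        "id": node.get("id"),
        "pid": node.get("pid"),
        "name": node.get("name"),
    })
    # Leaf nodes have no 'child' key, so fall back to an empty list.
    for child in node.get("child", []):
        flatten_tree(child, rows)
    return rows


# Hypothetical sample data in the shape described in the question.
tree = {
    "id": "1", "pid": "", "name": "root",
    "child": [
        {"id": "2", "pid": "1", "name": "a",
         "child": [{"id": "3", "pid": "2", "name": "leaf"}]},
        {"id": "4", "pid": "1", "name": "b"},
    ],
}

rows = flatten_tree(tree)  # one dict per node, in depth-first order
```

The resulting list of dicts can then be passed to pandas, for example pandas.DataFrame(rows, columns=["id", "pid", "name"]), which gives one row per node. For very deep trees the recursion could hit Python's recursion limit, in which case an explicit stack would be the safer variant.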