I have tens of thousands of rows of JSON snippets like this in a pandas Series, df["json"]:
[{
    'IDs': [{
        'lotId': '1',
        'Id': '123456'
    }],
    'date': '2009-04-17',
    'bidsCount': 2,
}, {
    'IDs': [{
        'lotId': '2',
        'Id': '123456'
    }],
    'date': '2009-04-17',
    'bidsCount': 4,
}, {
    'IDs': [{
        'lotId': '3',
        'Id': '123456'
    }],
    'date': '2009-04-17',
    'bidsCount': 8,
}]
Sample of the original file:
{"type": "OPEN","title": "rainbow","json": [{"IDs": [{"lotId": "1","Id": "123456"}],"date": "2009-04-17","bidsCount": 2}, {"IDs": [{"lotId": "2","Id": "123456"}],"date": "2009-04-17","bidsCount": 4}, {"IDs": [{"lotId": "3","Id": "123456"}],"date": "2009-04-17","bidsCount": 8}]}
{"type": "CLOSED","title": "clouds","json": [{"IDs": [{"lotId": "1","Id": "23345"}],"date": "2009-05-17","bidsCount": 2}, {"IDs": [{"lotId": "2","Id": "23345"}],"date": "2009-05-17","bidsCount": 4}, {"IDs": [{"lotId": "3","Id": "23345"}],"date": "2009-05-17","bidsCount": 8}]}
I read it with:
df = pd.read_json("file.json", lines=True)
I am trying to make them into a data frame, something like
Id      lotId  bidsCount  date
123456  1      2          2009-04-17
123456  2      4          2009-04-17
123456  3      8          2009-04-17
by using
json_normalize(df["json"])
However I get
AttributeError: 'list' object has no attribute 'values'
I guess each JSON snippet is seen as a list, but I cannot figure out how to make it work otherwise. Help appreciated!
How did you create df first?
Is the json column a string?
Use result = json_normalize(data, 'IDs', ['date', 'bidsCount']) like this to get your desired result. I did the same in my answer; I don't know why people like to downvote. Hope this helps.
I read the file with pd.read_json("file.json", lines=True). The json column is one of the file's nested parts, not a string. I can try to recreate the file, as the data is confidential, if that would help.
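For what it's worth, here is a minimal sketch of how the json_normalize(data, 'IDs', ['date', 'bidsCount']) suggestion could be applied to the whole column rather than to a single snippet. It assumes a pandas version that exposes pd.json_normalize (1.0 or later) and that every non-missing value in df["json"] is a list of dicts shaped like the sample. The names records, lots and result are only illustrative, and the frame is built in memory from the sample values instead of being read from file.json.

```python
import pandas as pd

# Stand-in for pd.read_json("file.json", lines=True): two records shaped
# like the sample in the question (values copied from it).
records = [
    {"type": "OPEN", "title": "rainbow", "json": [
        {"IDs": [{"lotId": "1", "Id": "123456"}], "date": "2009-04-17", "bidsCount": 2},
        {"IDs": [{"lotId": "2", "Id": "123456"}], "date": "2009-04-17", "bidsCount": 4},
        {"IDs": [{"lotId": "3", "Id": "123456"}], "date": "2009-04-17", "bidsCount": 8},
    ]},
    {"type": "CLOSED", "title": "clouds", "json": [
        {"IDs": [{"lotId": "1", "Id": "23345"}], "date": "2009-05-17", "bidsCount": 2},
        {"IDs": [{"lotId": "2", "Id": "23345"}], "date": "2009-05-17", "bidsCount": 4},
        {"IDs": [{"lotId": "3", "Id": "23345"}], "date": "2009-05-17", "bidsCount": 8},
    ]},
]
df = pd.DataFrame(records)

# Each cell of df["json"] is a list of dicts, so chain the per-row lists
# into one flat list of dicts; cells that are not lists (e.g. NaN) are skipped.
lots = [lot for row in df["json"] if isinstance(row, list) for lot in row]

# record_path="IDs" makes one output row per entry of the nested "IDs" list;
# meta copies the parent-level fields onto each of those rows.
result = pd.json_normalize(lots, record_path="IDs", meta=["date", "bidsCount"])

# Reorder the columns to match the layout asked for in the question.
result = result[["Id", "lotId", "bidsCount", "date"]]
print(result)
```

Flattening first keeps it to a single json_normalize call instead of one call per row, which should matter with tens of thousands of rows. Note that the meta columns come back with object dtype, so bidsCount may need an explicit astype(int) afterwards.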