I'm trying to flatten the nested congressional data from https://theunitedstates.io/congress-legislators/legislators-historical.json into a pandas DataFrame.
Sample structure of one record:
{
    "id": {
        "bioguide": "B000226",
        "govtrack": 401222,
        "icpsr": 507,
        "wikipedia": "Richard Bassett (politician)",
        "wikidata": "Q518823",
        "google_entity_id": "kg:/m/02pz46"
    },
    "name": {
        "first": "Richard",
        "last": "Bassett"
    },
    "bio": {
        "birthday": "1745-04-02",
        "gender": "M"
    },
    "terms": [
        {
            "type": "sen",
            "start": "1789-03-04",
            "end": "1793-03-03",
            "state": "DE",
            "class": 2,
            "party": "Anti-Administration"
        }
    ]
}
If I just use json_normalize(data), the "terms" field doesn't get unnested: each row still holds the whole list of term dicts.
If I unnest the terms explicitly, e.g. json_normalize(data, 'terms', 'name'), then any other field I pass as meta (here, name) stays a dict, so each row shows {u'last': u'Bassett', u'first': u'Richard'} instead of separate first and last columns.
Full current code, if you want to run it:
import json
from urllib.request import urlopen  # Python 3; on Python 2 this was urllib.urlopen

import pandas as pd

# load data
url = "https://theunitedstates.io/congress-legislators/legislators-historical.json"
with urlopen(url) as response:
    data = json.loads(response.read())

# parse: one row per term, with "name" carried along as metadata
congress_names = pd.json_normalize(data, record_path='terms', meta='name')
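For reference, one direction that might work is to pass nested key paths as meta rather than the top-level key. The sketch below is not run against the full file. It uses a trimmed, hand-copied version of the sample record above (the variable names sample and flat are just for illustration) and assumes a pandas version that exposes pd.json_normalize (1.0 or later). If some records in the real data lack one of the meta keys, errors="ignore" may be needed as well.

```python
import pandas as pd

# One record shaped like the sample above (trimmed; illustrative only).
sample = [
    {
        "id": {"bioguide": "B000226"},
        "name": {"first": "Richard", "last": "Bassett"},
        "bio": {"birthday": "1745-04-02", "gender": "M"},
        "terms": [
            {
                "type": "sen",
                "start": "1789-03-04",
                "end": "1793-03-03",
                "state": "DE",
                "class": 2,
                "party": "Anti-Administration",
            }
        ],
    }
]

# Each meta entry is a path (list of keys) into the parent record,
# so nested values come out as scalar columns instead of dicts.
flat = pd.json_normalize(
    sample,
    record_path="terms",
    meta=[["name", "first"], ["name", "last"], ["bio", "birthday"]],
)

print(flat.columns.tolist())
```

With the default separator, the meta columns are named by joining the path with a dot (name.first, name.last, bio.birthday), and each term row repeats the values of its parent record.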