5

I'm completely new to python. And I need a little help to be able to filter my JSON.

json = { 
    "selection":[ 
         {
          "person_id":105894,
          "position_id":1,
          "label":"Work",
          "description":"A description",
          "startDate":"2017-07-16T19:20:30+01:00",
          "stopDate":"2017-07-16T20:20:30+01:00"
          },
          {
         "person_id":945123,
         "position_id":null,
         "label":"Illness",
         "description":"A description",
         "startDate":"2017-07-17T19:20:30+01:00",
         "stopDate":"2017-07-17T20:20:30+01:00"
         }
       ]
     }

Concretely what I'm trying to do is to transform my JSON (here above) into a Dataframe to be able to use the query methods on it, like:

selected_person_id = 105894
query_person_id = json[(json['person_id'] == selected_person_id)]
or
json.query('person_id <= 105894')

The columns must be:

cols = ['person_id', 'position_id', 'label', 'description', 'startDate', 'stopDate']

How can I do it ?

1

2 Answers 2

13

Use:

df = pd.DataFrame(json['selection'])
print (df)
     description    label  person_id  position_id                  startDate  \
0  A description     Work     105894          1.0  2017-07-16T19:20:30+01:00   
1  A description  Illness     945123          NaN  2017-07-17T19:20:30+01:00   

                    stopDate  
0  2017-07-16T20:20:30+01:00  
1  2017-07-17T20:20:30+01:00  

EDIT:

import json

with open('file.json') as data_file:    
    json = json.load(data_file)
Sign up to request clarification or add additional context in comments.

1 Comment

@Jezrael, where are you giving the path of the json?
3

for more complicated examples where a flattening of the structure is neeeded use json_normalize:

>>> data = [{'state': 'Florida',
...          'shortname': 'FL',
...          'info': {
...               'governor': 'Rick Scott'
...          },
...          'counties': [{'name': 'Dade', 'population': 12345},
...                      {'name': 'Broward', 'population': 40000},
...                      {'name': 'Palm Beach', 'population': 60000}]},
...         {'state': 'Ohio',
...          'shortname': 'OH',
...          'info': {
...               'governor': 'John Kasich'
...          },
...          'counties': [{'name': 'Summit', 'population': 1234},
...                       {'name': 'Cuyahoga', 'population': 1337}]}]
>>> from pandas.io.json import json_normalize
>>> result = json_normalize(data, 'counties', ['state', 'shortname',
...                                           ['info', 'governor']])
>>> result
         name  population info.governor    state shortname
0        Dade       12345    Rick Scott  Florida        FL
1     Broward       40000    Rick Scott  Florida        FL
2  Palm Beach       60000    Rick Scott  Florida        FL
3      Summit        1234   John Kasich     Ohio        OH
4    Cuyahoga        1337   John Kasich     Ohio        OH

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.