I'm trying to read a JSON file like the one below in Python and keep only two of the values from each entry in the response:

{
  "responseHeader":{
   "status":0,
   "time":2,
   "params":{
     "q":"query",
     "rows":"2",
     "wt":"json"}},
 "response":{"results":2,"start":0,"docs":[
     {
       "name":["Peter"],
       "country":["England"],
       "age":["23"]},
     {
       "name":["Harry"],
       "country":["Wales"],
       "age":["30"]}]
 }}

For example, I want to put the name and the age in a table. I already tried it this way (based on this topic), but it's not working for me.

import json
import pandas as pd

file = open("myfile.json")

data = json.loads(file)

columns = [dct['name', 'age'] for dct in data['response']]
df = pd.DataFrame(data['response'], columns=columns)
print(df)

I have also seen other solutions for reading a JSON file, but they all dealt with files that have no other header values at the top, like responseHeader in this case. I don't know how to handle that. Can anyone help me out?

3 Answers

import json
with open("myfile.json") as f:
    columns = [(dic["name"],dic["age"]) for dic in json.load(f)["response"]["docs"]]
    print(columns)

result:

[(['Peter'], ['23']), (['Harry'], ['30'])]
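
As background on why the attempt in the question fails: json.loads expects a string (or bytes), so passing it an open file object raises a TypeError, while json.load reads from the file object directly. The sketch below illustrates the difference and also unwraps the one-element lists. It uses an inline copy of part of the question's data and io.StringIO as a stand-in for the real file; both are assumptions made only so the example runs on its own.

```python
import io
import json

# Inline stand-in for the contents of myfile.json (assumption for this sketch).
raw = '{"response": {"docs": [{"name": ["Peter"], "age": ["23"]}, {"name": ["Harry"], "age": ["30"]}]}}'

# json.loads parses a string that is already in memory.
from_string = json.loads(raw)

# json.load reads from a file-like object; StringIO plays the role of open("myfile.json").
from_file = json.load(io.StringIO(raw))

# Take the first item of each one-element list while collecting the two fields.
rows = [(doc["name"][0], doc["age"][0]) for doc in from_file["response"]["docs"]]
print(rows)  # [('Peter', '23'), ('Harry', '30')]
```

Note that the ages are still strings at this point; convert them with int() if numbers are needed.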

You can pass the list data["response"]["docs"] to pandas directly as it's a recordset.

df = pd.DataFrame(data["response"]["docs"])
print(df)

>>>      name    country   age
    0  [Peter]  [England]  [23]
    1  [Harry]    [Wales]  [30]

The data in your DataFrame will be bracketed, though, as you can see, because each value is a one-element list. If you want to remove the brackets, you can do the following:

for column in df.columns:
    # Each cell holds a one-element list; keep only its first item.
    df[column] = df[column].str.get(0)
    if column == 'age':
        # Ages arrive as strings, so convert them to integers.
        df[column] = df[column].astype(int)
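
Since the question asks for only the name and the age, the steps above can be combined with a column selection. This is a minimal sketch under the assumption that every doc contains both keys and that each value is a one-element list; the inline raw string stands in for the real file.

```python
import json

import pandas as pd

# Inline stand-in for the contents of myfile.json (assumption for this sketch).
raw = """
{"responseHeader": {"status": 0},
 "response": {"results": 2, "start": 0, "docs": [
     {"name": ["Peter"], "country": ["England"], "age": ["23"]},
     {"name": ["Harry"], "country": ["Wales"], "age": ["30"]}]}}
"""
data = json.loads(raw)

# Build the frame from the list of records, then keep only the wanted columns.
df = pd.DataFrame(data["response"]["docs"])[["name", "age"]]

# Unwrap the one-element lists in each remaining column.
for column in df.columns:
    df[column] = df[column].str.get(0)

# Ages are strings in the source data; convert them to integers.
df["age"] = df["age"].astype(int)
print(df)
```

The responseHeader part needs no special handling: indexing into data["response"]["docs"] simply skips it.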

import pandas as pd

sample = {"responseHeader":{
   "status":0,
   "time":2,
   "params":{
     "q":"query",
     "rows":"2",
     "wt":"json"}},
 "response":{"results":2,"start":0,"docs":[
     {
       "name":["Peter"],
       "country":["England"],
       "age":["23"]},
     {
       "name":["Harry"],
       "country":["Wales"],
       "age":["30"]}]
 }}
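# Take the first item of each one-element list for the two wanted fields.
data = [(x['name'][0], x['age'][0]) for x in
        sample['response']['docs']]
# Build the table from the extracted (name, age) tuples.
df = pd.DataFrame(data, columns=['name',
        'age'])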
