
I have a Python script that takes a JSON file and pulls out the specific columns I want with the pandas json_normalize function. But the JSON contains a nested set of values that I am also trying to pull out, and I can't get the code to extract them properly.

Below is the JSON. The top tier is "cardEditions"; within that tier is "cardDetails". I want to grab some of the displayName and value details from this nested JSON and put them into the CSV along with the editionNo value from cardEditions.

I'm looking for the output to be a pipe-delimited CSV like the following, with the displayName values from the nested JSON as the headers.

editionNo|Name|Edition|Position
666|Matt Hu|1st Edition|Center Field

{
    "cardEditions": [{
        "editionNo": 666,
        "id": 1111,
        "cardDetails": [{
                "valueType": "Text",
                "displayValueType": "Text",
                "displayName": "Name",
                "value": "Matt Hu"
            },
            {
                "valueType": "Text",
                "displayValueType": "Text",
                "displayName": "Edition",
                "value": "1st Edition"
            },
            {
                "valueType": "Text",
                "displayValueType": "Text",
                "displayName": "Position",
                "value": "Center Field"
            }

        ],
        "cardStatus": "NA"
    }]
}

2 Answers

import pandas as pd

d = {
    "cardEditions": [{
        "editionNo": 666,
        "id": 1111,
        "cardDetails": [{
                "valueType": "Text",
                "displayValueType": "Text",
                "displayName": "Name",
                "value": "Matt Hu"
            },
            {
                "valueType": "Text",
                "displayValueType": "Text",
                "displayName": "Edition",
                "value": "1st Edition"
            },
            {
                "valueType": "Text",
                "displayValueType": "Text",
                "displayName": "Position",
                "value": "Center Field"
            }

        ],
        "cardStatus": "NA"
    }]
}

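# Start with an empty frame that has the target columns.
df = pd.DataFrame(columns=['editionNo', 'Name', 'Edition', 'Position'])
for i, edition in enumerate(d['cardEditions']):
    no = edition['editionNo']
    # Values are taken in document order, so every edition must list
    # Name, Edition, and Position in exactly that order.
    vals = [details['value'] for details in edition['cardDetails']]
    df.loc[i, :] = (no, *vals)
print(df)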

prints

   editionNo     Name      Edition      Position
0        666  Matt Hu  1st Edition  Center Field
3 Comments

This works when I select specific data out of the JSON. But when I try to load my whole JSON file, I run into issues because of the data I'm using. I expanded your code above to my 10 columns; it runs and I can get 3 rows to export. But when I run it on my whole file I get: ValueError: cannot copy sequence with size 2 to array axis with dimension 10
So some of my JSON records don't have the nested cardDetails JSON, which I think is causing the issue. Is there a way to skip records if they don't have the nested cardDetails?
Or is there a way I can skip records with the missing cardDetails?
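
One way to approach the missing-cardDetails case raised in the comments is to build each row as a dict keyed by displayName instead of relying on position, and to skip editions that have no cardDetails. This is a minimal sketch, not the answerer's code: the editions_to_frame helper and the extra sample records are made up for illustration, and it assumes that editions with only some details should be kept with empty cells.

```python
import pandas as pd

def editions_to_frame(data, columns):
    """Build one row per edition, keyed by displayName (hypothetical helper)."""
    rows = []
    for edition in data.get("cardEditions", []):
        details = edition.get("cardDetails")
        if not details:
            # No nested cardDetails (missing or empty): skip this record.
            continue
        row = {"editionNo": edition.get("editionNo")}
        for detail in details:
            row[detail["displayName"]] = detail["value"]
        rows.append(row)
    # A fixed column list drops unwanted names and leaves absent ones as NaN.
    return pd.DataFrame(rows, columns=columns)

# Illustrative data: the second and third editions are invented for the demo.
sample = {
    "cardEditions": [
        {"editionNo": 666, "cardDetails": [
            {"displayName": "Name", "value": "Matt Hu"},
            {"displayName": "Edition", "value": "1st Edition"},
            {"displayName": "Position", "value": "Center Field"},
        ]},
        {"editionNo": 667, "cardStatus": "NA"},  # no cardDetails: skipped
        {"editionNo": 668, "cardDetails": [
            {"displayName": "Name", "value": "Made-up Player"},
        ]},
    ]
}

df = editions_to_frame(sample, ["editionNo", "Name", "Edition", "Position"])
csv_text = df.to_csv(sep="|", index=False)  # pipe-delimited, no index column
print(csv_text)
```

Because rows are matched by name, an edition with 2 details no longer has to fit a 10-column tuple, which is what triggered the ValueError above. Passing a file path as the first argument to to_csv writes the same text to disk.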

Here is another solution with json_normalize and pivot.

import json
import pandas as pd

# where "j" is your JSON data as a string
out = (
    pd.json_normalize(json.loads(j)['cardEditions'], record_path=['cardDetails'], meta='editionNo')
    .drop(['valueType', 'displayValueType'], axis=1)
    .pivot(index='editionNo', columns='displayName', values='value')
    .rename_axis(columns=None)
    .reset_index()
)
print(out)

Output:

   editionNo      Edition     Name      Position
0        666  1st Edition  Matt Hu  Center Field

If somebody knows how to use json_normalize with a record_path of which we don't want all fields, I'd really like to know. I had to drop two columns from the record_path because I don't know how to skip them in the first place.
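
On skipping fields: as far as I know, json_normalize has no option for selecting a subset of record_path fields, but pivot only reads the index, columns, and values columns it is given, so the drop step should be optional. The sketch below relies on that, and also filters out editions without cardDetails before normalizing, since json_normalize expects every element to contain the record_path key. The abbreviated sample string and the extra edition 667 are assumptions for the demo.

```python
import json
import pandas as pd

# Abbreviated sample; edition 667 is invented to show a record without cardDetails.
j = """
{"cardEditions": [
  {"editionNo": 666, "id": 1111, "cardStatus": "NA", "cardDetails": [
    {"valueType": "Text", "displayValueType": "Text", "displayName": "Name", "value": "Matt Hu"},
    {"valueType": "Text", "displayValueType": "Text", "displayName": "Edition", "value": "1st Edition"},
    {"valueType": "Text", "displayValueType": "Text", "displayName": "Position", "value": "Center Field"}
  ]},
  {"editionNo": 667, "id": 1112, "cardStatus": "NA"}
]}
"""

# Keep only editions that actually carry a non-empty cardDetails list.
editions = [e for e in json.loads(j)["cardEditions"] if e.get("cardDetails")]

out = (
    pd.json_normalize(editions, record_path="cardDetails", meta="editionNo")
    # pivot ignores valueType and displayValueType, so no drop is needed.
    .pivot(index="editionNo", columns="displayName", values="value")
    .rename_axis(columns=None)
    .reset_index()
)

# pivot sorts the new columns alphabetically; restore the requested order.
out = out[["editionNo", "Name", "Edition", "Position"]]
csv_text = out.to_csv(sep="|", index=False)
print(csv_text)
```

Note that pivot raises a ValueError if one edition repeats a displayName, so real data with duplicates would need deduplicating first.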
