I am trying to parse this JSON file and I am having trouble. The JSON looks like this:

    <ListObject list at 0x2161945a860> JSON: {
  "data": [
    {
      "amount": 100,
      "available_on": 1621382400,
      "created": 1621264875,
      "currency": "usd",
      "description": "0123456",
      "exchange_rate": null,
      "fee": 266,
      "fee_details": [
        {
          "amount": 266,
          "application": null,
          "currency": "usd",
          "description": "processing fees",
          "type": "fee"
        }
      ],
      "id": "txn_abvgd1234",
      "net": 9999,
      "object": "balance_transaction",
      "reporting_category": "charge",
      "source": "cust1",
      "sourced_transfers": {
        "data": [],
        "has_more": false,
        "object": "list",
        "total_count": 0,
        "url": "/v1/source"
      },
      "status": "pending",
      "type": "charge"
    },
    {
      "amount": 25984,
      "available_on": 1621382400,
      "created": 1621264866,
      "currency": "usd",
      "description": "0326489",
      "exchange_rate": null,
      "fee": 93,
      "fee_details": [
        {
          "amount": 93,
          "application": null,
          "currency": "usd",
          "description": "processing fees",
          "type": "fee"
        }
      ],
      "id": "txn_65987jihgf4984oihydgrd",
      "net": 9874,
      "object": "balance_transaction",
      "reporting_category": "charge",
      "source": "cust2",
      "sourced_transfers": {
        "data": [],
        "has_more": false,
        "object": "list",
        "total_count": 0,
        "url": "/v1/source"
      },
      "status": "pending",
      "type": "charge"
    }
  ],
  "has_more": true,
  "object": "list",
  "url": "/v1/balance_"
}

I am trying to parse it in Python with this script:

import pandas as pd
df = pd.json_normalize(json)
df.head()

but what I am getting is:

enter image description here

What I need is to parse each of these data points into its own column, so that I end up with 2 rows of data and one column per data point. Something like this:

enter image description here

How do I do this?

  • All it takes is a little preprocessing, converting your dictionary to a list of tuples. Do that BEFORE you suck it into pandas, and it should just flow in. Commented May 18, 2021 at 2:59
  • If you tell us what the resulting columns should be, perhaps someone will write the code, but it's a simple this-to-that conversion. Commented May 18, 2021 at 3:00
  • @TimRoberts - I just edited my original question and added an example of what I am trying to get. Commented May 18, 2021 at 3:10

1 Answer

All but one of your fields are direct copies from the JSON, so you can just make a list of the fields you can copy, and then do the extra processing for the fee_details.

import json
import pandas as pd

inp =  """{
  "data": [
    {
      "amount": 100,
      "available_on": 1621382400,
      "created": 1621264875,
      "currency": "usd",
      "description": "0123456",
      "exchange_rate": null,
      "fee": 266,
      "fee_details": [
        {
          "amount": 266,
          "application": null,
          "currency": "usd",
          "description": "processing fees",
          "type": "fee"
        }
      ],
      "id": "txn_abvgd1234",
      "net": 9999,
      "object": "balance_transaction",
      "reporting_category": "charge",
      "source": "cust1",
      "sourced_transfers": {
        "data": [],
        "has_more": false,
        "object": "list",
        "total_count": 0,
        "url": "/v1/source"
      },
      "status": "pending",
      "type": "charge"
    },
    {
      "amount": 25984,
      "available_on": 1621382400,
      "created": 1621264866,
      "currency": "usd",
      "description": "0326489",
      "exchange_rate": null,
      "fee": 93,
      "fee_details": [
        {
          "amount": 93,
          "application": null,
          "currency": "usd",
          "description": "processing fees",
          "type": "fee"
        }
      ],
      "id": "txn_65987jihgf4984oihydgrd",
      "net": 9874,
      "object": "balance_transaction",
      "reporting_category": "charge",
      "source": "cust2",
      "sourced_transfers": {
        "data": [],
        "has_more": false,
        "object": "list",
        "total_count": 0,
        "url": "/v1/source"
      },
      "status": "pending",
      "type": "charge"
    }
  ],
  "has_more": true,
  "object": "list",
  "url": "/v1/balance_"
}"""

copies = [
    'id',
    'net',
    'object',
    'reporting_category',
    'source',
    'amount',
    'available_on',
    'created',
    'currency',
    'description',
    'exchange_rate',
    'fee'
]

data = json.loads(inp)
rows = []
for inrow in data['data']:
    outrow = {}
    for copy in copies:
        outrow[copy] = inrow[copy]
    outrow['fee_details'] = inrow['fee_details'][0]['description']
    rows.append(outrow)

df = pd.DataFrame(rows)
print(df)

Output:

timr@tims-gram:~/src$ python x.py
                           id   net               object reporting_category source  amount  ...     created  currency description exchange_rate  fee      fee_details
0               txn_abvgd1234  9999  balance_transaction             charge  cust1     100  ...  1621264875       usd     0123456          None  266  processing fees
1  txn_65987jihgf4984oihydgrd  9874  balance_transaction             charge  cust2   25984  ...  1621264866       usd     0326489          None   93  processing fees

[2 rows x 13 columns]
timr@tims-gram:~/src$ 
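
If you would rather stay with json_normalize, a possible alternative is to pass it the list under "data" instead of the whole object, so that each transaction becomes one row. The sketch below is only an illustration: `payload` is a hypothetical, trimmed-down stand-in for your parsed JSON, and it assumes you want the first fee description (or None when the list is empty). Nested dicts such as sourced_transfers come out as dotted column names, while lists like fee_details are left as-is and still need a manual step.

```python
import pandas as pd

# Hypothetical, shortened stand-in for the parsed response (not the full data).
payload = {
    "data": [
        {
            "id": "txn_abvgd1234",
            "amount": 100,
            "fee": 266,
            "fee_details": [
                {"amount": 266, "description": "processing fees", "type": "fee"}
            ],
            "sourced_transfers": {"data": [], "has_more": False, "total_count": 0},
        },
        {
            "id": "txn_65987jihgf4984oihydgrd",
            "amount": 25984,
            "fee": 93,
            "fee_details": [
                {"amount": 93, "description": "processing fees", "type": "fee"}
            ],
            "sourced_transfers": {"data": [], "has_more": False, "total_count": 0},
        },
    ],
    "has_more": True,
}

# Normalize the list of transactions, not the outer wrapper object.
df = pd.json_normalize(payload["data"])

# json_normalize does not flatten lists, so take the first fee description by hand
# (assumption: the first entry is the one wanted; None if the list is empty).
df["fee_details"] = df["fee_details"].apply(
    lambda fees: fees[0]["description"] if fees else None
)

print(df)
```

With this approach you do not have to list the fields to copy, at the cost of getting every field as a column, including the flattened sourced_transfers ones, which you may want to drop afterwards.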

1 Comment

Awesome. This was quick
