1

I have been trying to format a nested json file to a pandas dataframe but i may have missing something,

How can extract the timeseries onto a pandas dataframe? I have been struggling trying to extract all the numbering but if succesful I ended with some of metadata in a dataaframe

Please help!

{
"Meta Data": {
    "1. Information": "Intraday (60min) prices and volumes",
    "2. Symbol": "BHP.AX",
    "3. Last Refreshed": "2018-02-09 00:00:00",
    "4. Interval": "60min",
    "5. Output Size": "Compact",
    "6. Time Zone": "US/Eastern"
},
"Time Series (60min)": {
    "2018-02-09 00:00:00": {
        "1. open": "29.1100",
        "2. high": "29.1950",
        "3. low": "29.1000",
        "4. close": "29.1300",
        "5. volume": "788213"
    },
    "2018-02-08 23:00:00": {
        "1. open": "29.0000",
        "2. high": "29.2000",
        "3. low": "29.0000",
        "4. close": "29.1100",
        "5. volume": "768704"
    },
    "2018-02-08 22:00:00": {
        "1. open": "29.1000",
        "2. high": "29.1000",
        "3. low": "28.9600",
        "4. close": "29.0000",
        "5. volume": "830235"
    },
    "2018-02-08 21:00:00": {
        "1. open": "29.0850",
        "2. high": "29.2250",
        "3. low": "29.0750",
        "4. close": "29.1050",
        "5. volume": "803142"
    },
    "2018-02-08 20:00:00": {
        "1. open": "28.9200",
        "2. high": "29.1500",
        "3. low": "28.8900",
        "4. close": "29.0900",
        "5. volume": "1231131"
    }
  }
}

any Ideas ?

0

2 Answers 2

2

You can use pd.DataFrame.from_dict, specifying an orient.

data = json.loads(json_data)
df = pd.DataFrame.from_dict(data['Time Series (60min)'], orient='index')

You can also cleanup your column headers if needed.

df.columns = df.columns.str.split('. ').str[1] # an optional step

df

                    open     high      low    close   volume
2018-02-08 20:00:00  28.9200  29.1500  28.8900  29.0900  1231131
2018-02-08 21:00:00  29.0850  29.2250  29.0750  29.1050   803142
2018-02-08 22:00:00  29.1000  29.1000  28.9600  29.0000   830235
2018-02-08 23:00:00  29.0000  29.2000  29.0000  29.1100   768704
2018-02-09 00:00:00  29.1100  29.1950  29.1000  29.1300   788213
Sign up to request clarification or add additional context in comments.

Comments

0

If you need only the time series, just tell Pandas!

js = '''Your_JSON_String'''
jsdata = json.loads(js)
jsts = jsdata["Time Series (60min)"]
ts = pd.DataFrame(jsts).T
#                     1. open  2. high   3. low 4. close 5. volume
# 2018-02-08 20:00:00  28.9200  29.1500  28.8900  29.0900   1231131
# 2018-02-08 21:00:00  29.0850  29.2250  29.0750  29.1050    803142
# 2018-02-08 22:00:00  29.1000  29.1000  28.9600  29.0000    830235
# 2018-02-08 23:00:00  29.0000  29.2000  29.0000  29.1100    768704
# 2018-02-09 00:00:00  29.1100  29.1950  29.1000  29.1300    788213

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.