
I am using Python's requests library to get a huge JSON response.

Normally, when I do data = resp.json() it takes around 5 seconds.

Then I tried ujson as data = ujson.loads(resp.text), which took around 2 seconds.

Is there a way to get a generator-like object from the response? I know the streaming facility is available, but as far as I understand it gives me the data in raw chunks, whereas I need it element by element so that I can iterate over it in a for loop.

All of this is to reduce the time further. Is the above even possible, or is there another way to achieve it? (I am open to other libraries as well.)

Thank you!

1 Answer


The json-stream module has direct support for streaming a JSON document from a requests response: you can iterate over the items of the nested dicts and lists as they are parsed from the stream, without waiting for the whole document to download.

The example below streams a ~211MB JSON document of the world's COVID-19 statistics:

import requests
import json_stream.requests

json_url = 'https://covid.ourworldindata.org/data/owid-covid-data.json'

# stream=True so requests does not read the whole body into memory up front
with requests.get(json_url, stream=True) as response:
    # load() returns a lazy, transient view over the JSON document;
    # items become available as the underlying bytes arrive
    data = json_stream.requests.load(response)
    for name, record in data.items():
        print('Region:', name)
        for key, value in record.items():
            if key == 'data':
                # 'data' holds a list of daily entries, itself streamed lazily
                for entry in value:
                    for k, v in entry.items():
                        print(k, v)
            else:
                print(key, value)

Demo: https://replit.com/@blhsing1/DeliriousLustrousSubweb
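If you would rather avoid a third-party dependency, the element-by-element idea can be approximated with the standard library's json.JSONDecoder.raw_decode, feeding it text chunks (e.g. decoded from response.iter_content) as they arrive. This is a minimal sketch under narrow assumptions: the payload is a well-formed top-level JSON array, and the helper name iter_json_array is mine. A real streaming parser such as json-stream (or ijson) handles arbitrary nesting and edge cases far more robustly:

```python
import json

def iter_json_array(chunks):
    """Yield the elements of a top-level JSON array one at a time,
    decoding from an iterable of text chunks instead of one big string.

    Illustrative sketch only: assumes well-formed input that is a
    JSON array at the top level.
    """
    decoder = json.JSONDecoder()
    chunks = iter(chunks)
    buf = ''
    # Read until the opening '[' of the array has arrived.
    while '[' not in buf:
        buf += next(chunks)
    buf = buf[buf.index('[') + 1:]
    while True:
        # Skip whitespace and the comma separating elements.
        buf = buf.lstrip().lstrip(',').lstrip()
        if buf.startswith(']'):
            return  # end of the array
        try:
            value, end = decoder.raw_decode(buf)
        except json.JSONDecodeError:
            buf += next(chunks)  # element split across chunks: read more
            continue
        if end == len(buf):
            # The value may continue in the next chunk (e.g. '12' vs '123'),
            # so try to read more before accepting it.
            try:
                buf += next(chunks)
                continue
            except StopIteration:
                pass
        yield value
        buf = buf[end:]
```

Because it is a generator, only one element is materialized at a time, which is the behaviour asked about; the trade-off is that the whole-document APIs (resp.json(), ujson.loads) are gone, so anything that needs random access must collect the elements itself.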
