
I have a Python 3.8.5 script that gets JSON from an API, saves it to disk, and then reads the JSON into a DataFrame. It works.

df = pd.io.json.read_json('json_file', orient='records')

I want to use an in-memory IO buffer instead so I don't have to write to and read from disk, but I am getting an error. The code looks like this:

import json
import pandas as pd
from io import StringIO
io = StringIO()
json_out = []
# some code to append API results to json_out
json.dump(json_out, io)
df = pd.io.json.read_json(io.getvalue())

That last line raises this error:

  File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\util\_decorators.py", line 199, in wrapper
    return func(*args, **kwargs)

  File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\util\_decorators.py", line 296, in wrapper
    return func(*args, **kwargs)

  File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 618, in read_json
    result = json_reader.read()

  File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 755, in read
    obj = self._get_object_parser(self.data)

  File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 777, in _get_object_parser
    obj = FrameParser(json, **kwargs).parse()

  File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 886, in parse
    self._parse_no_numpy()

  File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 1119, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None

ValueError: Trailing data

The JSON is a list of objects. This is not the actual data, but it looks like this when I write it to disk:

json = [
    {"state": "North Dakota",
     "address": "123 30th st E #206",
     "account": "123"
     },
    {"state": "North Dakota",
     "address": "456 30th st E #206",
     "account": "456"
     }
]

Given that it worked in the first case (write/read from disk), I don't know how to troubleshoot. How do I troubleshoot something in the buffer? The actual data is mostly text but has some number fields.

1 Answer

I don't know what's going wrong for you; this works for me:

import json
import pandas as pd
from io import StringIO

json_out = [
    {"state": "North Dakota",
     "address": "123 30th st E #206",
     "account": "123"
     },
    {"state": "North Dakota",
     "address": "456 30th st E #206",
     "account": "456"
     }
]

io = StringIO()
json.dump(json_out, io)
df = pd.io.json.read_json(io.getvalue())
print(df)

That leads me to believe there's something wrong with the code that appends the API data.
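To troubleshoot what is actually in the buffer, one option is to parse `io.getvalue()` with the standard `json` module first, since its error reports the offset where parsing stopped. The sketch below assumes one possible cause that I cannot confirm from the question: calling `json.dump` more than once on the same buffer (for example once per API page), which writes two JSON documents back to back. The data and the name `buffer` are made up for illustration.

```python
import json
from io import StringIO

buffer = StringIO()

# Hypothetical bug: dumping once per API page leaves "[...][...]" in the buffer,
# which is two JSON documents rather than one.
json.dump([{"account": "123"}], buffer)
json.dump([{"account": "456"}], buffer)

text = buffer.getvalue()
error = None
try:
    json.loads(text)
except json.JSONDecodeError as exc:
    error = exc
    # exc.pos is the character offset where the parser gave up.
    print(exc.msg, "at offset", exc.pos)
    # Show the text around the failure to see what follows the first document.
    print(repr(text[max(0, exc.pos - 20):exc.pos + 20]))
```

If this is what is happening, collecting all results in `json_out` and dumping once at the end should avoid the error.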

However, if you already have a list of dictionaries, you don't need the IO step at all. You can just do:

pd.DataFrame(json_out)
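For comparison, here is a minimal sketch of both routes side by side, using the sample records from the question. It passes the buffer itself to `pd.read_json` rather than the string from `getvalue()`, which requires rewinding with `seek(0)` first. As far as I know, `read_json` infers dtypes by default, so string fields that look numeric (like `account`) may come back as numbers; `dtype=False` is used here on the assumption that you want them left as text.

```python
import json
from io import StringIO

import pandas as pd

json_out = [
    {"state": "North Dakota", "address": "123 30th st E #206", "account": "123"},
    {"state": "North Dakota", "address": "456 30th st E #206", "account": "456"},
]

# Route 1: build the DataFrame straight from the list of dicts.
df_direct = pd.DataFrame(json_out)

# Route 2: serialize into an in-memory buffer and read it back.
buffer = StringIO()
json.dump(json_out, buffer)
buffer.seek(0)  # rewind; after dump() the position is at the end of the buffer
df_buffer = pd.read_json(buffer, orient="records", dtype=False)

print(df_direct)
print(df_buffer)
```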

EDIT: I think I remember seeing this error when there was a trailing comma in my JSON, like so:

[
  {
    "hello":"world",
  },
]
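A quick way to check for this kind of problem is to run the text through `json.loads`, which rejects trailing commas. Note that `json.dump` and `json.dumps` never emit them, so this would only apply if the JSON text was assembled by hand or came from the API as a raw string. The strings below are illustrative only.

```python
import json

# Hand-written JSON with trailing commas, as in the snippet above.
bad = '[{"hello": "world",},]'

try:
    json.loads(bad)
    parsed_ok = True
except json.JSONDecodeError as exc:
    parsed_ok = False
    print("Invalid JSON:", exc.msg, "at offset", exc.pos)

# Serializing a Python object with json.dumps produces no trailing commas.
good = json.dumps([{"hello": "world"}])
print(good)
```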

1 Comment

Oh, I didn't try pd.DataFrame(json_out); that's so simple and seems to work. Thanks for pointing out the obvious that I missed.
