I'm trying to read a JSON file into a Pandas DataFrame, using the following function:

import sys
import pandas as pd

def read_JSON_into_dataframe( file_name ):
    # Read from stdin when no file name is given, otherwise from the named file.
    with sys.stdin if file_name is None else open( file_name, "r", encoding='utf8', errors='ignore' ) as reader:
        df = pd.read_json( reader )
        print( df.describe(), file = sys.stderr )
        return df

However, I'm getting an error, for which the bottom stack frame is:

C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\json\json.py in _parse_no_numpy(self)
    869         if orient == "columns":
    870             self.obj = DataFrame(
--> 871                 loads(json, precise_float=self.precise_float), dtype=None)
    872         elif orient == "split":
    873             decoded = {str(k): v for k, v in compat.iteritems(

ValueError: Trailing data

What does "trailing data" refer to? If it refers to some point in the JSON file, is there something I can do to figure out where that is and what's wrong with it?

2 Answers

df = pd.read_json ("filename.json", lines = True)
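
To illustrate why this helps, here is a minimal sketch using only the standard json module rather than pandas. The sample string is hypothetical and assumes the file holds one JSON object per line. Parsed as a single document, everything after the first object counts as trailing data; parsed line by line, which is the layout lines=True tells read_json to expect, each record decodes cleanly.

```python
import json

# Hypothetical sample in "one JSON object per line" layout.
sample = '{"a": 1, "b": 2}\n{"a": 3, "b": 4}\n'

# Parsed as a single document, the second object is unexpected extra data.
try:
    json.loads(sample)
    whole_file_error = None
except json.JSONDecodeError as err:
    whole_file_error = err.msg

# Parsed line by line, every record decodes on its own.
records = [json.loads(line) for line in sample.splitlines() if line.strip()]

print(whole_file_error)
print(records)
```

If the file is not in this one-object-per-line layout, lines=True will not help and the file itself likely needs repair.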

3 Comments

You should provide more information about this solution.
In my case I was trying to read a file with one JSON object per line, and df = pd.read_json ("filename.json", lines = True) solved the issue.
Setting lines=True reads the file as one JSON object per line; see the documentation for pandas.read_json.

I ran the following experiment:

  • Took a properly formatted JSON file.
  • Opened it with a text editor and added " xxxx" after the final "}".
  • Attempted to read it, calling data = json.load(...).

The full error message was:

JSONDecodeError: Extra data: line 112 column 3 (char 6124)

As you can see, the message states precisely the line and column at which the extra text was found.

Look at that position in your input file. It is probably corrupted in some way, e.g. a "{" character was deleted.

To find the source of the problem you can even use Notepad++. If you place the cursor either before or after a "{", that character and its matching closing "}" are displayed in red. The same applies to "[" and "]".

So this way you can locate matching opening / closing braces or brackets and find out what is missing.

Of course, json.load will not give you a DataFrame, but it does indicate precisely where the problem occurred. Once you have found and corrected the error, run your program again.
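
The approach above can be sketched as a small helper. This is only an illustration: the function name locate_json_error and the context parameter are made up for this example, and it relies on the lineno, colno and pos attributes that json.JSONDecodeError provides.

```python
import json

def locate_json_error(text, context=20):
    """Return None if text is valid JSON, else (line, column, snippet)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # err.pos is the character offset where parsing failed;
        # slice a little text around it to show what is there.
        start = max(err.pos - context, 0)
        snippet = text[start:err.pos + context]
        return err.lineno, err.colno, snippet
    return None

# Same experiment as above: valid JSON with " xxxx" after the final "}".
broken = '{\n  "a": 1\n} xxxx'
print(locate_json_error(broken))
```

For a real file you would pass in the text read from it, for example the result of reader.read() in the question's function.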

1 Comment

Using a JSON validator (jsonlint.com) I found that the JSON was, indeed, broken in several ways, but they were easily fixed in emacs. Once I did that, read_json worked.
