I am trying to read a .json file in Python with pandas. Here is my code:

import pandas as pd

df_idf = pd.read_json('/home/lazzydevs/Data/datajs.json',lines = True)

print("Schema:\n\n",df_idf.dtypes)
print("Number of questions,columns=",df_idf.shape)

I checked my JSON file and it is valid. Here is the file:

[{
  "id": "4821394",
  "title": "Serializing a private struct - Can it be done?",
  "body": "\u003cp\u003eI have a public class that contains a private struct. The struct contains properties (mostly string) that I want to serialize. When I attempt to serialize the struct and stream it to disk, using XmlSerializer, I get an error saying only public types can be serialized. I don't need, and don't want, this struct to be public. Is there a way I can serialize it and keep it private?\u003c/p\u003e",
  "answer_count": "1",
  "comment_count": "0",
  "creation_date": "2011-01-27 20:19:13.563 UTC",
  "last_activity_date": "2011-01-27 20:21:37.59 UTC",
  "last_editor_display_name": "",
  "owner_display_name": "",
  "owner_user_id": "163534",
  "post_type_id": "1",
  "score": "0",
  "tags": "c#|serialization|xml-serialization",
  "view_count": "296"
},{
  "id": "3367882",
  "title": "How do I prevent floated-right content from overlapping main content?",
  "body": "\u003cp\u003eI have the following HTML:\u003c/p\u003e\n\n\u003cpre\u003e\u003ccode\u003e\u0026lt;td class='a'\u0026gt;\n  \u0026lt;img src='/images/some_icon.png' alt='Some Icon' /\u0026gt;\n  \u0026lt;span\u0026gt;Some content that's waaaaaaaaay too long to fit in the allotted space, but which can get cut off.\u0026lt;/span\u0026gt;\n\u0026lt;/td\u0026gt;\n\u003c/code\u003e\u003c/pre\u003e\n\n\u003cp\u003eIt should display as follows:\u003c/p\u003e\n\n\u003cpre\u003e\u003ccode\u003e[Some content that's wa [ICON]]\n\u003c/code\u003e\u003c/pre\u003e\n\n\u003cp\u003eI have the following CSS:\u003c/p\u003e\n\n\u003cpre\u003e\u003ccode\u003etd.a span {\n  overflow: hidden;\n  white-space: nowrap;\n  z-index: 1;\n}\n\ntd.a img {\n  display: block;\n  float: right;\n  z-index: 2;\n}\n\u003c/code\u003e\u003c/pre\u003e\n\n\u003cp\u003eWhen I resize the browser to cut off the text, it cuts off at the edge of the \u003ccode\u003e\u0026lt;td\u0026gt;\u003c/code\u003e rather than before the \u003ccode\u003e\u0026lt;img\u0026gt;\u003c/code\u003e, which leaves the \u003ccode\u003e\u0026lt;img\u0026gt;\u003c/code\u003e overlapping the \u003ccode\u003e\u0026lt;span\u0026gt;\u003c/code\u003e content. I've tried various \u003ccode\u003epadding\u003c/code\u003e and \u003ccode\u003emargin\u003c/code\u003es, but nothing seemed to work. Is this possible?\u003c/p\u003e\n\n\u003cp\u003eNB: It's \u003cem\u003every\u003c/em\u003e difficult to add a \u003ccode\u003e\u0026lt;td\u0026gt;\u003c/code\u003e that just contains the \u003ccode\u003e\u0026lt;img\u0026gt;\u003c/code\u003e here. If it were easy, I'd just do that :)\u003c/p\u003e",
  "accepted_answer_id": "3367943",
  "answer_count": "2",
  "comment_count": "2",
  "creation_date": "2010-07-30 00:01:50.9 UTC",
  "favorite_count": "0",
  "last_activity_date": "2012-05-10 14:16:05.143 UTC",
  "last_edit_date": "2012-05-10 14:16:05.143 UTC",
  "last_editor_display_name": "",
  "last_editor_user_id": "44390",
  "owner_display_name": "",
  "owner_user_id": "1190",
  "post_type_id": "1",
  "score": "2",
  "tags": "css|overflow|css-float|crop",
  "view_count": "4121"
}]

But every time I try to read the file in Python, I get this error:

Traceback (most recent call last):
  File "/home/lazzydevs/Desktop/tfstack.py", line 4, in <module>
    df_idf = pd.read_json('/home/lazzydevs/Data/datajs.json',lines = True)
  File "/home/lazzydevs/.local/lib/python3.7/site-packages/pandas/io/json/_json.py", line 592, in read_json
    result = json_reader.read()
  File "/home/lazzydevs/.local/lib/python3.7/site-packages/pandas/io/json/_json.py", line 715, in read
    obj = self._get_object_parser(self._combine_lines(data.split("\n")))
  File "/home/lazzydevs/.local/lib/python3.7/site-packages/pandas/io/json/_json.py", line 739, in _get_object_parser
    obj = FrameParser(json, **kwargs).parse()
  File "/home/lazzydevs/.local/lib/python3.7/site-packages/pandas/io/json/_json.py", line 849, in parse
    self._parse_no_numpy()
  File "/home/lazzydevs/.local/lib/python3.7/site-packages/pandas/io/json/_json.py", line 1093, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None
ValueError: Expected object or value

I have checked many posts, but none of the suggestions worked. I don't know what the problem is.

  • @luigigi Is there any way to read multiple dicts? Commented Dec 13, 2019 at 7:01
  • @Vinay I checked it with online validators, and they report that it is a valid JSON file. Commented Dec 13, 2019 at 7:08
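
On the comment about reading multiple dicts: a file shaped like the one in the question is a single JSON array of objects, so the standard library parses it into a list of dicts in one call. The sketch below assumes a shortened, hypothetical two-record sample (only a few of the original keys) written to a temporary file, not the real datajs.json.

```python
import json
import os
import tempfile

# Hypothetical two-record sample mirroring the shape of the file in the question.
records = [
    {"id": "4821394", "score": "0", "tags": "c#|serialization|xml-serialization"},
    {"id": "3367882", "score": "2", "tags": "css|overflow|css-float|crop"},
]

# Write it as one pretty-printed JSON array, like the file in the question.
path = os.path.join(tempfile.mkdtemp(), "sample.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(records, f, indent=2)

# json.load parses the whole document at once and returns a list of dicts.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

for row in loaded:
    # Each element is an ordinary dict; the tags are pipe-separated strings.
    print(row["id"], row["tags"].split("|"))
```

Each element of `loaded` is a plain dict, so the records can be inspected one by one before handing them to pandas.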

1 Answer

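The following code works on my machine. The only change is dropping lines=True: that option tells pandas to parse every line of the file as a separate JSON object (the JSON Lines layout), whereas your file is a single JSON array spread over many lines.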

import pandas as pd

df_idf = pd.read_json('/home/lazzydevs/Data/datajs.json') 
print("Schema:\n\n",df_idf.dtypes)
print("Number of questions,columns=",df_idf.shape)
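
A minimal sketch of the difference, assuming a hypothetical two-record sample written to a temporary directory rather than the real datajs.json. It reads the same pretty-printed array with and without lines=True, then rewrites it as JSON Lines, the layout that lines=True expects.

```python
import json
import os
import tempfile

import pandas as pd

# Hypothetical sample with a few of the keys from the question's file.
records = [
    {"id": "4821394", "title": "Serializing a private struct - Can it be done?", "score": "0"},
    {"id": "3367882", "title": "How do I prevent floated-right content from overlapping main content?", "score": "2"},
]

tmp_dir = tempfile.mkdtemp()
array_path = os.path.join(tmp_dir, "array.json")
with open(array_path, "w", encoding="utf-8") as f:
    json.dump(records, f, indent=2)  # one array spread over many lines

# Default mode: the whole file is parsed as one JSON document.
df_array = pd.read_json(array_path)
print(df_array.shape)

# lines=True: pandas treats each physical line as its own JSON value,
# which cannot work for a multi-line array.
try:
    pd.read_json(array_path, lines=True)
    lines_error = None
except ValueError as exc:
    lines_error = exc
print("lines=True failed with:", lines_error)

# Writing one record per line produces a file that lines=True can read.
lines_path = os.path.join(tmp_dir, "records.jsonl")
df_array.to_json(lines_path, orient="records", lines=True)
df_lines = pd.read_json(lines_path, lines=True)
print(df_lines.shape)
```

Both successful reads give the same number of rows and columns; only the on-disk layout differs.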

1 Comment

Yes, same here. The lines=True argument was probably the issue.
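
One way to check this before calling pandas is to look at how the file is laid out. The helper below is a hypothetical sketch (the name `detect_json_layout` and its return labels are made up for illustration) that uses only the standard library to tell a single JSON document from a one-value-per-line file.

```python
import json


def detect_json_layout(text):
    """Classify text as 'document', 'lines', or 'invalid'.

    'document': the whole text is one JSON value (e.g. an array of objects).
    'lines':    every non-empty line is its own JSON value (JSON Lines).
    'invalid':  neither interpretation parses.
    """
    try:
        json.loads(text)
        return "document"
    except json.JSONDecodeError:
        pass

    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "invalid"
    try:
        for line in lines:
            json.loads(line)
    except json.JSONDecodeError:
        return "invalid"
    return "lines"


# A multi-line array, like the file in the question.
array_text = '[\n  {"id": "1"},\n  {"id": "2"}\n]'
# One object per line, the layout lines=True is meant for.
jsonl_text = '{"id": "1"}\n{"id": "2"}\n'

print(detect_json_layout(array_text))
print(detect_json_layout(jsonl_text))
```

A result of "document" suggests calling pd.read_json(path) without lines=True, while "lines" suggests passing lines=True. A file holding a single value on a single line is reported as "document", because the whole-text check runs first.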
