I have the following "stacked JSON" file, example1.json, that I want to read into R:

{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes",
  "Code":[{"event1":"A","result":"1"},…]}
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No",
  "Code":[{"event1":"B","result":"1"},…]}
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No",
  "Code":[{"event1":"B","result":"0"},…]}

The objects are not comma-separated. The goal is to parse certain fields (or all fields) into an R data.frame or data.table:

    Timestamp    Usefulness
 0   20140101      Yes
 1   20140102      No
 2   20140103      No

Normally, I would read a JSON file into R as follows:

library(jsonlite)

jsonfile = "example1.json"
foobar = fromJSON(jsonfile)

This, however, throws a parsing error:

Error: lexical error: invalid char in json text.
          [{"event1":"A","result":"1"},…]} {"ID":"1A35B","Timestamp"
                     (right here) ------^

This is a similar question to the following, but in R: multiple Json objects in one file extract by python

EDIT: This file format is called "newline-delimited JSON" (NDJSON).

  • Are there really newlines before "Code" or did you do that for readability? I also assume the ... is you and not the JSON. If they are files with one compact JSON record per-line, they are "ndjson" files and you can use ndjson::stream_in() which is faster than the jsonlite counterpart and always produces a "flat" data frame. Commented May 20, 2018 at 1:43
  • And, if it is that, this is a dup and we need to know that so it can be marked as such. Commented May 20, 2018 at 1:44
  • @hrbrmstr Yes, please mark as a duplicated question. Commented May 20, 2018 at 11:03
  • Similar to: stackoverflow.com/questions/59921946/… Commented Apr 8, 2022 at 0:55

1 Answer

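  1. The three dots (…) are not valid JSON, hence the lexical error. Note also that fromJSON() expects a single JSON document, so a file holding several top-level objects would most likely still fail to parse once the dots are removed.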

  2. You can use jsonlite::stream_in() to 'stream in' lines of JSON.


library(jsonlite)

jsonlite::stream_in(file("~/Desktop/examples1.json"))
# opening file input connection.
# Imported 3 records. Simplifying...
# closing file input connection.
#      ID Timestamp Usefulness Code
# 1 12345  20140101        Yes A, 1
# 2 1A35B  20140102         No B, 1
# 3 AA356  20140103         No B, 0
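
To get from this result to the two-column table in the question, one option is to subset the returned data frame. The following is a minimal sketch, not a definitive recipe: it inlines the cleaned records through textConnection() so it runs without a file on disk, and the names ndjson_lines, records and wanted are illustrative only.

```r
library(jsonlite)

# The three cleaned records, inlined so the sketch needs no file on disk.
ndjson_lines <- c(
  '{"ID":"12345","Timestamp":"20140101","Usefulness":"Yes","Code":[{"event1":"A","result":"1"}]}',
  '{"ID":"1A35B","Timestamp":"20140102","Usefulness":"No","Code":[{"event1":"B","result":"1"}]}',
  '{"ID":"AA356","Timestamp":"20140103","Usefulness":"No","Code":[{"event1":"B","result":"0"}]}'
)

# stream_in() takes a connection; verbose = FALSE silences the progress messages.
records <- stream_in(textConnection(ndjson_lines), verbose = FALSE)

# Keep only the columns asked for in the question.
wanted <- records[, c("Timestamp", "Usefulness")]
print(wanted)
```

With a real file, file("~/Desktop/examples1.json") would take the place of the textConnection() call.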

Data

I've cleaned your example data to make it valid JSON and saved it as ~/Desktop/examples1.json:

{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes","Code":[{"event1":"A","result":"1"}]}
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No","Code":[{"event1":"B","result":"1"}]}
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No","Code":[{"event1":"B","result":"0"}]}