
Using pyjq I am able to parse values from a JSON file. I need to format the output a bit more so that it can be exported to CSV.

import json
import csv
import pyjq

# Open the file in a context manager so it is closed after parsing
with open('example.json', 'r') as emp_data:
    emp_data_parsed = json.load(emp_data)
emp = pyjq.all('.base[].base[].uid, .base[].base[].name', emp_data_parsed)
print(emp)

The output I am getting

[u'2da21174-0af8-4b5b-b02e-2957a24d70e1', u'fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba', u'4ecf6450-7307-466c-bf19-663ba2fbaf69', None, u'Tommy', u'Sam',

I expect output like the following, so that it can be written to a CSV file.

uid,name
'2da21174-0af8-4b5b-b02e-2957a24d70e1','None'
'fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba','Tommy'
'4ecf6450-7307-466c-bf19-663ba2fbaf69','Sam'

The following is the example.json file:

example.json
{
    "base": [
        { 
            "base": [
                {
                    "item-number": 1, 
                    "type": "access-item", 
                    "uid": "2da21174-0af8-4b5b-b02e-2957a24d70e1",  
                    "usage": { 
                        "last-date": {
                            "iso-8601": "2018-03-19T03:58-0500", 
                        }, 
                    }, 

                    "item-number": 2, 
                    "name": "Tommy",
                    "type": "access-item", 
                    "uid": "fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba", 

                    "item-number": 3, 
                    "name": "Sam",
                    "type": "access-item", 
                    "uid": "4ecf6450-7307-466c-bf19-663ba2fbaf69", 
                    "usage": { 
                        "last-date": {
                            "iso-8601": "2018-03-21T07:21-0500", 
                        }, 
                    },
                }
            ], 
        }
    ], 
}

I am not sure whether there is a way of doing this other than with pyjq. If so, please let me know.

  • Relevant: convert-nested-json-to-csv Commented Sep 25, 2018 at 12:43
  • @stovfi, no, this is to fetch only the required values from each section and display them or redirect them to CSV. The other post is about converting the whole JSON to CSV. Commented Sep 25, 2018 at 13:01

2 Answers


Question: I need to format the output a bit more so that it can be exported to CSV.

I can't test with pyjq; guessing from the project description, try:

pyjq.all('.base[].base[] | {"uid": .uid, "item-number": ."item-number"}', emp_data_parsed)

Loop your JSON like this:

for rec in emp_data_parsed['base'][0]['base']:
    print("{}".format(rec))

Output:

{'uid': '2da21174-0af8-4b5b-b02e-2957a24d70e1', 'item-number': 1}, ... (omitted for brevity)
{'uid': 'fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba', 'item-number': 2}, ... (omitted for brevity)
{'uid': '4ecf6450-7307-466c-bf19-663ba2fbaf69', 'item-number': 3}, ... (omitted for brevity)

The output is ready for csv.DictWriter (see the csv.DictWriter documentation), for example:

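import csv

# newline='' is what the csv module docs recommend for files passed to csv.writer
with open('test.csv', 'w', newline='') as csv_file:
    # Columns the question asks for; restval fills in records that have no 'name'
    fieldnames = ['uid', 'name']
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames, restval='None', extrasaction='ignore')
    writer.writeheader()

    for record in emp_data_parsed['base'][0]['base']:
        writer.writerow(record)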

Output:

uid,name
2da21174-0af8-4b5b-b02e-2957a24d70e1,None
fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba,Tommy
4ecf6450-7307-466c-bf19-663ba2fbaf69,Sam
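
One thing to watch: the loop above reads only emp_data_parsed['base'][0], while the jq path .base[].base[] visits every outer element. The following is a minimal sketch that mirrors the jq path in plain Python. It assumes example.json has been repaired into valid JSON with one object per item; the inline sample data and the helper name rows_to_csv are illustrative and not part of the original post.

```python
import csv
import io

# Illustrative stand-in for a repaired example.json (assumption: one object per item).
emp_data_parsed = {
    "base": [
        {
            "base": [
                {"item-number": 1, "uid": "2da21174-0af8-4b5b-b02e-2957a24d70e1"},
                {"item-number": 2, "uid": "fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba", "name": "Tommy"},
                {"item-number": 3, "uid": "4ecf6450-7307-466c-bf19-663ba2fbaf69", "name": "Sam"},
            ]
        }
    ]
}


def rows_to_csv(data, out, fieldnames=("uid", "name")):
    """Write one CSV row per inner record, like the jq path .base[].base[]."""
    writer = csv.DictWriter(out, fieldnames=list(fieldnames),
                            restval="None", extrasaction="ignore")
    writer.writeheader()
    for outer in data["base"]:        # every outer element, not only [0]
        for record in outer["base"]:  # every inner record
            writer.writerow(record)


buffer = io.StringIO()
rows_to_csv(emp_data_parsed, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a file instead of an in-memory buffer, pass a handle opened with open('test.csv', 'w', newline='') as the out argument.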

2 Comments

Absolutely perfect, @stovfl. Just one minor edit, please: change with open('test.csv', 'w') as csv_file to with open('test.csv', 'w') as csv_file: (with the trailing colon). This worked like a charm. Thanks a lot.
No, I tried emp = pyjq.all('.base[].base[] | {"uid": .uid, "name": .name}', emp_data_parsed), and it worked.

Interesting. I know jq; a Python wrapper for it is a good idea.

I use jq for my data processing, along with grep, head, etc. :) When I need to work with CSV, I'd rather write a CSV-to-JSONL (or vice versa) program once and then use it as another tool in the shell pipeline.

# to_csv.py
import csv, json, sys
rows = [json.loads(line) for line in sys.stdin]
all_keys = []
for row in rows:
    for key in row.keys():
        if key not in all_keys:
            all_keys.append(key)
writer = csv.DictWriter(sys.stdout, fieldnames=all_keys, extrasaction='ignore')
writer.writeheader()
for row in rows:
    writer.writerow(row)

Usage (I had to fix the example.json a little bit):

$ cat example.json | jq -c '.base[].base[] | { uid, name }' | python3 to_csv.py
uid,name
2da21174-0af8-4b5b-b02e-2957a24d70e1,
fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba,Tommy
4ecf6450-7307-466c-bf19-663ba2fbaf69,Sam
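
For the opposite direction mentioned above (CSV to JSONL), a rough sketch could look like the following. The function name csv_to_jsonl and the in-memory sample are illustrative assumptions; every value comes back as a string, and an empty CSV cell becomes an empty string rather than null.

```python
import csv
import io
import json


def csv_to_jsonl(infile, outfile):
    """Read CSV rows from infile and write one JSON object per line to outfile."""
    for row in csv.DictReader(infile):
        outfile.write(json.dumps(row) + "\n")


# Demonstration on in-memory text instead of real files.
sample = io.StringIO(
    "uid,name\n"
    "2da21174-0af8-4b5b-b02e-2957a24d70e1,\n"
    "fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba,Tommy\n"
)
out = io.StringIO()
csv_to_jsonl(sample, out)
jsonl_text = out.getvalue()
print(jsonl_text)
```

In a shell pipeline the same function would be called as csv_to_jsonl(sys.stdin, sys.stdout), after importing sys.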

2 Comments

Yes, within Python pyjq.all('.base[].base[].uid, .base[].base[].name') works perfectly. BTW, do you know how to cut only the date from the following JSON value: "iso-8601": "2018-03-19T03:24-0500"?
In Python I would use a substring :) row["iso-8601"][:10]
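
To spell out the substring idea, here is a small sketch with two options: plain slicing as suggested, and parsing with datetime.strptime for cases where validating the value matters. The variable names are illustrative, and the format string assumes the timestamp always has the shape shown in the question (minutes precision, numeric offset without a colon).

```python
from datetime import datetime

stamp = "2018-03-19T03:24-0500"  # value quoted in the comment above

# Option 1: plain slicing; the first 10 characters are YYYY-MM-DD.
date_by_slice = stamp[:10]

# Option 2: parse, then format; raises ValueError if the value is malformed.
parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M%z")
date_by_parse = parsed.date().isoformat()

print(date_by_slice, date_by_parse)
```

Slicing is cheaper and keeps the date exactly as written; parsing costs a little more but rejects values that do not match the expected shape.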
