Parse multiple JSON values using pyjq, separated by commas

Question

Using pyjq I am able to parse values from a JSON file. I need to format the output a bit more so that it can be exported to csv.

import json
import csv
import pyjq

# Read and parse the file; the with-block closes it afterwards.
with open('example.json', 'r') as emp_data:
    emp_data_parsed = json.load(emp_data)
emp = pyjq.all('.base[].base[].uid, .base[].base[].name', emp_data_parsed)
print(emp)

The output I am getting

[u'2da21174-0af8-4b5b-b02e-2957a24d70e1', u'fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba', u'4ecf6450-7307-466c-bf19-663ba2fbaf69', None, u'Tommy', u'Sam',

I expect output like the following, so that it can be written to a csv file.

uid,name
'2da21174-0af8-4b5b-b02e-2957a24d70e1','None'
'fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba','Tommy'
'4ecf6450-7307-466c-bf19-663ba2fbaf69','Sam'

Following is the example.json file

example.json
{
    "base": [
        { 
            "base": [
                {
                    "item-number": 1, 
                    "type": "access-item", 
                    "uid": "2da21174-0af8-4b5b-b02e-2957a24d70e1",  
                    "usage": { 
                        "last-date": {
                            "iso-8601": "2018-03-19T03:58-0500", 
                        }, 
                    }, 

                    "item-number": 2, 
                    "name": "Tommy",
                    "type": "access-item", 
                    "uid": "fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba", 

                    "item-number": 3, 
                    "name": "Sam",
                    "type": "access-item", 
                    "uid": "4ecf6450-7307-466c-bf19-663ba2fbaf69", 
                    "usage": { 
                        "last-date": {
                            "iso-8601": "2018-03-21T07:21-0500", 
                        }, 
                    },
                }
            ], 
        }
    ], 
}

I am not sure whether there is a way of doing this other than pyjq. If so, please let me know.

@stovfl, no, this is to fetch only the required values from each section and display them or redirect them to csv. The other post is about converting the whole json to csv.
– Rio, Commented Sep 25, 2018 at 13:01

stovfl · Accepted Answer · 2018-09-25 14:48:11Z

2

Question: I need to format the output a bit more so that it can be exported to csv.

I can't test with pyjq; guessing from the project description, try:

pyjq.all('.base[].base[] | {"uid": .uid, "item-number": ."item-number"}', emp_data_parsed)

Loop your JSON like this:

for rec in emp_data_parsed['base'][0]['base']:
    print("{}".format(rec))

Output:

{'uid': '2da21174-0af8-4b5b-b02e-2957a24d70e1', 'item-number': 1}, ... (omitted for brevity)
{'uid': 'fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba', 'item-number': 2}, ... (omitted for brevity)
{'uid': '4ecf6450-7307-466c-bf19-663ba2fbaf69', 'item-number': 3}, ... (omitted for brevity)

The output is ready for csv.DictWriter (see the csv.DictWriter documentation), for example:

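import csv

with open('test.csv', 'w') as csv_file:
    # Columns to export; other keys are dropped by extrasaction='ignore'.
    fieldnames = ['uid', 'name']
    # restval is written for records that have no 'name' key.
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames,
                            restval='None', extrasaction='ignore')
    writer.writeheader()

    for record in emp_data_parsed['base'][0]['base']:
        writer.writerow(record)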

Output:

uid,name
2da21174-0af8-4b5b-b02e-2957a24d70e1,None
fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba,Tommy
4ecf6450-7307-466c-bf19-663ba2fbaf69,Sam
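
To try the same idea without the original file or pyjq, here is a minimal sketch in plain Python. The data dictionary is hand-written sample data shaped like the question's example.json (an assumption, since the file as posted is not valid JSON), and iter_items and to_csv_text are hypothetical helper names. Unlike the loop above, it walks every outer "base" entry rather than only the first, and it writes to an in-memory buffer instead of test.csv.

```python
import csv
import io

# Hypothetical sample data shaped like the question's example.json.
data = {
    "base": [
        {
            "base": [
                {"item-number": 1, "type": "access-item",
                 "uid": "2da21174-0af8-4b5b-b02e-2957a24d70e1"},
                {"item-number": 2, "name": "Tommy", "type": "access-item",
                 "uid": "fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba"},
                {"item-number": 3, "name": "Sam", "type": "access-item",
                 "uid": "4ecf6450-7307-466c-bf19-663ba2fbaf69"},
            ]
        }
    ]
}


def iter_items(doc):
    """Yield every inner record, similar to the jq path .base[].base[]."""
    for group in doc.get("base", []):
        for item in group.get("base", []):
            yield item


def to_csv_text(doc, fieldnames, missing="None"):
    """Return CSV text with one line per inner record (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        restval=missing,        # written when a record lacks a field
        extrasaction="ignore",  # skip keys not listed in fieldnames
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(iter_items(doc))
    return buffer.getvalue()


csv_text = to_csv_text(data, ["uid", "name"])
print(csv_text)
```

Passing a different fieldnames list, for example ["uid", "item-number"], selects other columns without changing the rest of the code.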

edited Sep 25, 2018 at 14:48

answered Sep 25, 2018 at 14:19

stovfl


2 Comments

Rio Over a year ago

Absolutely perfect @stovfl. Please make just one minor edit: change "with open('test.csv', 'w') as csv_file" to "with open('test.csv', 'w') as csv_file:" (with the trailing colon). This worked like a charm. Thanks a lot.

Rio Over a year ago

No, I tried with emp = pyjq.all('.base[].base[] | {"uid": .uid, "name": .name}', emp_data_parsed), and it worked.

Messa · Accepted Answer · 2018-09-25 14:38:58Z

1

Interesting. I know jq, and a Python wrapper for it is a good idea.

I use jq for my data processing, along with grep, head etc. :) When I need to work with CSV, I would rather write a CSV-to-JSONL (or vice versa) program once and then use it as another tool in the shell pipeline.

# to_csv.py
import csv, json, sys
rows = [json.loads(line) for line in sys.stdin]
all_keys = []
for row in rows:
    for key in row.keys():
        if key not in all_keys:
            all_keys.append(key)
writer = csv.DictWriter(sys.stdout, fieldnames=all_keys, extrasaction='ignore')
writer.writeheader()
for row in rows:
    writer.writerow(row)

Usage (I had to fix the example.json a little bit):

$ cat example.json | jq -c '.base[].base[] | { uid, name }' | python3 to_csv.py
uid,name
2da21174-0af8-4b5b-b02e-2957a24d70e1,
fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba,Tommy
4ecf6450-7307-466c-bf19-663ba2fbaf69,Sam
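
The conversion step can also be tried without jq or a shell. The sketch below wraps the same logic in a function that takes any iterable of JSON lines and any writable text stream; jsonl_to_csv is a hypothetical name, and the sample lines are hand-written to resemble what the jq -c filter above would emit for the fixed example.json (an assumption).

```python
import csv
import io
import json


def jsonl_to_csv(lines, out):
    """Write one CSV row per JSON object line; columns in first-seen order."""
    rows = [json.loads(line) for line in lines if line.strip()]
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hand-written sample input, one compact JSON object per line.
sample = io.StringIO(
    '{"uid":"2da21174-0af8-4b5b-b02e-2957a24d70e1","name":null}\n'
    '{"uid":"fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba","name":"Tommy"}\n'
    '{"uid":"4ecf6450-7307-466c-bf19-663ba2fbaf69","name":"Sam"}\n'
)
result = io.StringIO()
jsonl_to_csv(sample, result)
print(result.getvalue())
```

A JSON null becomes None in Python, and the csv module writes None as an empty field, which is why the first row ends with a bare comma.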

answered Sep 25, 2018 at 14:38

Messa


2 Comments

Rio Over a year ago

Yes, within Python pyjq.all('.base[].base[].uid, .base[].base[].name') works perfectly. BTW, do you know how to extract only the date from the following JSON value: "iso-8601": "2018-03-19T03:24-0500"?

Messa Over a year ago

In Python I would use a substring :) row["iso-8601"][:10]
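
As a rough sketch of that substring idea: the record below is hand-written sample data, and the nesting under "usage" and "last-date" is assumed from the question's example.json. Both function names are hypothetical. The second variant parses the timestamp with datetime.strptime, so a malformed value raises ValueError instead of being sliced silently.

```python
from datetime import datetime

# Hypothetical record; nesting assumed from the question's example.json.
record = {
    "uid": "2da21174-0af8-4b5b-b02e-2957a24d70e1",
    "usage": {"last-date": {"iso-8601": "2018-03-19T03:24-0500"}},
}


def get_stamp(rec):
    """Return the raw timestamp string, or None if any level is missing."""
    return rec.get("usage", {}).get("last-date", {}).get("iso-8601")


def last_date(rec):
    """Return the first ten characters (YYYY-MM-DD), or None."""
    stamp = get_stamp(rec)
    return stamp[:10] if stamp else None


def last_date_checked(rec):
    """Same result, but validates the timestamp format while parsing."""
    stamp = get_stamp(rec)
    if not stamp:
        return None
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M%z")
    return parsed.date().isoformat()


print(last_date(record))
print(last_date_checked(record))
print(last_date({"uid": "no-usage-section"}))
```

Records without a "usage" section, like the second item in the question's data, yield None rather than raising KeyError.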
