0

I was wondering how I could import a JSON file, and then save that to an ordered CSV file, with header row and the applicable data below.

Here's what the JSON file looks like:

 [
  {
    "firstName": "Nicolas Alexis Julio",
    "lastName": "N'Koulou N'Doubena",
    "nickname": "N. N'Koulou",
    "nationality": "Cameroon",
    "age": 24
  },
  {
    "firstName": "Alexandre Dimitri",
    "lastName": "Song-Billong",
    "nickname": "A. Song",
    "nationality": "Cameroon",
    "age": 26,
    etc. etc. + } ]

Note there are multiple 'keys' (firstName, lastName, nickname, etc.). I would like to create a CSV file with those as the header, then the applicable info beneath in rows, with each row having a player's information.

Here's the script I have so far for Python:

import urllib2
import json
import csv

writefilerows = csv.writer(open('WCData_Rows.csv',"wb+"))


api_key = "xxxx"
url = "http://worldcup.kimonolabs.com/api/players?apikey=" + api_key + "&limit=1000"
json_obj = urllib2.urlopen(url)
readable_json = json.load(json_obj)
list_of_attributes = readable_json[0].keys()

print list_of_attributes


writefilerows.writerow(list_of_attributes)

for x in readable_json:
    writefilerows.writerow(x[list_of_attributes])

But when I run that, I get a "TypeError: unhashable type:'list'" error. I am still learning Python (obviously I suppose). I have looked around online (found this) and can't seem to figure out how to do it without explicitly stating what key I want to print...I don't want to have to list each one individually...

Thank you for any help/ideas! Please let me know if I can clarify or provide more information.

2 Answers 2

1

Your TypeError is occuring because you are trying to index a dictionary, x with a list, list_of_attributes with x[list_of_attributes]. This is not how python works. In this case you are iterating readable_json which appears it will return a dictionary with each iteration. There is no need pull values out of this data in order to write them out.

The DictWriter should give you what your looking for.

import csv
[...]

def encode_dict(d, out_encoding="utf8"):    
    '''Encode dictionary to desired encoding, assumes incoming data in unicode'''
    encoded_d = {}
    for k, v in d.iteritems():
        k = k.encode(out_encoding)
        v = unicode(v).encode(out_encoding)        
        encoded_d[k] = v
    return encoded_d

list_of_attributes = readable_json[0].keys()
# sort fields in desired order
list_of_attributes.sort()

with open('WCData_Rows.csv',"wb+") as csv_out:
    writer = csv.DictWriter(csv_out, fieldnames=list_of_attributes)
    writer.writeheader()
    for data in readable_json:
        writer.writerow(encode_dict(data))

Note: This assumes that each entry in readable_json has the same fields.

Sign up to request clarification or add additional context in comments.

5 Comments

Aha! I think that's it - however, I am getting the error "UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in posiiton 6: ordinal not in range(128)". If I'm not mistaken, that's because the player's names have accents in them, and I have to let Python know that? Do I need to use something like .encode('utf-8') somewhere?
do you know the encoding the JSON data is in?
I am not sure, but if you don't mind, you can see the JSON data here: kimonolabs.com/worldcup/explorer
character decoding/encoding is a pain, Joel has a good article on it if your interested. This should work...
I am definitely interested, do you have a link?
1

Maybe pandas could do this - but I newer tried to read JSON

import pandas as pd

df = pd.read_json( ... )

df.to_csv( ... )

pandas.DataFrame.to_csv

pandas.io.json.read_json


EDIT:

data = ''' [
  {
    "firstName": "Nicolas Alexis Julio",
    "lastName": "N'Koulou N'Doubena",
    "nickname": "N. N'Koulou",
    "nationality": "Cameroon",
    "age": 24
  },
  {
    "firstName": "Alexandre Dimitri",
    "lastName": "Song-Billong",
    "nickname": "A. Song",
    "nationality": "Cameroon",
    "age": 26,
  }
]'''

import pandas as pd

df = pd.read_json(data)

print df

df.to_csv('results.csv')

result:

   age             firstName            lastName nationality     nickname
0   24  Nicolas Alexis Julio  N'Koulou N'Doubena    Cameroon  N. N'Koulou
1   26     Alexandre Dimitri        Song-Billong    Cameroon      A. Song

With pandas you can save it in csv, excel, etc (and maybe even directly in database).

And you can do some operations on data in table and show it as graph.

3 Comments

I installed Pandas, but get an error when trying the above, "ImportError: No module named numpy".
pandas has a number of dependencies, it's best to install via pip
I haven't been able to use pip - I get some errors...but that's an issue for another thread altogether. I will keep trying though, thanks!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.