5

I want to create nested JSON from this CSV file (only a snippet is shown):

    Datum,Position,Herkunft,Entscheidungen insgesamt,Insgesamt_monat,Asylberechtigt,Asylberechtigt monat,Asylberechtigt Prozent,Flüchtling,Flüchtling monat,Flüchting Prozent,Gewährung von subisdiärem Schutz,Gewährung monat,Prozent,Abschiebungsverbot,Abschiebungsverbot monat,Prozent,Unbegrenzte Ablehnungen,Unbegrenzte Ablehnungen monat,Prozent,Ablehnung,Ablehnung monat,Prozent,sonstige Verfahrenserledigungen,,Prozent
    2015-10-01,4,Afghanistan,4540,483,37,1,0.8,1188,139,26.2,234,33,5.2,516,61,11.4,538,63,11.9,29,3,0.6,1998,183,44
    2015-09-01,4,Afghanistan,4057,397,36,8,0.9,1049,127,25.9,201,29,5,455,46,11.2,475,22,11.7,26,3,0.6,1815,162,44.7
    2015-08-01,5,Afghanistan,3660,320,28,1,0.8,922,155,25.2,172,12,4.7,409,43,11.2,453,22,12.4,23,2,0.6,1653,85,45.2
    2015-07-01,6,Afghanistan,3340,429,27,4,0.8,767,84,23,160,28,4.8,366,53,11,431,54,12.9,21,2,0.6,1568,204,46.9
    2015-06-01,6,Afghanistan,2911,639,23,2,0.8,683,184,23.5,132,41,4.5,313,64,10.8,377,74,13,19,3,0.7,1364,271,46.9
    2015-05-01,6,Afghanistan,2272,434,21,0,0.9,499,115,22,91,16,4,249,47,11,303,42,13.3,16,1,0.7,1093,213,48.1
    2015-04-01,6,Afghanistan,1838,462,21,4,1.1,384,75,20.9,75,17,4.1,202,44,11,261,60,14.2,15,4,0.8,880,258,47.9
    2015-03-01,5,Afghanistan,1376,527,17,8,1.2,309,123,22.5,58,42,4.2,158,58,11.5,201,70,14.6,11,1,0.8,622,225,45.2
    2015-02-01,5,Afghanistan,849,431,9,9,1.1,186,81,21.9,16,12,1.9,100,42,11.8,131,65,15.4,10,4,1.2,397,218,46.8
    2015-01-01,5,Afghanistan,418,418,0,0,0,105,105,25.1,4,4,1,58,58,13.9,66,66,15.8,6,6,1.4,179,179,42.8
    2015-10-01,2,Albanien,28011,7107,0,0,0,7,4,0,23,7,0.1,18,1,0.1,864,164,3.1,24688,6250,88.1,2411,681,8.6
    2015-09-01,2,Albanien,20904,7326,0,0,0,3,0,0,16,3,0.1,17,6,0.1,700,153,3.3,18438,6657,88.2,1730,507,8.3
    2015-08-01,2,Albanien,13578,3955,0,0,0,3,0,0,13,0,0.1,11,0,0.1,547,124,4,11781,3630,86.8,1223,201,9
    2015-07-01,3,Albanien,9623,4673,0,0,0,3,0,0,13,2,0.1,11,4,0.1,423,164,4.4,8151,4275,84.7,1022,228,10.6
    2015-06-01,3,Albanien,4950,2099,0,0,0,3,0,0.1,11,8,0.2,7,0,0.1,259,75,5.2,3876,1807,78.3,794,209,16
    2015-05-01,3,Albanien,2851,1210,0,0,0,3,0,0.1,3,3,0.1,7,0,0.2,184,52,6.5,2069,1001,72.6,585,154,20.5
    2015-04-01,3,Albanien,1641,799,0,0,0,3,0,0.2,0,0,0,7,1,0.4,132,49,8,1068,581,65.1,431,168,26.3
    2015-03-01,3,Albanien,842,331,0,0,0,3,1,0.4,0,0,0,6,3,0.7,83,12,9.9,487,212,57.8,263,103,31.2
    2015-02-01,4,Albanien,511,233,0,0,0,2,2,0.4,0,0,0,3,3,0.6,71,13,13.9,275,127,53.8,160,88,31.3
    2015-01-01,4,Albanien,278,278,0,0,0,0,0,0,0,0,0,0,0,0,58,58,20.9,148,148,53.2,72,72,25.9
    2015-05-01,10,Bosnien und Herzegowina,1822,227,0,0,0,1,0,0.1,0,0,0,5,2,0.3,12,0,0.7,1538,165,84.4,266,60,14.6
    2015-04-01,9,Bosnien und Herzegowina,1595,206,0,0,0,1,0,0.1,0,0,0,3,0,0.2,12,1,0.8,1373,166,86.1,206,39,12.9
    2015-03-01,9,Bosnien und Herzegowina,1389,341,0,0,0,1,0,0.1,0,0,0,3,1,0.2,11,4,0.8,1207,276,86.9,167,60,12
    2015-02-01,10,Bosnien und Herzegowina,1048,1048,0,0,0,1,1,0.1,0,0,0,2,2,0.2,7,7,0.7,931,931,88.8,107,107,10.2
    2015-10-01,7,Eritrea,5031,1153,16,2,0.3,3979,1070,79.1,326,30,6.5,19,5,0.4,23,2,0.5,5,1,0.1,663,43,13.2
    2015-09-01,8,Eritrea,3878,702,14,1,0.4,2909,519,75,296,148,7.6,14,0,0.4,21,1,0.5,4,1,0.1,620,32,16
    2015-08-01,8,Eritrea,3176,527,13,1,0.4,2390,505,75.3,148,7,4.7,14,2,0.4,20,0,0.6,3,-1,0.1,588,13,18.5
    2015-07-01,8,Eritrea,2649,542,12,2,0.5,1885,492,71.2,141,10,5.3,12,2,0.5,20,5,0.8,4,0,0.2,575,31,21.7
    2015-10-01,10,Ungekl√§rt,2987,455,30,1,1,2249,441,75.3,2,0,0.1,2,0,0.1,27,0,0.9,268,33,9,409,-20,13.7
    2015-09-01,10,Ungekl√§rt,2532,2147,29,22,1.1,1808,1503,71.4,2,2,0.1,2,2,0.1,27,23,1.1,235,206,9.3,429,389,16.9
    2015-01-01,9,Ungekl√§rt,385,385,7,7,1.8,305,305,79.2,0,0,0,0,0,0,4,4,1,29,29,7.5,40,40,10.4

In this form

        "Irak": {}, 
"Mazedonien": {}, 
"Serbien": {}, 
"Ungekl\u221a\u00a7rt": {
    "Insgesamt_monat": [
        "455", 
        "455", 
        "2147", 
        "385"
    ], 
    "Position": [
        "10", 
        "10", 
        "10", 
        "9"
    ], 
    "Entscheidungen insgesamt": [
        "2987", 
        "2987", 
        "2532", 
        "385"
    ], 
    "Datum": [
        "2015-10-01", 
        "2015-10-01", 
        "2015-09-01", 
        "2015-01-01"
    ], 
    "Asylberechtigt": [
        "30", 
        "30", 
        "29", 
        "7"
    ]
}, 
"Albanien": {}, 
"Afghanistan": {}, 
"Kosovo": {}, 
"Summe 1 bis 10": {}, 
"Syrien,Arabische Republik": {}, 
"Eritrea": {}, 
"Bosnien und Herzegowina": {}, 
"Summe gesamt": {}, 
"Pakistan": {}, 
"Nigeria": {}, 
"Somalia": {}

This is my code

import csv
import json

output = {}
country =  { "Datum": [], "Position": [], "Entscheidungen insgesamt": [], "Insgesamt_monat": [], "Asylberechtigt": [] }
lastCountry = ""

with open('test.csv') as csv_file:
    for row in csv.DictReader(csv_file):

        country['Datum'].append(row['Datum'])
        country['Position'].append(row['Position'])
        country['Entscheidungen insgesamt'].append(row['Entscheidungen insgesamt'])
        country['Insgesamt_monat'].append(row['Insgesamt_monat'])
        country['Asylberechtigt'].append(row['Asylberechtigt'])

        if output.has_key(row['Herkunft']):
            output[row['Herkunft']].update(country)
        else:
            country.clear()
            country = {"Datum": [row['Datum']], "Position": [row['Position']], "Entscheidungen insgesamt": [row['Entscheidungen insgesamt']], "Insgesamt_monat": [row['Insgesamt_monat']], "Asylberechtigt": [row['Asylberechtigt']] }
            output[row['Herkunft']] = country

    print(json.dumps(output, indent=4))
#    with open('data.txt', 'w') as outfile:

As you can see, every country except one ends up without any data from the CSV. Where is the mistake? And how can I export the JSON to a file? At the moment I'm copying the printed output into my text editor.

  • What is the question exactly? Commented Dec 15, 2015 at 14:08
  • Are you allowed to use external modules? I would suggest pandas, which provides a to_json method that might be suitable. To export the dict as JSON, take a look at Python's json module and basic file I/O. Commented Dec 15, 2015 at 14:14
  • @RickyA The question is: why does the JSON output contain entries for only one 'Herkunft' ("Ungekl\u221a\u00a7rt"), while the other countries don't have any entries? Commented Dec 15, 2015 at 14:28
  • @albert How can I tell if I'm allowed? I'll try pandas. Commented Dec 15, 2015 at 14:29
  • Depending on your working environment, task, etc., you might be restricted from using non-standard modules for several reasons (software security, system maintenance, ...). If you don't have any restrictions, give pandas a try; it is powerful when handling larger amounts of data. Commented Dec 15, 2015 at 14:33

2 Answers

3

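In your code, the problem is in the else clause. What you did:

  1. Reset country with country.clear(). This empties the dict in place, and that same dict object is the one already stored in output for the previous country, so that country's data is lost.
  2. Then store a new dict in output for the current country. Only the last country is never cleared afterwards, which is why it is the only one with data.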
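
The effect of clear() can be seen in isolation. This is a minimal sketch with made-up names and values (not the asker's data): clearing a dict empties the object itself, so every reference to it sees the change, while rebinding the name to a new dict leaves the stored one alone.

```python
# Hypothetical miniature of the situation; names and values are illustrative.
shared = {"Datum": ["2015-10-01"]}
output_cleared = {"Afghanistan": shared}
shared.clear()  # empties the dict in place, so output_cleared sees it too
print(output_cleared)  # {'Afghanistan': {}}

kept = {"Datum": ["2015-10-01"]}
output_kept = {"Afghanistan": kept}
kept = {"Datum": []}  # rebinding the name creates a new dict; the stored one survives
print(output_kept)  # {'Afghanistan': {'Datum': ['2015-10-01']}}
```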

What you need is to:

  1. Append country to output
  2. Reset country
  3. Then update country with the current row

The order is important.

Here is the code:

import csv
import json

output = {}
country = {}

with open('test.csv') as csv_file:
    for row in csv.DictReader(csv_file):
        # `in` works in Python 2 and 3; dict.has_key() was removed in Python 3
        if row['Herkunft'] not in output:
            output[row['Herkunft']] = country
            country = {"Datum": [], "Position": [], "Entscheidungen insgesamt": [], "Insgesamt_monat": [], "Asylberechtigt": [] }

        country['Datum'].append(row['Datum'])
        country['Position'].append(row['Position'])
        country['Entscheidungen insgesamt'].append(row['Entscheidungen insgesamt'])
        country['Insgesamt_monat'].append(row['Insgesamt_monat'])
        country['Asylberechtigt'].append(row['Asylberechtigt'])
        output[row['Herkunft']] = country

    output[row['Herkunft']] = country  # Catch the last country
    print(json.dumps(output, indent=4))

2

Your indentation is wrong: as written, you open the outfile and write to it once for every country, so each country overwrites the output of the previous one. [EDIT]: There are more problems. You also use the country dict in a strange way. Here is a better version:

import csv
import json

output = {}

with open('test.csv') as csv_file:
    for row in csv.DictReader(csv_file):
        if row['Herkunft'] in output:
            country = output[row['Herkunft']]
        else:
            country = { "Datum": [], "Position": [], "Entscheidungen insgesamt": [], "Insgesamt_monat": [], "Asylberechtigt": [] }
            output[row['Herkunft']] = country
        country['Datum'].append(row['Datum'])
        country['Position'].append(row['Position'])
        country['Entscheidungen insgesamt'].append(row['Entscheidungen insgesamt'])
        country['Insgesamt_monat'].append(row['Insgesamt_monat'])
        country['Asylberechtigt'].append(row['Asylberechtigt'])

print(json.dumps(output, indent=4))
with open('data.txt', 'w') as outfile:
    outfile.write(json.dumps(output, indent=4))
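
A more compact variant of the same idea, offered as a sketch and assuming Python 3: dict.setdefault does the "look up or create" step in one call, and json.dump writes directly to the file object. The function name group_by_country, the trimmed sample data, and the output path are all hypothetical. The sample rows are deliberately interleaved to show that the CSV does not need to be sorted by country.

```python
import csv
import io
import json
import os
import tempfile

# Hypothetical sample standing in for test.csv (columns trimmed for brevity).
SAMPLE = """Datum,Position,Herkunft,Insgesamt_monat
2015-10-01,4,Afghanistan,483
2015-10-01,2,Albanien,7107
2015-09-01,4,Afghanistan,397
"""

def group_by_country(csv_file, fields, key="Herkunft"):
    """Collect each field's values into one list per country."""
    output = {}
    for row in csv.DictReader(csv_file):
        # setdefault returns the dict already stored for this country,
        # or stores and returns a fresh one on first sight.
        country = output.setdefault(row[key], {f: [] for f in fields})
        for f in fields:
            country[f].append(row[f])
    return output

nested = group_by_country(io.StringIO(SAMPLE), ["Datum", "Position", "Insgesamt_monat"])

# Illustrative output location; any writable path works.
out_path = os.path.join(tempfile.gettempdir(), "data.json")
with open(out_path, "w", encoding="utf-8") as outfile:
    # ensure_ascii=False writes non-ASCII characters as-is instead of \u escapes.
    json.dump(nested, outfile, indent=4, ensure_ascii=False)
```

With the real file, io.StringIO(SAMPLE) would be replaced by the object returned from open('test.csv', newline='').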

4 Comments

Thank you. We can now export the JSON. But the problem remains that all countries except the last one ("Ungekl\u221a\u00a7rt") don't get any entries in the JSON and just return an empty object, when they should look like "Ungekl\u221a\u00a7rt".
Did you run my updated code? I also changed the handling of the country dict, and what you describe sounds like that error.
You could refresh the question before accepting a later answer with the same solution...
Sorry, did not see that.
