Python formatting data to csv file

Question

I'll try asking for help once more. My base code is ready: it first converts all the negative values to 0, and after that it calculates the sum and cumulative values of the csv data:

import csv
from collections import defaultdict, OrderedDict


def convert(data):
    try:
        return int(data)
    except ValueError:
        return 0


with open('MonthData1.csv', 'r') as file1:
        read_file = csv.reader(file1, delimiter=';')
        delheader = next(read_file)
        data = defaultdict(int)
        for line in read_file:
            valuedata = max(0, sum([convert(i) for i in line[1:5]]))
            data[line[0].split()[0]] += valuedata

        for key in OrderedDict(sorted(data.items())):
            print('{};{}'.format(key, data[key]))
        print("")
        previous_values = []
        for key, value in OrderedDict(sorted(data.items())).items():
            print('{};{}'.format(key, value + sum(previous_values)))
            previous_values.append(value)

This code prints:

1.5.2018 245
2.5.2018 105
4.5.2018 87

1.5.2018 245
2.5.2018 350
4.5.2018 437

That's how I want it to print the data. First the sum of each day, and then the cumulative value. My question is, how can I format this data so it can be written to a new csv file with the same format as it prints it? So the new csv file should look like this:

I have tried to do it myself (with datetime) and searched for answers, but I just can't find a way. I hope to get a solution this time, I'd appreciate it massively.
The data file as csv: https://files.fm/u/2vjppmgv
Data file in pastebin https://pastebin.com/Tw4aYdPc Hope this can be done with default libraries

I may not have understood your question perfectly, but it seems that you simply need to change the two occurrences of '{} {}' for '{};{}'. In my test, the resulting CSV file looks exactly like the second image. If this was the issue, then it was not a matter of formatting the date, but of formatting the columns.
– ndvo, Commented Nov 22, 2018 at 14:08
Yeah, thanks. Do you know how I should write the data to a csv file? I have no idea on that part.
– Armeija, Commented Nov 22, 2018 at 14:27
If your data is in a dataframe called df, then simply: import pandas as pd; df.to_csv("\\path\\output.csv")
– Rahul Agarwal, Commented Nov 22, 2018 at 14:46
I have the whole code with the default libraries. Do you have ideas how this should be done without external libraries?
– Armeija, Commented Nov 22, 2018 at 14:56

ndvo · Accepted Answer · 2018-11-22 19:04:50Z

2

Writing a CSV is simply a matter of writing values separated by commas (or semi-colons in this case). A CSV is a plain text file (a .txt if you will). You can read and write it using Python's open() function if you'd like to.

You could actually get rid of the CSV module if you wish. I included an example of this at the end.

This version uses only the libraries that were available in your original code.

import csv
from collections import defaultdict, OrderedDict

def convert(data):
    try:
        return int(data)
    except ValueError:
        return 0    

# The with statement closes both files automatically, even on errors.
with open('MonthData1.csv', 'r') as file1, open('result.csv', 'w') as file2:
    read_file = csv.reader(file1, delimiter=';')
    delheader = next(read_file)  # skip the header row
    data = defaultdict(int)
    for line in read_file:
        valuedata = max(0, sum([convert(i) for i in line[1:5]]))
        data[line[0].split()[0]] += valuedata

    # First block: the sum for each day
    for key in OrderedDict(sorted(data.items())):
        file2.write('{};{}\n'.format(key, data[key]))
    file2.write('\n')  # blank line between the two blocks
    # Second block: the cumulative value
    previous_values = []
    for key, value in OrderedDict(sorted(data.items())).items():
        file2.write('{};{}\n'.format(key, value + sum(previous_values)))
        previous_values.append(value)

There is a gotcha here, though. I used the characters \n to end each line. Because result.csv is opened in text mode, Python translates each \n to the platform's line ending when writing (\r\n under Windows), so this works fine under Linux, Mac and Windows. Do not write \r\n or os.linesep yourself to a file opened in text mode: under Windows the \n inside it gets translated again, you end up with \r\r\n, and an extra empty row shows up after every line. That is, avoid this:

import os
(...)
    file2.write('{};{}{}'.format(key, data[key], os.linesep))
(...)
    file2.write('{};{}{}'.format(key, value + sum(previous_values), os.linesep))
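
To see what text mode does with line endings, here is a small sketch. The file name newline_demo.csv and the helper write_rows are made up for the illustration. It passes newline='\r\n' to open() so the result is the same on every platform, then reads the raw bytes back.

```python
import os
import tempfile


def write_rows(path, rows):
    # Text mode: every '\n' written is translated to the string given in
    # `newline`. With the default newline=None it becomes os.linesep instead.
    with open(path, 'w', newline='\r\n') as out:
        for key, value in rows:
            out.write('{};{}\n'.format(key, value))


# Hypothetical demo file in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), 'newline_demo.csv')
write_rows(demo_path, [('1.5.2018', 245), ('2.5.2018', 105)])

# Binary mode shows the bytes exactly as they are stored on disk.
with open(demo_path, 'rb') as raw:
    raw_bytes = raw.read()
print(raw_bytes)
```

Each line ends in a single \r\n even though the code only ever writes \n. Writing '\r\n' through the same file object would store \r\r\n instead.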

As a sidenote, this is an example of how you could read your CSV without the need for the CSV module:

with open('MonthData1.csv') as f:
    data = [line.split(';') for line in f.read().splitlines()]

If you had a more complex CSV file, especially if it had strings that could have semi-colons within, you'd better go for the CSV module.
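
For completeness, a minimal sketch of the same output written with csv.writer from the standard library. The function name write_report is hypothetical, the sample totals are the ones printed in the question, and itertools.accumulate stands in for the sum(previous_values) bookkeeping. It writes to an in-memory buffer so it can be run without the data file.

```python
import csv
import io
from itertools import accumulate


def write_report(daily_totals, out):
    """Write per-day sums, a blank line, then running totals to `out`."""
    writer = csv.writer(out, delimiter=';', lineterminator='\n')
    days = list(daily_totals)                 # keys in the order given
    values = [daily_totals[day] for day in days]
    writer.writerows(zip(days, values))       # first block: daily sums
    out.write('\n')                           # blank separator line
    writer.writerows(zip(days, accumulate(values)))  # second block: cumulative


# Sample totals taken from the printed output in the question.
totals = {'1.5.2018': 245, '2.5.2018': 105, '4.5.2018': 87}
buffer = io.StringIO()
write_report(totals, buffer)
report = buffer.getvalue()
print(report)
```

To write to a real file, pass an open file object instead of the buffer. The csv module documentation recommends opening such files with newline=''.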

The pandas library, mentioned in other answers, is a great tool. It will most certainly be able to handle any need you might have to deal with CSV data.

edited Nov 22, 2018 at 19:04

answered Nov 22, 2018 at 17:41

ndvo


5 Comments

Armeija Over a year ago

I can't open the csv file after running your code, it says that the file is being used. I used the windows code, as I'm a windows user

ndvo Over a year ago

I forgot to close the files. Perhaps that is the issue. You may need to close the software you are using to work with python and open it again to release the file. I updated the code to include the closing method.

Armeija Over a year ago

Yeah, I figured it out too after commenting :D One edit, if it's possible: could the first three be in a row after each other, and the cumulative too? Now there is an extra row after each date. If you don't understand me, I'd love to have it the same as the picture in the OP, if this is possible.

ndvo Over a year ago

Sure. I edited the script in the answer for that. It is simply a matter of writing a blank line '\n' or '\r\n' in the file2. That is, file2.write('\r\n')

Armeija Over a year ago

Omg! I have tried to search for this solution for so long! I am so happy now, thank you, thank you a million times!

Cedric Eveleigh · Accepted Answer · 2018-11-22 15:15:45Z

1

This code creates a new csv file with the same format as what's printed.

import pandas as pd #added
import csv
from collections import defaultdict, OrderedDict


def convert(data):
    try:
        return int(data)
    except ValueError:
        return 0


keys = [] #added
data_keys = [] #added

with open('MonthData1.csv', 'r') as file1:
        read_file = csv.reader(file1, delimiter=';')
        delheader = next(read_file)
        data = defaultdict(int)
        for line in read_file:
            valuedata = max(0, sum([convert(i) for i in line[1:5]]))
            data[line[0].split()[0]] += valuedata

        for key in OrderedDict(sorted(data.items())):
            print('{} {}'.format(key, data[key]))
            keys.append(key) #added
            data_keys.append(data[key]) #added

        print("")
        keys.append("") #added
        data_keys.append("") #added
        previous_values = []
        for key, value in OrderedDict(sorted(data.items())).items():
            print('{} {}'.format(key, value + sum(previous_values)))
            keys.append(key) #added
            data_keys.append(value + sum(previous_values)) #added
            previous_values.append(value)

df = pd.DataFrame(data_keys,keys) #added
df.to_csv('new_csv_file.csv', header=False) #added

answered Nov 22, 2018 at 15:15

Cedric Eveleigh


5 Comments

Armeija Over a year ago

Thanks for the reply, I have a version done with pandas, I'm looking for a solution done with default libraries. Do you think it's possible?

Cedric Eveleigh Over a year ago

I'm not familiar with what you mean by default libraries. Is numpy included?

Armeija Over a year ago

With default I mean something that I don't have to install separately. I don't think that numpy is included

Cedric Eveleigh Over a year ago

It would've been nice if you had specified this in the question. Unfortunately, I can't help.

Armeija Over a year ago

Sorry, my bad. I upvoted your post, and will mark it as the solution if I don't get other comments.

mikuszefski · Accepted Answer · 2018-11-23 09:44:55Z

This is the version that does not use any imports at all

def convert(data):
    try:
        out = int(data)
    except ValueError:
        out = 0
    return out ### try to avoid multiple return statements


# Text mode ('r'/'w'): in Python 3, binary mode would give bytes,
# which cannot be split on the str ';'.
with open('MonthData1.csv', 'r') as file1:
    lines = file1.readlines()
# Skip the header row and any blank lines
data = [[d.strip() for d in l.split(';')] for l in lines[1:] if l.strip()]
myDict = dict()
for d in data:
    key = d[0].split()[0]
    value = max(0, sum([convert(i) for i in d[1:5]]))
    try:
        myDict[key] += value
    except KeyError:
        myDict[key] = value
s1 = ""
s2 = ""
accu = 0
for key in sorted(myDict.keys()):
    accu += myDict[key]
    s1 += '{} {}\n'.format(key, myDict[key])
    s2 += '{} {}\n'.format(key, accu)
with open('out.txt', 'w') as fPntr:
    fPntr.write(s1 + "\n" + s2)

The keys here are plain date strings, though, so sorted() compares them character by character and may result in problems (for example, '10.5.2018' would come before '2.5.2018'). So you actually might want to use datetime giving, e.g.:

import datetime

# convert() is the same function as in the first version above
with open('MonthData1.csv', 'r') as file1:
    lines = file1.readlines()
# Skip the header row and any blank lines
data = [[d.strip() for d in l.split(';')] for l in lines[1:] if l.strip()]
myDict = dict()
for d in data:
    # Parse the date part so that the keys sort chronologically
    key = datetime.datetime.strptime(d[0].split()[0], '%d.%m.%Y')
    value = max(0, sum([convert(i) for i in d[1:5]]))
    try:
        myDict[key] += value
    except KeyError:
        myDict[key] = value
s1 = ""
s2 = ""
accu = 0
for key in sorted(myDict.keys()):
    accu += myDict[key]
    s1 += '{} {}\n'.format(key.strftime('%d.%m.%y'), myDict[key])
    s2 += '{} {}\n'.format(key.strftime('%d.%m.%y'), accu)
with open('out.txt', 'w') as fPntr:
    fPntr.write(s1 + "\n" + s2)

Note that I changed to the 2 digit year by using %y instead of %Y in the output. This formatting also adds a 0 to day and month.
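
If the original date strings should be kept exactly as they appear in the file, one option is to leave the keys as strings and only parse them while sorting. This is a small sketch, not part of the code above: date_key is a hypothetical helper, and '10.5.2018' is a made-up key added to make the difference visible.

```python
from datetime import datetime


def date_key(text):
    """Parse a 'day.month.year' string such as '2.5.2018' for sorting."""
    return datetime.strptime(text, '%d.%m.%Y')


# Hypothetical keys; '10.5.2018' is not in the question's sample output.
keys = ['2.5.2018', '10.5.2018', '1.5.2018']

as_text = sorted(keys)                 # compares character by character
as_dates = sorted(keys, key=date_key)  # compares real dates, keeps the strings

print(as_text)
print(as_dates)
```

With key=date_key the values are ordered chronologically while the strings written to the output stay unchanged, so no strftime call is needed.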
