0

I'll try to look for help once more, so my base code is ready, in the very beginning, it converts all the negative values to 0, and after that, it does calculate the sum and cumulative values of the csv data:

import csv
from collections import defaultdict, OrderedDict


def convert(data):
    try:
        return int(data)
    except ValueError:
        return 0


with open('MonthData1.csv', 'r') as file1:
        read_file = csv.reader(file1, delimiter=';')
        delheader = next(read_file)
        data = defaultdict(int)
        for line in read_file:
            valuedata = max(0, sum([convert(i) for i in line[1:5]]))
            data[line[0].split()[0]] += valuedata

        for key in OrderedDict(sorted(data.items())):
            print('{};{}'.format(key, data[key]))
        print("")
        previous_values = []
        for key, value in OrderedDict(sorted(data.items())).items():
            print('{};{}'.format(key, value + sum(previous_values)))
            previous_values.append(value)

This code prints:

1.5.2018 245
2.5.2018 105
4.5.2018 87

1.5.2018 245
2.5.2018 350
4.5.2018 437

That's how I want it to print the data. First the sum of each day, and then the cumulative value. My question is, how can I format this data so it can be written to a new csv file with the same format as it prints it? So the new csv file should look like this:
enter image description here

I have tried to do it myself (with dateime), and searched for answers but I just can't find a way. I hope to get a solution this time, I'd appreciate it massively.
The data file as csv: https://files.fm/u/2vjppmgv
Data file in pastebin https://pastebin.com/Tw4aYdPc Hope this can be done with default libraries

9
  • I may not have understood your question perfectly, but it seems that you simply need to change the two occurrences of '{} {}' for '{};{}'. In my test, the resulting CSV file looks exactly like the second image. If this was the issue, then it was not a matter of formatting the date, but of formatting the columns. Commented Nov 22, 2018 at 14:08
  • Yeah, thanks. Do you know how should I write the data to a csv file? I have no idea on that part Commented Nov 22, 2018 at 14:27
  • if you data is in a dataframe called df then simply import pandas as pd df.to_csv("\\path\\output.csv") Commented Nov 22, 2018 at 14:46
  • I have the whole code with the default libraries, do you have ideas how this should be done without external libraries? Commented Nov 22, 2018 at 14:56
  • Pandas is not external lib. Commented Nov 22, 2018 at 15:37

3 Answers 3

2

Writing a CSV is simply a matter of writing values separated by commas (or semi-colons in this case. A CSV is a plain text file (a .txt if you will). You can read it and write using python's open() function if you'd like to.

You could actually get rid of the CSV module if you wish. I included an example of this in the end.

This version uses only the libraries that were available in your original code.

import csv
from collections import defaultdict, OrderedDict

def convert(data):
    try:
        return int(data)
    except ValueError:
        return 0    

file1 = open('Monthdata1.csv', 'r')
file2 = open('result.csv', 'w')

read_file = csv.reader(file1, delimiter=';')
delheader = next(read_file)
data = defaultdict(int)
for line in read_file:
    valuedata = max(0, sum([convert(i) for i in line[1:5]]))
    data[line[0].split()[0]] += valuedata

for key in OrderedDict(sorted(data.items())):
    file2.write('{};{}\n'.format(key, data[key]))
file2.write('\n')
previous_values = []
for key, value in OrderedDict(sorted(data.items())).items():
    file2.write('{};{}\n'.format(key, value + sum(previous_values)))
    previous_values.append(value)
file1.close()
file2.close()

There is a gotcha here, though. As I didn't import the os module (that is a default library) I used the characters \n to end the line. This will work fine under Linux and Mac, but under windows you should use \r\n. To avoid this issue you should import the os module and use os.linesep instead of \n.

import os
(...)
    file2.write('{};{}{}'.format(key, data[key], os.linesep))
(...)
    file2.write('{};{}{}'.format(key, value + sum(previous_values), os.linesep))

As a sidenote this is an example of how you could read your CSV without the need for the CSV module:

   data = [i.split(";") for i in open('MonthData1.csv').read().split('\n')]

If you had a more complex CSV file, especially if it had strings that could have semi-colons within, you'd better go for the CSV module.

The pandas library, mentioned in other answers is a great tool. It will most certainly be able to handle any need you might have to deal with CSV data.

Sign up to request clarification or add additional context in comments.

5 Comments

I can't open the csv file after running your code, it says that the file is being used. I used the windows code, as I'm a windows user
I forgot to close the files. Perhaps that is the issue. You may need to close the software you are using to work with python and open it again to release the file. I updated the code to include the closing method.
Yeah I figured it out too after commenting:D One edit if it's possible, could the first three be in a row after each other, and the cumulative too? Now there is a extra row after each date. If you don't understand me, I'd love to have it the same as the picture in the OP, if this is possible
Sure. I edited the script in the answer for that. It is simply a matter of writing a blank line '\n' or '\r\n' in the file2. That is, file2.write('\r\n')
Omg! I have tried to search for this solution for so long! I am so happy now, thank you, thank you a million times!
1

This code creates a new csv file with the same format as what's printed.

import pandas as pd #added
import csv
from collections import defaultdict, OrderedDict


def convert(data):
    try:
        return int(data)
    except ValueError:
        return 0


keys = [] #added
data_keys = [] #added

with open('MonthData1.csv', 'r') as file1:
        read_file = csv.reader(file1, delimiter=';')
        delheader = next(read_file)
        data = defaultdict(int)
        for line in read_file:
            valuedata = max(0, sum([convert(i) for i in line[1:5]]))
            data[line[0].split()[0]] += valuedata

        for key in OrderedDict(sorted(data.items())):
            print('{} {}'.format(key, data[key]))
            keys.append(key) #added
            data_keys.append(data[key]) #added

        print("")
        keys.append("") #added
        data_keys.append("") #added
        previous_values = []
        for key, value in OrderedDict(sorted(data.items())).items():
            print('{} {}'.format(key, value + sum(previous_values)))
            keys.append(key) #added
            data_keys.append(value + sum(previous_values)) #added
            previous_values.append(value)

df = pd.DataFrame(data_keys,keys) #added
df.to_csv('new_csv_file.csv', header=False) #added

5 Comments

Thanks for the reply, I have a version done with pandas, I'm looking for a solution done with default libraries. Do you think it's possible?
I'm not familiar with what you mean by default libraries. Is numpy included?
With default I mean something that I don't have to install separately. I don't think that numpy is included
It would've been nice if you had specified this in the question. Unfortunately, I can't help.
Sorry, my bad. I upvoted your post, and will mark is as solution of I don't get other comments
0

This is the version that does not use any imports at all

def convert(data):
    try:
         out = int(data)
    except ValueError:
        out = 0
    return out ### try to avoid multiple return statements


with open('Monthdata1.csv', 'rb') as file1:
    lines = file1.readlines()
data = [ [ d.strip() for d in l.split(';')] for l in lines[ 1 : : ] ]
myDict = dict()
for d in data:
    key = d[0].split()[0]
    value = max(0, sum([convert(i) for i in d[1:5]]))
    try:
        myDict[key] += value
    except KeyError:
        myDict[key] = value
s1=""
s2=""
accu = 0
for key in sorted( myDict.keys() ):
    accu += myDict[key]
    s1 += '{} {}\n'.format( key, myDict[key] )
    s2 += '{} {}\n'.format( key, accu )
with open( 'out.txt', 'wb') as fPntr:
    fPntr.write( s1 + "\n" + s2 )

This uses non-ordered dictionaries, though, such that sorted() may result in problems. So you actually might want to use datetime giving, e.g.:

import datetime

with open('Monthdata1.csv', 'rb') as file1:
    lines = file1.readlines()
data = [ [ d.strip() for d in l.split(';')] for l in lines[ 1 : : ] ]
myDict = dict()
for d in data:
    key  = datetime.datetime.strptime( d[0].split()[0], '%d.%m.%Y' )
    value = max(0, sum([convert(i) for i in d[1:5]]))
    try:
        myDict[key] += value
    except KeyError:
        myDict[key] = value
s1=""
s2=""
accu = 0
for key in sorted( myDict.keys() ):
    accu += myDict[key]
    s1 += '{} {}\n'.format( key.strftime('%d.%m.%y'), myDict[key] )
    s2 += '{} {}\n'.format( key.strftime('%d.%m.%y'), accu )
with open( 'out.txt', 'wb') as fPntr:
    fPntr.write( s1 + "\n" + s2 )

Note that I changed to the 2 digit year by using %y instead of %Y in the output. This formatting also adds a 0 to day and month.

1 Comment

If on Windows, look at the os.linesep in ndvo's answer.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.