
I have a two-column CSV file:

    row    vote
     1      0
     1      0
     1      1
     2      0
     2      0
     3      1
     3      0

I'm trying to write a Python script that sums the votes for each row number, producing:

   row    vote
    1      1
    2      0
    3      1

Here is what I've tried so far, with a text file:

from collections import defaultdict

d = defaultdict(int)

with open("data.txt") as f:
    for line in f:
        tokens = [t.strip() for t in line.split(",")]
        try:
            row = int(tokens[1])
            vote = int(tokens[1])
        except ValueError:
            continue
        d[row] += vote
print d

However, I'm getting this error: IndexError: list index out of range

  • Could you fix the indentation in the code you posted? Commented May 3, 2015 at 18:05
  • Shouldn't it be row = int(tokens[0])? Commented May 3, 2015 at 18:09

1 Answer


As @Adalee mentioned, you probably should have row = int(tokens[0]).
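
As for the IndexError: one likely cause, assuming the file is whitespace-delimited as displayed in the question (the raw file is not shown, so this is a guess), is that split(",") returns a single token for any line without a comma. tokens[1] then does not exist, and except ValueError does not catch the resulting IndexError. A minimal sketch of a more tolerant parser follows; parse_line and its delimiter parameter are hypothetical names, not part of the original code:

```python
def parse_line(line, delimiter=None):
    """Return (row, vote) as ints, or None if the line is not a data row.

    delimiter=None splits on any run of whitespace;
    pass "," for comma-separated data.
    """
    tokens = line.split(delimiter)
    try:
        return int(tokens[0]), int(tokens[1])
    except (ValueError, IndexError):
        # ValueError: non-numeric token such as the header.
        # IndexError: fewer than two tokens, e.g. a blank line.
        return None


# A line without a comma yields one token, so tokens[1] is out of range.
single = "1      0\n".split(",")
print(len(single))  # 1

print(parse_line("1      0\n"))     # (1, 0)
print(parse_line("row    vote\n"))  # None
print(parse_line("1,0\n", ","))     # (1, 0)
```

Catching both exception types in one place keeps header lines, blank lines, and short lines from stopping the loop.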

Here is one way to do this:

result = {}

with open("test.csv") as f:
    for line in f:
        tokens = line.split(",")

        try:
            row = int(tokens[0])
            vote = int(tokens[1])
        except (ValueError, IndexError):
            # Skip the header, blank lines, and malformed lines.
            continue

        # Add this vote to the running total for its row.
        result[row] = result.get(row, 0) + vote

print(result)

With the test.csv file shown below, the output is:

{1: 1, 2: 3, 3: 9}

test.csv file:

row,vote
1,0
1,0
1,1
2,2
2,1
3,4
3,2
3,3
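
The same tally could also be sketched with the standard csv module and collections.defaultdict, which the question already uses. This version targets Python 3 and reads from an in-memory io.StringIO so that it is self-contained; sum_votes is a hypothetical helper name, and in practice you would pass an open file object instead:

```python
import csv
import io
from collections import defaultdict


def sum_votes(lines):
    """Sum the vote column per row number from an iterable of CSV lines."""
    totals = defaultdict(int)
    for record in csv.reader(lines):
        try:
            row, vote = int(record[0]), int(record[1])
        except (ValueError, IndexError):
            continue  # header, blank, or malformed line
        totals[row] += vote
    return dict(totals)


# Same contents as the test.csv file above.
sample = io.StringIO("row,vote\n1,0\n1,0\n1,1\n2,2\n2,1\n3,4\n3,2\n3,3\n")
print(sum_votes(sample))  # {1: 1, 2: 3, 3: 9}
```

With a real file, the call would look like sum_votes(f) inside a with open("test.csv", newline="") as f: block.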