1
import csv 
import numpy as np
import matplotlib.pyplot as plt

myfile = open('LoggedData_CalInertialAndMag.csv', 'rt')
reader = csv.reader(myfile)
next(reader)

a = [row[4] for row in reader]
b = [row[5] for row in reader]

This is the beginning of my code. I am trying to load whole columns from the CSV file into arrays and cast them to float later for other uses. However, I got errors, so I checked len(a) and len(b) separately. len(a) is 838, and len(b) should be the same, but it is 0.

Why? I then changed my code to use append, which feels a bit more complex. So where did I go wrong?

2
  • It will be a very sharp arrow : ) Commented Apr 10, 2015 at 14:00
  • I think you mean "array" not "arrow". Commented Apr 10, 2015 at 14:01

3 Answers

1

The problem is that your first list comprehension, the one for list a, "consumes" the csv reader as it iterates over it, so nothing is left for the one for list b. Although it is not as concise as list comprehensions, I would suggest you create both lists at the same time. Also note that the proper way to open a csv file for the csv module depends on the version of Python you're using.

import collections
import csv
import sys

csv_read_args = ({'mode': 'rb'} if sys.version_info[0] < 3 else
                 {'mode': 'r', 'newline': ''})

with open('LoggedData_CalInertialAndMag.csv', **csv_read_args) as myfile:
    reader = csv.reader(myfile)
    next(reader)
    a, b = [], []
    # feed generator expression into a zero-length deque to consume it
    generator = ((a.append(row[4]), b.append(row[5])) for row in reader)
    collections.deque(generator, maxlen=0)

An alternative to doing it this way would be to do a myfile.seek(0) between the two list comprehension statements to "rewind" the file back to the beginning. This would be less efficient because it requires the reader to parse the entire file twice.
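A minimal sketch of that rewind approach, using an in-memory io.StringIO with made-up rows in place of the real file (the sample data and the column indexes 1 and 2 are illustrative assumptions, not the asker's data). Note that after seek(0) the header line comes back as well, so it has to be skipped a second time:

```python
import csv
import io

# Hypothetical stand-in for the CSV file: one header line plus two data rows.
sample = io.StringIO("t,x,y\n0,1.5,2.5\n1,3.5,4.5\n")

reader = csv.reader(sample)
next(reader)                     # skip the header
a = [row[1] for row in reader]   # the first pass exhausts the reader

sample.seek(0)                   # rewind the underlying file object
next(reader)                     # the header is read again, so skip it again
b = [row[2] for row in reader]   # the second pass sees every data row

print(a, b)
```

The values are still strings at this point; converting them with float() is a separate step.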

Update

Here's another, slightly (8%-10%) faster, alternative:

with open('LoggedData_CalInertialAndMag.csv', **csv_read_args) as myfile:
    reader = csv.reader(myfile)
    next(reader)
    a, b = map(list, zip(*[(row[4], row[5]) for row in reader]))

You may not need the final map(list, ...) depending on whether you require a and b to be lists or not. Without it, a and b are tuples (zip returns a list of tuples in Python 2 and an iterator over tuples in Python 3).
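To make the zip(*...) step concrete, here is a small sketch with made-up pairs standing in for the (row[4], row[5]) tuples (the values are hypothetical). It shows what you get with and without the final map(list, ...):

```python
# Hypothetical (row[4], row[5]) pairs, as the list comprehension would build them.
rows = [("1.5", "2.5"), ("3.5", "4.5")]

# zip(*rows) transposes the pairs: one tuple per column.
a, b = zip(*rows)

# Wrapping each column in list() gives lists instead of tuples.
a_list, b_list = map(list, zip(*rows))

print(a, b)
print(a_list, b_list)
```

One caveat: if the file has no data rows, zip(*rows) yields nothing and the unpacking into a and b raises a ValueError.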


2 Comments

I have used an alternative which uses append, as I said in the post. I didn't use collections, but your way is basically the same as what I am doing now.
I used collections.deque in an effort to provide something both efficient and concise that would allow appending data from each row to more than one list at a time. I got the idea from an itertools module recipe called consume. Since deque is provided in the standard library, it's very fast.
1

This should get your code working:

import csv

import matplotlib.pyplot as plt
import numpy as np

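a, b = [], []
# Python 3: text mode with newline=''; on Python 2 open with mode 'rb' instead
with open("LoggedData_CalInertialAndMag.csv", newline='') as f:
    reader = csv.reader(f)
    next(reader)  # skip the first row, as the question's code does
    for row in reader:
        a.append(float(row[4]))
        b.append(float(row[5]))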

Since you appear to be using numpy and mention reading the data into arrays, I assume something like numpy.genfromtxt should also work:

import numpy as np

filename = "LoggedData_CalInertialAndMag.csv"
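# skip_header=1 skips the first line, matching next(reader) in the question
a = np.genfromtxt(filename, usecols=[4], delimiter=',', dtype=float, skip_header=1)
b = np.genfromtxt(filename, usecols=[5], delimiter=',', dtype=float, skip_header=1)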


0

Try adding a call to myfile.seek(0) before your second list comprehension, and skip the header again with next(reader) afterwards. See https://docs.python.org/2/library/csv.html

The reader reads through the content of the file as you iterate over it. By the time you try to load b, the whole file has already been read, so there's nothing left to read :)
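A small sketch that reproduces the symptom, using an in-memory io.StringIO with made-up rows instead of the real file (the sample data and column indexes are assumptions for illustration):

```python
import csv
import io

# Hypothetical stand-in for the CSV file: one header line plus two data rows.
sample = io.StringIO("t,x,y\n0,1.5,2.5\n1,3.5,4.5\n")

reader = csv.reader(sample)
next(reader)                     # skip the header

a = [row[1] for row in reader]   # iterates over every remaining row
b = [row[2] for row in reader]   # the reader is already exhausted here

print(len(a), len(b))
```

The second comprehension raises no error; it simply finds no rows left to iterate over, which is why len(b) is 0.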

But a better approach would be to load both at once and then you can assign them to different arrays without reading the file again (disk I/O operations are quite expensive):

a = [[row[4], row[5]] for row in reader]
b = [r[0] for r in a]
c = [r[1] for r in a]

4 Comments

By seek, do you mean I should run this function after I've loaded the first column?
Yes, but a better approach would be to load both at once. I'll edit with this alternative.
I actually need them in two different arrays.
So, just assign them to 2 different arrays. I'll edit the answer.
