I have the following code snippet, and I get a MemoryError on the last line, rows = list(reader):

for file in fileList:
    fileName, fileExtension = os.path.splitext(file)
    if fileExtension == ".csv":
        with open(path + '\\' + file, "rU") as f:
            reader = csv.reader(f, delimiter=',', dialect="excel")
            rows = list(reader)

Is there any other approach I can use?

  • How large are we talking? Commented Feb 13, 2019 at 18:21
  • Yes. Don't create a giant list out of your csv file. Just iterate over the reader and it will lazily produce each line. Commented Feb 13, 2019 at 18:21
  • @SuperStew Just under 2 GB. Commented Feb 13, 2019 at 18:26
  • @juanpa.arrivillaga I tried running a for loop over the reader to process every row, and even that didn't help. Commented Feb 13, 2019 at 18:28
  • @AdarshRavi I would suggest dumping it into a SQLite database and then querying whatever you need, so you don't have to load the whole thing at once. Commented Feb 13, 2019 at 18:37
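
For the SQLite suggestion in the comments, here is a rough sketch of what loading the file in batches might look like. It assumes Python 3, a header row, unique column names, and the same number of fields on every row; none of that is confirmed in the question. The names csv_to_sqlite, csv_rows, and batch_size are made up for illustration.

```python
import csv
import itertools
import sqlite3


def csv_to_sqlite(csv_path, db_path, table="csv_rows", batch_size=10000):
    """Load a CSV file with a header row into SQLite, batch_size rows at a time."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f, dialect="excel")
        header = next(reader)  # assumption: the first row holds column names
        # Quote identifiers so header names containing spaces still work.
        cols = ", ".join('"{}"'.format(h.replace('"', '""')) for h in header)
        marks = ", ".join("?" for _ in header)
        conn = sqlite3.connect(db_path)
        try:
            conn.execute('CREATE TABLE IF NOT EXISTS "{}" ({})'.format(table, cols))
            insert = 'INSERT INTO "{}" VALUES ({})'.format(table, marks)
            while True:
                # Only batch_size rows are held in memory at any moment.
                batch = list(itertools.islice(reader, batch_size))
                if not batch:
                    break
                conn.executemany(insert, batch)
                conn.commit()
        finally:
            conn.close()
```

Every value is stored as text here, so queries have to compare against strings (or cast) until proper column types are declared.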

1 Answer

Since you've now stated in the comments that you simply want to fix the formatting of the rows, you definitely don't need all the rows at once. You should iterate through the csv reader one row at a time, fix the formatting of the row, write the row to another csv file, and then move on to the next row:

with open(path + '\\' + file, "rU") as f, open(path + '\\' + file + '.fixed', "w") as o:
    reader = csv.reader(f, delimiter=',', dialect="excel")
    writer = csv.writer(o, dialect='excel')
    for row in reader:
        # fix the formatting of the row here
        writer.writerow(row)
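
If you are on Python 3, a self-contained variant of the same idea might look like the sketch below. Note that the "rU" mode was removed in Python 3.11, and the csv module expects files opened with newline="". The fix_row function is a hypothetical placeholder (it only strips whitespace), since the actual formatting fix isn't shown in the question; rewrite_csv and rewrite_all are likewise illustrative names.

```python
import csv
import os


def fix_row(row):
    # Hypothetical placeholder: strip stray whitespace from every field.
    return [field.strip() for field in row]


def rewrite_csv(src_path, dst_path, fix=fix_row):
    """Stream src_path to dst_path one row at a time; return the row count."""
    count = 0
    # newline="" lets the csv module handle line endings itself (Python 3).
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src, dialect="excel")
        writer = csv.writer(dst, dialect="excel")
        for row in reader:
            # Only the current row is held in memory.
            writer.writerow(fix(row))
            count += 1
    return count


def rewrite_all(directory):
    """Apply rewrite_csv to every .csv file in directory."""
    for name in os.listdir(directory):
        if os.path.splitext(name)[1] == ".csv":
            src = os.path.join(directory, name)
            rewrite_csv(src, src + ".fixed")
```

Calling rewrite_all(path) would then write a .fixed copy next to each input file, the same naming the snippet above uses.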