
I am trying to set up a system where a user can log into the web interface and track orders that have been placed. The system will track each order from its initial confirmation, through production, and stop just before shipping. (As my wife explained it: "Like the Domino's Pizza order tracker, but for business cards.") I am stuck at a point where I need to parse data from an ever-changing directory of comma-delimited .txt files. Each order that is placed automatically generates its own .txt file with all sorts of important information that I will display on the web interface. For example:

H39TZ3.txt:

token,tag,prodcode,qty          #(These are the headers)
,H39TZ3,pchd_4stpff,,100        #(These are the corresponding values for part 1 of the order)
,H39TZ3,pchdn_8ststts,6420-PCNM8ST,100   #(These are values for part 2 of the order)

There are going to be upwards of 300 different .txt files in the directory at any given time, and the files will come and go based on their order status (once shipped, the files will be archived). I have read up on code to parse an individual file and import the values into a dictionary, but everything I've found is for a single file. How would I go about writing something like this, only for multiple files?

import csv

d = {}

for row in csv.reader(open('H39TZ3.txt')):
    d['Order %s' % row[1]] = {'tag': row[1], 'prodcode': row[2], 'qty': row[3]}

Thanks!


2 Answers


You can use os.listdir() to list the contents of the directory containing your .txt files. Something like the following should work for you:

import csv
import os

d = {}
for filename in os.listdir("."):
    with open(filename) as csv_file:
        for row in csv.reader(csv_file):
            d['Order %s' % row[1]] = {'tag': row[1], 'prodcode': row[2], 'qty': row[3]}

Note that I added a with statement in there. It will make sure to close the file after you finish processing it so you don't waste/run out of file descriptors. If the directory might contain other files besides those you are interested in, you could add appropriate filtering before the with statement.


2 Comments

I'd actually use glob.glob('*.txt') instead of os.listdir(). It makes filtering files a whole lot easier. Otherwise, there's going to be the odd binary file that crops up in the directory and makes the whole program die.
Worked great, thank you! Good call on the glob as well Chinmay!
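For reference, the glob-based version of that loop might look something like the sketch below. The `load_orders` helper name and the directory argument are my own additions, and I also skip the header row, which the snippet above would otherwise read in as if it were an order:

```python
import csv
import glob
import os

def load_orders(directory):
    """Collect every part of every order file in *directory* into one dict."""
    d = {}
    # glob matches only the *.txt pattern, so stray binary or log
    # files in the directory are never opened.
    for path in glob.glob(os.path.join(directory, '*.txt')):
        with open(path) as csv_file:
            reader = csv.reader(csv_file)
            next(reader)  # skip the header row
            for row in reader:
                d['Order %s' % row[1]] = {'tag': row[1],
                                          'prodcode': row[2],
                                          'qty': row[3]}
    return d
```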
3

I'd like to add that csv.DictReader is probably a better option if you want to read the rows as dictionaries. It will automatically set the keys of the dictionary based on the first row (headers). You'd use it like this:

with open(filename) as csv_file:
    for row in csv.DictReader(csv_file):
        d['Order ' + row['tag']] = row
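Combined with a directory loop like the one in the other answer, a complete sketch could look like this (the `read_orders` helper is my own; it assumes, as in your sample file, that the first line of each file is the header row DictReader keys off of):

```python
import csv
import glob
import os

def read_orders(directory):
    """Read every .txt order file in *directory* with csv.DictReader."""
    d = {}
    for path in glob.glob(os.path.join(directory, '*.txt')):
        with open(path) as csv_file:
            # DictReader takes its keys from the header row automatically.
            for row in csv.DictReader(csv_file):
                d['Order ' + row['tag']] = row
    return d
```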

As dm03514 mentions, though, a database will probably be a better option. sqlite comes with Python (the sqlite3 module), and you can use a variety of tools to inspect and modify the database. It should also be more robust than using individual files.
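To illustrate, a minimal sqlite3 sketch of that idea might look like the following. The `load_into_db` name, the `orders` table, and its column names are placeholders I picked, not anything prescribed by the question:

```python
import csv
import glob
import os
import sqlite3

def load_into_db(directory, db_path=":memory:"):
    """Load every order .txt file in *directory* into a sqlite database."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS orders "
                 "(token TEXT, tag TEXT, prodcode TEXT, qty TEXT)")
    for path in glob.glob(os.path.join(directory, "*.txt")):
        with open(path) as csv_file:
            reader = csv.reader(csv_file)
            next(reader)  # skip the header row
            conn.executemany(
                "INSERT INTO orders VALUES (?, ?, ?, ?)",
                (row[:4] for row in reader))
    conn.commit()
    return conn
```

Once loaded, the web interface can run queries such as `SELECT * FROM orders WHERE tag = ?` instead of re-parsing the whole directory on every page view.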

1 Comment

This worked as well, thanks for the tip on sqlite3, I'll have to look into my database options further down the road.
