
I'm trying to convert multiple text files into a single .csv file using Python. My current code is this:

import pandas
import glob

# Collect the names of all .txt files in the current directory.
file_names = glob.glob("./*.txt")

# [Middle step] Merge the text files into a single file, 'output_file.txt'.
with open('output_file.txt', 'w') as out_file:
    for name in file_names:
        with open(name) as in_file:
            for line in in_file:
                out_file.write(line)

# Read the merged file into a DataFrame.
data = pandas.read_csv("output_file.txt", delimiter='/')

# Write the DataFrame out as a .csv file.
data.to_csv("convert_sample.csv", index=False)

So as you can see, I'm reading from all the files and merging them into a single .txt file. Then I convert it into a single .csv file. Is there a way to accomplish this without the middle step? Is it necessary to concatenate all my .txt files into a single .txt to convert it to .csv, or is there a way to directly convert multiple .txt files to a single .csv?

Thank you very much.

  • You might want to label your "middle step" with a comment. I don't see a problem with your code, as it does everything you said you needed. Commented Jun 11, 2021 at 20:30
  • Do you know the column names ahead of time? Commented Jun 11, 2021 at 20:36
  • Yes, the column names will be known ahead of time, and are the same for all of the text files. There will be between 3 and 5 text files at a time to be converted. Commented Jun 11, 2021 at 20:37

1 Answer 1

3

Of course it is possible. And you really don't need to involve pandas here, just use the standard library csv module. If you know the column names ahead of time, the most painless way is to use csv.DictWriter and csv.DictReader objects:

import csv
import glob

column_names = ['a', 'b', 'c']  # or whatever


with open("convert_sample.csv", 'w', newline='') as target:
    writer = csv.DictWriter(target, fieldnames=column_names)
    writer.writeheader() # if you want a header
    for path in glob.glob("./*.txt"):
        with open(path, newline='') as source:
            reader = csv.DictReader(source, delimiter='/', fieldnames=column_names)
            writer.writerows(reader)
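That said, if you'd rather stay with pandas as in your original code, you can also skip the middle step there: read each .txt file directly into a DataFrame and concatenate them before writing one .csv. A minimal sketch, assuming the same '/' delimiter and known column names (the helper name `txt_to_csv` is just for illustration):

```python
import glob

import pandas as pd

def txt_to_csv(pattern, out_path, names, delimiter='/'):
    # Read each matching .txt file directly into a DataFrame, concatenate
    # them, and write a single .csv -- no merged .txt file needed.
    frames = [pd.read_csv(p, delimiter=delimiter, names=names)
              for p in sorted(glob.glob(pattern))]
    pd.concat(frames, ignore_index=True).to_csv(out_path, index=False)

# e.g. txt_to_csv("./*.txt", "convert_sample.csv", ['a', 'b', 'c'])
```

This avoids both the intermediate file and the extra disk I/O, though for plain row-by-row copying the csv module above is lighter weight.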

7 Comments

Yes! Thank you for noting that the stdlib csv module is sufficient for this. It's disturbing how often folks are willing to add pandas as a dependency solely for basic csv processing.
@MichaelRuth yeah, it really drives me up the wall.
When I try this, I get a blank row in between rows with values. Would it be something to do with the newline=''?
@thentangler are you omitting that?
No I am not. When I open in Excel I see alternate blank rows, but when I try printing in Python I see hex Unicode nulls like '\x00'. How do I decode that?
