1

How to merge all csv files in a specific folder using os.

So the code bellow does the work but it concat all the files in the same directory where the script lives.

How do utilized it in different folders?

My code:

import os
import pandas as pd
import numpy as np



def get_df():
    df=pd.DataFrame()
    for file in os.listdir():
        if file.endswith('.csv'):
            aux=pd.read_csv(file, error_bad_lines=False)
            df=df.append(aux)
    return df


df=get_df()


df.to_csv(f"file_name.csv")

I have tried adding folder name, but doesn't find the files.

for file in os.listdir('My_folder_name\'):

8
  • Does it have to be in python? Or would bash work: cat *.csv > all_files.csv Commented Jul 9, 2020 at 23:30
  • @JD Fries, needs to be python because all the code base is in a jupyter notebook, merging the files is the icing on the cake Commented Jul 9, 2020 at 23:32
  • Your script is probably not running in the directory where your files are, try to print out os.getcwd() to see where it is running from, or use a full path in listdir() Commented Jul 9, 2020 at 23:36
  • @JDFrias, exactly. so the code works if the files are in the same directory. So the goal is access files in different directory/folder Commented Jul 9, 2020 at 23:54
  • I see you're using a back-slash in your example, probably a window path problem? Try os.path.join() or pathlib to build the path, or use forward-slash / not the best way though. Commented Jul 10, 2020 at 0:03

2 Answers 2

1

I found at os module documentation a code that changes the directory os.chdir("C:\\Users\\Desktop\\my_folder_name\\my_new_folder_name").

https://www.tutorialsteacher.com/python/os-module

So I just add this line before the loop and now it works!!

def get_df():
    df=pd.DataFrame()
    os.chdir("C:\\Users\\Desktop\\my_folder_name\\my_new_folder_name")
    for file in os.listdir():
        if file.endswith('.csv'):
            aux=pd.read_csv(file, error_bad_lines=False)
            df=df.append(aux)
    return df


df=get_df()

df.to_csv(f"file_name.csv")
Sign up to request clarification or add additional context in comments.

1 Comment

Does this work? os.listdir("C:\\Users\\Desktop\\my_folder_name\\my_new_folder_name")
1

There are various ways to solve it, depending on the type of merge one wants to do.

Considering your specific requirements (Python and os), and assuming one wants to concat the files, the following will do the work (including for files with the same header)

import os

os.system("awk '(NR == 1) || (FNR > 1)' file*.csv > merged.csv")

Where NR and FNR represent the number of the line being processed.

FNR is the current line within each file.

NR == 1 includes the first line of the first file (the header), while (FNR > 1) skips the first line of each subsequent file.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.