Pandas - Adding dummy header column in csv

Question

I am trying to concatenate several CSV files by customer group using the code below:

import glob
import pandas as pd
files = glob.glob(file_from + "/*.csv")  # paths of all CSV files in the file_from folder
df_v0 = pd.concat([pd.read_csv(f) for f in files])  # one DataFrame built from all of those files

The problem is that the number of columns in each CSV varies by customer, and the files do not have a header row.

I am trying to see if I could add a dummy header row with labels such as col_1, col_2, ... depending on the number of columns in that CSV.

Could anyone guide me on how to get this done? Thanks.

Update on trying to search for a specific string in the Dataframe:

Sample Dataframe

col_1,col_2,col_3
fruit,grape,green
fruit,watermelon,red
fruit,orange,orange
fruit,apple,red

I am trying to keep only the rows containing the word red and expect rows 2 and 4 to be returned.

I tried the code below:

df[~df.apply(lambda x: x.astype(str).str.contains('red')).any(axis=1)]

jezrael · Accepted Answer · 2018-11-02 09:36:50Z


Use the parameter header=None to get the default integer column names 0, 1, 2, ..., and add skiprows=1 if you need to skip an original row of column names:

df_v0 = pd.concat([pd.read_csv(f, header=None, skiprows=1) for f in files])

If you also want to change the column names, add rename:

dfs = [pd.read_csv(f, header=None, skiprows=1).rename(columns = lambda x: f'col_{x + 1}') 
        for f in files]
df_v0 = pd.concat(dfs)
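
A minimal sketch of the same idea on files whose column counts differ. The in-memory strings below are made-up stand-ins for the customer files, and read_headerless is a hypothetical helper name. skiprows=1 is left out here on the assumption that the files really have no header row, since with it the first data row of each file would be dropped.

```python
import io
import pandas as pd

# Hypothetical stand-ins for two header-less customer files
# with different numbers of columns.
csv_a = io.StringIO("fruit,grape,green\nfruit,apple,red\n")
csv_b = io.StringIO("veg,carrot\n")

def read_headerless(source):
    """Read a CSV that has no header row and label its columns col_1, col_2, ..."""
    df = pd.read_csv(source, header=None)
    # header=None gives integer column labels 0, 1, 2, ... which are renamed here.
    return df.rename(columns=lambda x: f"col_{x + 1}")

# concat aligns on column names; columns missing from a file are filled with NaN.
df_v0 = pd.concat([read_headerless(s) for s in (csv_a, csv_b)], ignore_index=True)
print(df_v0)
```

Because every file gets the same col_N labels, the narrower files line up with the wider ones and simply have missing values in the extra columns.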

edited Nov 2, 2018 at 9:36

answered Nov 2, 2018 at 9:07

jezrael


10 Comments

dark horse Over a year ago

One more request for help. files is a list of filenames. Some filenames are written in upper case (e.g. FILE1.CSV) and some in lower case (e.g. file2.csv). How could we make them all lower case? Could you please assist with that? Thanks.

jezrael Over a year ago

@darkhorse - I am not sure I understand. files returns a list of filenames with upper-case and lower-case names. They are then looped over and the DataFrames are created. If you change the filenames to lower case, errors will be raised because the file does not exist. But if you really need it, use pd.read_csv(f.lower(), ...

dark horse Over a year ago

If I got that right, are filenames case-sensitive when read in pandas? For example, if the file name is FILE1.CSV and I pass in file1.csv, will it fail because filenames are case-sensitive?
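
Whether file1.csv opens a file stored as FILE1.CSV depends on the file system rather than on pandas, so one option is to leave the names untouched and compare only the extension case-insensitively. A rough sketch of that, where list_csv_files is a hypothetical helper and the temporary folder exists only for the demonstration:

```python
import tempfile
from pathlib import Path

def list_csv_files(folder):
    """Return paths of files in folder whose extension is .csv in any letter case."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".csv"
    )

# Demonstration on a throwaway folder with mixed-case file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("FILE1.CSV", "file2.csv", "notes.txt"):
        (Path(tmp) / name).write_text("a,b\n")
    found = {p.name for p in list_csv_files(tmp)}

print(found)
```

The returned paths keep their original spelling, so they can be passed to pd.read_csv as they are.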

dark horse Over a year ago

I have another question. I am trying to search for a specific text in the entire DataFrame (df_v0), so I need to scan through all rows and columns. I am able to filter by a specific column but am not sure how to extend this to the entire DataFrame.

jezrael Over a year ago

@darkhorse - You are really close; only remove the ~, like df[df.apply(lambda x: x.astype(str).str.contains('red')).any(axis=1)]
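
A small sketch of that filter on the sample DataFrame from the question. The names mask, matches and non_matches are illustrative, and regex=False is an added assumption that the search term is plain text rather than a pattern.

```python
import io
import pandas as pd

# The sample data from the question.
df = pd.read_csv(io.StringIO(
    "col_1,col_2,col_3\n"
    "fruit,grape,green\n"
    "fruit,watermelon,red\n"
    "fruit,orange,orange\n"
    "fruit,apple,red\n"
))

# True for every row in which at least one cell contains the substring "red".
mask = df.apply(lambda col: col.astype(str).str.contains("red", regex=False)).any(axis=1)

matches = df[mask]       # rows containing "red"
non_matches = df[~mask]  # with ~ the selection is inverted
print(matches)
```

The matching rows are the second and fourth rows of the sample, which pandas labels 1 and 3 because the default index starts at 0. Note that str.contains matches substrings, so a cell such as "reddish" would also match.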
