
I couldn't find code for this exact task on the platform, so I am posting it here for suggestions.

I have multiple CSV files (about 100) with the same data formats and header names.

 ,Mean,SD
1,96.432,13.899
2,96.432,13.899
3,96.432,13.899
4,96.432,13.899
5,96.432,13.899

I want to combine all the files column-wise so that I have them in one file. The header of each block of data should be the file name, so that I can tell which data belongs to which file. For example, there would be an extra row with the file name above the Mean, SD row.

Please guide me, as I am new to Python.

Thank you and regards, Khan.

  • Can you provide an example of the expected output (for example for 2 files, "file1.csv" and "file2.csv")? Commented Nov 29, 2021 at 9:35
  • Thank you for your comment. I want the output as below:

    BT - N - 015       BT - N - 013        BT - N - 012
    ,Mean,SD           ,Mean,SD            ,Mean,SD
    1,96.432,13.899    1,107.068,20.890    1,105.122,31.229
    2,96.432,13.899    2,107.068,20.890    2,105.122,31.229
    3,96.432,13.899    3,107.068,20.890    3,105.122,31.229
    4,96.432,13.899    4,107.068,20.890    4,105.122,31.229
    5,96.432,13.899    5,107.068,20.890    5,105.122,31.229
    6,96.432,13.899    6,107.068,20.890    6,105.122,31.229
    7,96.432,13.899    7,107.068,20.890    7,105.122,31.229
    8,96.432,13.899    8,107.068,20.890    8,105.122,31.229

    The first cell shows the file name. Commented Nov 29, 2021 at 9:39
  • I'm afraid I can't understand your specs re the output file, could you please edit your question, possibly adding an example of the intended output? Commented Nov 29, 2021 at 9:39
  • @Khan please edit your question and provide the output as text Commented Nov 29, 2021 at 9:40
  • The first CSV (BT - N - 015) has this data:

    ,Mean,SD
    1,96.432,13.899
    2,96.432,13.899
    3,96.432,13.899
    4,96.432,13.899
    5,96.432,13.899
    6,96.432,13.899
    7,96.432,13.899
    8,96.432,13.899
    9,96.432,13.899
    10,96.432,13.899
    11,96.432,13.899

    and so on... Commented Nov 29, 2021 at 9:42

3 Answers

The question was vague about formatting, so this may vary from the desired output.

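import pandas as pd

filenames = [...]  # list of CSV file paths
dfs = []
for f in filenames:
    newdf = pd.read_csv(f)
    # rename returns a new DataFrame, so the result must be assigned
    newdf = newdf.rename(columns={'Mean': 'Mean ' + f, 'SD': 'SD ' + f})
    dfs.append(newdf)
# The default axis=0 stacks the frames row-wise; pass axis=1 to place them side by side
df = pd.concat(dfs)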

1 Comment

This won't perform a column-wise concatenation
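
For reference, here is a minimal sketch of a column-wise variant of the loop above. It assumes every file shares the same unnamed row-number column, and it uses two small in-memory CSVs (with the hypothetical names `file1.csv` and `file2.csv`) in place of real files so that it runs as-is.

```python
import io

import pandas as pd

# Hypothetical stand-ins for files on disk; with real files, pass the
# paths to pd.read_csv instead of StringIO objects.
sources = {
    "file1.csv": ",Mean,SD\n1,96.432,13.899\n2,96.432,13.899\n",
    "file2.csv": ",Mean,SD\n1,107.068,20.890\n2,107.068,20.890\n",
}

dfs = []
for name, text in sources.items():
    # index_col=0 uses the unnamed row-number column as the index
    newdf = pd.read_csv(io.StringIO(text), index_col=0)
    # rename returns a new DataFrame, so the result has to be assigned
    newdf = newdf.rename(columns={"Mean": "Mean " + name, "SD": "SD " + name})
    dfs.append(newdf)

# axis=1 aligns the frames on their index and places them side by side
df = pd.concat(dfs, axis=1)
print(df)
```

Because the frames are aligned on the index, files with different numbers of rows would produce NaN in the shorter columns.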

You can use pandas to read and concatenate the files, together with glob and a dictionary comprehension:

from glob import glob
import pandas as pd

files = glob('/tmp/*.csv') # change the location/pattern accordingly
# if you have a list of files, use: files=['file1.csv', 'file2.csv'...]

df = pd.concat({fname.rsplit('/')[-1]: pd.read_csv(fname, index_col=0)
                for fname in files}, axis=1)

output:

>>> print(df)
  file1.csv         file2.csv        
       Mean      SD      Mean      SD
                                     
1    96.432  13.899    96.432  13.899
2    96.432  13.899    96.432  13.899
3    96.432  13.899    96.432  13.899
4    96.432  13.899    96.432  13.899
5    96.432  13.899    96.432  13.899

Saving to new file:

df.to_csv('concatenated_file.csv')

output:

,file1.csv,file1.csv,file2.csv,file2.csv
,Mean,SD,Mean,SD
 ,,,,
1,96.432,13.899,96.432,13.899
2,96.432,13.899,96.432,13.899
3,96.432,13.899,96.432,13.899
4,96.432,13.899,96.432,13.899
5,96.432,13.899,96.432,13.899
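
The expected output in the question's comments uses the bare file name (for example `BT - N - 015`) as the header. Below is a hedged sketch of that variation, using `pathlib.Path.stem` to drop both the directory and the `.csv` extension. The temporary folder and the two sample files are made up for illustration, and the first header cell is assumed to be empty. The last step reads the saved file back with `header=[0, 1]` so the two header rows become a column MultiIndex again.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Build two small sample files in a temporary folder (illustrative data only)
folder = Path(tempfile.mkdtemp())
(folder / "BT - N - 015.csv").write_text(",Mean,SD\n1,96.432,13.899\n2,96.432,13.899\n")
(folder / "BT - N - 013.csv").write_text(",Mean,SD\n1,107.068,20.890\n2,107.068,20.890\n")

# sorted() gives a reproducible column order; glob order is not guaranteed
files = sorted(folder.glob("*.csv"))

# Path.stem is the file name without its directory or extension
df = pd.concat({f.stem: pd.read_csv(f, index_col=0) for f in files}, axis=1)

# Written after the glob above, so the output is not picked up as an input
out = folder / "concatenated_file.csv"
df.to_csv(out)

# header=[0, 1] restores the two header rows as MultiIndex columns
back = pd.read_csv(out, header=[0, 1], index_col=0)
print(back)
```

If the script is rerun on the same folder, writing the combined file somewhere else avoids reading it in as one of the inputs.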

14 Comments

Thank you, but I am getting the error 'ValueError: No objects to concatenate'.
I guess you used files = glob('/tmp/*.csv'), but you probably don't have the files in /tmp. Change the code to use your folder.
I have changed it to '/home/user/khan/Desktop/check/*.csv'
OK, and does it work as you expected? Note that it will read all the CSV files in your check folder and use the full name as the header (I'll update the answer to keep only the file name).
No. It gives the error I mentioned, 'No objects to concatenate'. Thank you for changing the header from the full name, but why is it giving me this error? Yes, I want to combine all CSV files in the check folder.
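
`pd.concat` raises `ValueError: No objects to concatenate` when it is handed an empty collection, which here most likely means `glob` matched no files. A small sketch for checking that before concatenating; `find_csv_files` is a hypothetical helper name, and the folder path in the comment is the one quoted above, used only as a placeholder.

```python
from glob import glob

import pandas as pd


def find_csv_files(pattern):
    """Return the sorted matches for pattern, or raise a clearer error."""
    files = sorted(glob(pattern))
    if not files:
        # glob returns an empty list (no exception) when nothing matches,
        # for example after a typo in the folder name
        raise FileNotFoundError(f"No files match {pattern!r}")
    return files


# An empty dict reproduces the error from the comments
try:
    pd.concat({}, axis=1)
except ValueError as err:
    message = str(err)
    print(message)

# Intended use (placeholder path):
# files = find_csv_files('/home/user/khan/Desktop/check/*.csv')
```

Printing `files` right after the `glob` call is an even quicker check: an empty list means the pattern or folder is wrong.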

You can use pandas for this:

In [4]: import pandas as pd

In [13]: ls                                                                                                                                           
abc1.csv  abc.csv

In [14]: df = pd.read_csv('abc.csv')                                                                                                                  

In [15]: df1 = pd.read_csv('abc1.csv')                                                                                                                

In [16]: df                                                                                                                                           
Out[16]: 
        Mean      SD
0  1  96.432  13.899
1  2  96.432  13.899


In [17]: df1                                                                                                                                          
Out[17]: 
        Mean      SD
0  3  96.432  13.899
1  4  96.432  13.899
2  5  96.432  13.899

In [18]: df.append(df1)                                                                                                                               
Out[18]: 
        Mean      SD
0  1  96.432  13.899
1  2  96.432  13.899
0  3  96.432  13.899
1  4  96.432  13.899
2  5  96.432  13.899

In [19]: ds = df.append(df1)                                                                                                                          

In [20]: ds                                                                                                                                           
Out[20]: 
        Mean      SD
0  1  96.432  13.899
1  2  96.432  13.899
0  3  96.432  13.899
1  4  96.432  13.899
2  5  96.432  13.899



In [21]: ds.to_csv('file1.csv')  


In [23]: ls                                                                                                                                           
abc1.csv  abc.csv  file1.csv

To deal with multiple files:

In [82]: import  pandas as pd                                                                                                                         

In [83]: import os, glob                                                                                                                              

In [84]: s = glob.glob(os.path.join(os.getcwd(),'*.csv'))                                                                                             

In [85]: s                                                                                                                                            
Out[85]: 
['/home/thinkpad/Desktop/stackoverflow/abc1.csv',
 '/home/thinkpad/Desktop/stackoverflow/abc.csv']


In [90]: df = pd.DataFrame(columns = ['in','Mean','SD']) 
    ...: for i in s: 
    ...:     df1 = pd.read_csv(i) 
    ...:     print(df1.head()) 
    ...:     df = df.append(df1) 

In [91]: df                                                                                                                                           
Out[91]: 
  in    Mean      SD
0  3  96.432  13.899
1  4  96.432  13.899
2  5  96.432  13.899
0  1  96.432  13.899
1  2  96.432  13.899
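
`DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the sessions above only run on older versions. Below is a sketch of the same row-wise stacking with `pd.concat`, using in-memory stand-ins for `abc.csv` and `abc1.csv`. The values are copied from the outputs above, and the `in` column name is borrowed from the loop; both are assumptions about the real files.

```python
import io

import pandas as pd

# In-memory stand-ins for abc.csv and abc1.csv (assumed contents)
abc = "in,Mean,SD\n1,96.432,13.899\n2,96.432,13.899\n"
abc1 = "in,Mean,SD\n3,96.432,13.899\n4,96.432,13.899\n5,96.432,13.899\n"

frames = [pd.read_csv(io.StringIO(text)) for text in (abc, abc1)]

# Collect the frames in a list and concatenate once; this replaces
# repeated df = df.append(df1) calls
ds = pd.concat(frames, ignore_index=True)
print(ds)
```

Note that this stacks the files on top of each other. For the side-by-side layout asked for in the question, `pd.concat` needs `axis=1`.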

3 Comments

There are 100 files ;)
Thank you for your suggestion. I have multiple files, 100 or so; how can I give each one its own data frame? Can I use this code to name the data frames automatically?
Dear sir, I have updated the answer to handle this; please have a look. Your folder can now contain hundreds or thousands of files and it will still work. Thank you.
