I want to read CSV files in a for loop using pandas. I have put the file names in a list. On each iteration, the file's rows should be appended to result. With the following code, only one file ends up in result:

import pandas as pd

files = ['fileA.csv' , 'fileB.csv']
result = None

for files in files:
    df1 = pd.read_csv(files)
    df1['JourneyID'] = 'Journey2'
    df1.set_index('JourneyID', inplace=True)
    df1b = df1.head(15)
    if result is None:
        result = df1b
    else:
        result.append(df1b)

result.head(30)

Any help please?

  • Are you sure the above code is correctly indented? It looks like your if/else should be inside the for loop. Commented Feb 6, 2018 at 14:30
  • It is inside the for loop, you're right - that was a typo. Commented Feb 6, 2018 at 14:33
  • Can you provide sample input and output? Commented Feb 6, 2018 at 14:35
  • The files contain information about the journey, e.g. vehicle, weather conditions, and time of the journey. They all have the same column names and format. If I don't use a for loop, it works. It seems that the append isn't done correctly. I'm not quite sure whether the if/else statement together with result is None is being used correctly. Commented Feb 6, 2018 at 14:46
  • df.append returns a new DataFrame; it doesn't update the frame in place. It's not ideal, as you probably want to handle this a different way (depending on your needs and constraints), but you can use result = result.append(df1b) in the else branch to bind result to the new DataFrame, so that it keeps the previous and new rows on each iteration. Commented Feb 6, 2018 at 14:50

1 Answer

The problem is in your use of the .append() method. result is a DataFrame, and its append method behaves differently from that of, say, a Python list. Whereas list.append() modifies the list in place, DataFrame.append() returns a new object and leaves the original unchanged. So you need to write

result = result.append(df1b)

For further information, refer to the docs:

DataFrame.append(other, ignore_index=False, verify_integrity=False)
Append rows of other to the end of this frame, returning a new object. Columns not in this frame are added as new columns.
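Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions the line above raises an AttributeError. A minimal sketch of the same loop using pd.concat instead follows. It assumes in-memory CSV text (the csv_sources dict and its contents are hypothetical stand-ins for fileA.csv and fileB.csv) so that it runs without files on disk:

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for fileA.csv and fileB.csv.
csv_sources = {
    "fileA.csv": "vehicle,weather\ncar,sunny\nbus,rain\n",
    "fileB.csv": "vehicle,weather\ntram,fog\n",
}

frames = []
for name, text in csv_sources.items():
    # With real files this would be pd.read_csv(name).
    df = pd.read_csv(io.StringIO(text))
    df["JourneyID"] = "Journey2"  # same constant label as in the question
    df = df.set_index("JourneyID")
    # list.append mutates the list in place, so no reassignment is needed here.
    frames.append(df.head(15))

# A single concatenation after the loop; pd.concat returns a new DataFrame.
result = pd.concat(frames)
print(result.head(30))
```

Collecting the pieces in a list and concatenating once also avoids copying the growing DataFrame on every iteration.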

3 Comments

Yes, it worked. I'd like to ask one more thing, please. Ideally every file should have a different index value, e.g. file1 should have Journey1 as its index value and file2 should have Journey2. Does Python have something for this? Like macros?
If I understand you correctly, you can achieve this by using for i, file in enumerate(files) at the top of your for loop and then df1['JourneyID'] = 'Journey' + str(i)
Yes, this is exactly what I was asking. I still need to look into it because I'm getting weird results, but thanks a lot!
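As a small sketch of the enumerate suggestion: enumerate counts from 0 by default, so the snippet in the comment would produce Journey0 and Journey1. Passing start=1 gives the Journey1, Journey2 numbering asked for. The labels dict here is only an illustration and is not part of the original code:

```python
files = ["fileA.csv", "fileB.csv"]

# start=1 makes the first file "Journey1" rather than "Journey0".
labels = {}
for i, file in enumerate(files, start=1):
    labels[file] = "Journey" + str(i)

print(labels)
```

Inside the real loop, the same expression would be assigned to df1['JourneyID'] before calling set_index.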
