I have a list of dataframes. Each dataframe was originally numerical data taken from which are all shaped identically with 21 rows and 5 columns. The first column is an index (index 0 to index 20). I want to compute the average (mean) values into a single dataframe. Then I want to export the dataframe to excel.
Here's a simplified version of my existing code:
#look to concatenate the dataframes together all at once
#dataFrameList is the given list of dataFrames
concatenatedDataframes = pd.concat(dataFrameList, axis = 1)
#grouping the dataframes by the index, which is the same across all of the dataframes
groupedByIndex = concatenatedDataframes.groupby(level = 0)
#take the mean
meanDataFrame = groupedByIndex.mean()
# Create a Pandas Excel writer using openpyxl as the engine.
writer = pd.ExcelWriter(filepath, engine='openpyxl')
meanDataFrame.to_excel(writer)
However, when I open the excel file, I see what looks like EVERY dataframe is copied into the sheet and the average/mean values are not shown. A simplified example is shown below (cutting most of the rows and dataframes)
Dataframe 1 Dataframe 2 Dataframe 3
Index Col2 Col3 Col4 Col5 Col2 Col3 Col4 Col5 Col2 Col3 Col4 Col5
0 Data Data Data Data Data Data Data Data Data Data Data Data
1 Data Data Data Data Data Data Data Data Data Data Data Data
2 Data Data Data Data Data Data Data Data Data Data Data Data
....
I'm looking for something more like:
Averaged DF
Index Col2 Col3 Col4
0 Mean Index0,Col2 across DFs Mean Index0,Col3 across DFs Mean Index0,Col4 across DFs
1 Mean Index1,Col2 across DFs Mean Index1,Col3 across DFs Mean Index1,Col4 across DFs
2 Mean Index2,Col2 across DFs Mean Index2,Col3 across DFs Mean Index3,Col4 across DFs
...
I have also already seen this answer: Get the mean across multiple Pandas DataFrames
If possible, I'm looking for a clean solution, not one which would simply involve looping through each dataFrame value by value. Any suggestions?