I have approximately 150 Excel workbooks (.xlsx) in a folder that I would like to read into a pandas DataFrame for analysis.
Each workbook is set up identically with the same sheet names and column names.
I need to load the first sheet of each workbook ("Keywords Rankings") into a single DataFrame. For the first workbook read in, I want to start on row 11 to keep the column headers; for every workbook after that, I want to append its data to my DataFrame starting on row 12.
I am new to Python and have been reading some instructions online but am stuck. From my understanding, I could use the xlrd library to facilitate this.
I've been playing around with the code below but haven't gotten far. 'Keywords Rankings' is the sheet name I want to append.
import pandas as pd
import numpy as np
import glob as glob
all_data = pd.DataFrame()
all_data = pd.ExcelFile("C:\\Users\\John Smith\\Documents\\Analysis\\FPR Nov - Mar 2018\\Dec_1_General.xlsx")
print(all_data.sheet_names)
all_d = all_data.parse('Keywords Rankings')
for f in glob.glob("Users\\John Smith\\Documents\\Analysis\\FPR Nov - Mar 2018\\*.xlsx", recursive=True):
    df = pd.read_excel(f)
    all_d = all_d.append(df,ignore_index=True)
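For reference, a minimal sketch of the result described above might look like the following. It rests on several assumptions: row 11 of the 'Keywords Rankings' sheet holds the column headers in every workbook, openpyxl is installed (recent xlrd releases no longer read .xlsx files), and the function name combine_workbooks and its parameters are hypothetical. Because every workbook has the same headers, each file is read with row 11 as its header and the frames are joined with pd.concat, which gives the same rows as appending from row 12 onward. DataFrame.append was removed in pandas 2.0, so it is not used here.

```python
from pathlib import Path

import pandas as pd


def combine_workbooks(folder, sheet_name="Keywords Rankings", header_row=11):
    """Read one sheet from every .xlsx file in `folder` into one DataFrame.

    `header_row` is the 1-based spreadsheet row that holds the column names.
    (Hypothetical helper, named for illustration only.)
    """
    frames = []
    for path in sorted(Path(folder).glob("*.xlsx")):
        # pandas counts header rows from 0, so spreadsheet row 11 is header=10.
        # Rows above the header are ignored; data starts on the next row.
        df = pd.read_excel(path, sheet_name=sheet_name, header=header_row - 1)
        frames.append(df)

    if not frames:
        # No workbooks matched the pattern; return an empty frame.
        return pd.DataFrame()

    # Concatenate once at the end rather than growing a frame inside the loop.
    return pd.concat(frames, ignore_index=True)
```

Calling combine_workbooks(r"C:\Users\John Smith\Documents\Analysis\FPR Nov - Mar 2018") would then return a single DataFrame with the rows of all workbooks stacked under one set of column headers.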