Pandas cannot read excel data as string

Question

I am trying to read a series of Excel files in a loop and build a master DataFrame. All the files have the same columns, but one column is a string in some files and an int in others. I want to read everything as strings to avoid problems. Pandas reads the first file, but the rows from all the others show up as NaN/NaT in my DataFrame. What did I do wrong?

for f in glob.glob("C:\Consoildated_DailyReports\Hold*.xlsx"):
    df = pd.read_excel(f,sheet_name='Data')
    df = df.astype(str)
    #df.to_html()
    data1 = data1.append(df,ignore_index=True)

data1

Charles Landau · Accepted Answer · 2018-11-09 17:05:23Z

4

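pd.read_excel(..., dtype={"col_name": object}) can do it. The dtype argument tells pandas which data type to use for a column while it parses the file, instead of letting pandas infer one. Here "col_name" stands for the name of the column whose type varies between files.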

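import glob
import pandas as pd

frames = []
for f in glob.glob(r"C:\Consoildated_DailyReports\Hold*.xlsx"):
    # "col_name" is a placeholder for the column whose type varies
    df = pd.read_excel(f, sheet_name='Data', dtype={"col_name": object})
    frames.append(df.astype(str))
# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
data1 = pd.concat(frames, ignore_index=True)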
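If every column should come in as text, one option is to pass dtype=str rather than a per-column mapping. The following is only a minimal sketch under that assumption: read_all_as_str is a hypothetical helper name, the sheet name 'Data' is taken from the question, and reading .xlsx files assumes an Excel engine such as openpyxl is installed.

```python
import glob

import pandas as pd


def read_all_as_str(pattern, sheet_name="Data"):
    """Read every workbook matching `pattern` with all columns as strings.

    Hypothetical helper for illustration; returns an empty DataFrame
    when the pattern matches no files.
    """
    paths = sorted(glob.glob(pattern))
    # dtype=str asks pandas to read the cell values as strings
    frames = [pd.read_excel(p, sheet_name=sheet_name, dtype=str) for p in paths]
    if not frames:
        # pd.concat raises ValueError on an empty list, so handle it here
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

It could then be called as data1 = read_all_as_str(r"C:\Consoildated_DailyReports\Hold*.xlsx"). Empty cells may still come back as NaN rather than as strings, so a follow-up fillna("") might be wanted depending on the data.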


2 Comments

Victor Over a year ago

Thanks. It still does not work. And when I try to write the final data1 to an Excel file, I only get the result of reading the first file.

Charles Landau Over a year ago

Is the glob properly populating?
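One way to check whether the glob is populating is to list what the pattern matches before reading anything. This is a small sketch, not a diagnosis of the actual problem; matching_files is a hypothetical helper name and the path is the one from the question.

```python
import glob


def matching_files(pattern):
    """Return the paths matched by a glob pattern, sorted for stable output."""
    return sorted(glob.glob(pattern))


# Raw string so the backslashes in the Windows path are kept literally
paths = matching_files(r"C:\Consoildated_DailyReports\Hold*.xlsx")
print(len(paths), "file(s) matched")
for p in paths:
    print(p)
```

If this prints only one path, or none, the loop never sees the other files, and the issue is with the pattern or the folder rather than with the dtype.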
