5

I love pandas, but I am having real problems with Unicode errors. read_excel() returns the dreaded Unicode error:

import pandas as pd
df=pd.read_excel('tmp.xlsx',encoding='utf-8')
df.describe()

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 259: ordinal not in range(128)

I figured out that the original Excel file had a non-breaking space (U+00A0, which is the byte pair 0xC2 0xA0 in UTF-8) at the end of many cells, probably to prevent long digit strings from being converted to float.
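To illustrate why the traceback points at byte 0xc2, here is a minimal sketch using only the standard library. The digit string is made-up sample data, not a value from the actual workbook; the point is only that a trailing non-breaking space encodes to 0xC2 0xA0 in UTF-8 and that decoding those bytes as ASCII fails on the first of the two.

```python
# U+00A0 (non-breaking space) becomes two bytes in UTF-8: 0xC2 0xA0.
nbsp = u"\u00a0"

# Hypothetical cell content: a long digit string with a trailing NBSP.
encoded = (u"12345678901234567890" + nbsp).encode("utf-8")

# Decoding those bytes with the ASCII codec fails at the 0xC2 byte,
# which is the same kind of failure the traceback reports.
try:
    encoded.decode("ascii")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(repr(encoded[-2:]))
print(error_message)
```

In Python 2 this implicit ASCII decode can happen silently whenever byte strings and unicode strings are mixed, which is likely what happens somewhere inside the call chain here.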

One way around this is to strip the cells, but there must be something better.

for col in df.columns:
    df[col]=df[col].str.strip()
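A slightly more defensive variant of that loop might look like the sketch below. It assumes Python 3 (where str.strip() treats U+00A0 as whitespace), and the helper name strip_text_cells and the sample frame are hypothetical, used only for illustration. It strips string cells and leaves numbers and other values alone, so it does not depend on every column being text.

```python
import pandas as pd


def strip_text_cells(frame):
    """Return a copy of frame with whitespace stripped from string cells.

    Non-string values (numbers, NaN, dates) are passed through unchanged.
    """
    cleaned = frame.copy()
    for col in cleaned.columns:
        cleaned[col] = cleaned[col].map(
            lambda value: value.strip() if isinstance(value, str) else value
        )
    return cleaned


# Made-up sample data: digit strings with a trailing non-breaking space.
df = pd.DataFrame(
    {
        "id": [u"00123456789012345\u00a0", u"00987654321098765\u00a0"],
        "qty": [1, 2],
    }
)
clean = strip_text_cells(df)
print(clean["id"].tolist())
```

This only removes the symptom after loading; it does not change how the file is read.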

I am using Anaconda 2.2.0 (win64) with pandas 0.16.

2
  • this worked for me once: df['somecol'].values.astype('unicode') github.com/pydata/pandas/issues/7758 Commented Jun 10, 2015 at 20:21
  • Do yourself a big favour and switch to python3 right away. Encoding problems are all solved in python3. Commented Jun 11, 2015 at 12:39

3 Answers

3

Try this method suggested here:

import sys
df=pd.read_excel('tmp.xlsx',encoding=sys.getfilesystemencoding())


1

Hope this helps someone.

I had this error:

UnicodeDecodeError: 'ascii' codec can't decode byte ....

after reading an Excel file with df = pd.read_excel... and then trying to assign a new column to the dataframe with df['new_col'] = 'foo bar'.

After closer inspection, I found the cause: there were some 'nan' columns in the dataframe due to missing column headers. After dropping those columns with the following code, everything else worked.

df = df.dropna(axis=1,how='all')
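As a rough illustration of what that call does, here is a small sketch with a made-up frame (the column names and values are assumptions, not taken from the real spreadsheet). With how='all', only columns in which every value is NaN are dropped; a column that is merely partly empty is kept.

```python
import numpy as np
import pandas as pd

# Made-up frame: "empty" is entirely NaN, "partial" has one real value.
df = pd.DataFrame(
    {
        "name": ["a", "b"],
        "empty": [np.nan, np.nan],
        "partial": [1.0, np.nan],
    }
)

# axis=1 works on columns; how="all" drops a column only if all cells are NaN.
trimmed = df.dropna(axis=1, how="all")

# Adding a constant column afterwards, as in the answer above.
trimmed = trimmed.assign(new_col="foo bar")

print(list(trimmed.columns))
```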


0
df=pd.read_excel(r'paste path')
df.describe()

2 Comments

Might you please edit your answer and explain how your post answers the question, which is about reading an Excel file whose cells contain non-breaking spaces that are encoded in an unexpected manner? Thanks!
Not my answer, but from my limited experience, it's opening the excel file read-only, which may work if your problem is having the file open in Excel when you try this. That's the error I made, and this solution worked for me, but it's apparent from this thread that more than one cause can give rise to similar symptoms.
