UnicodeDecodeError when reading CSV File into Dataframe

Question

I am using the code below to read a CSV file into a DataFrame. At first I got the error pandas.parser.CParserError: Error tokenizing data. C error: Expected 1 fields in line 4, saw 2, so I changed pd.read_csv('D:/TRYOUT.csv') to pd.read_csv('D:/TRYOUT.csv', error_bad_lines=False) as suggested here. Now I get a different error on the same line: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf0 in position 1: invalid continuation byte.

import pandas as pd

def ExcelFileReader():
    mergedf = pd.read_csv('D:/TRYOUT.csv', error_bad_lines=False)
    return mergedf

Could you supply an example CSV file which causes a failure? – Plasma, Commented Aug 10, 2015 at 22:26
This question is similar to: UnicodeDecodeError when reading CSV file in Pandas. If you believe it’s different, please edit the question, make it clear how it’s different and/or how the answers on that question are not helpful for your problem. – Joooeey, Commented Sep 24 at 11:10

maxymoo · Accepted Answer · 2015-08-10 23:51:28Z

1

If you're on Windows, you probably need to use pd.read_csv(filename, encoding='latin-1')
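
To illustrate the suggestion above, here is a minimal sketch. The sample data, column names, and temporary file are made up for the demonstration and are not from the question. It writes a small CSV encoded as Latin-1, shows that the default UTF-8 read fails, and then reads it with encoding='latin-1'.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data: "café" encoded as Latin-1 contains the byte 0xe9,
# which is not valid UTF-8 when followed by a newline.
raw = "name,drink\nAna,café\n".encode("latin-1")

with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    try:
        pd.read_csv(path)  # pandas decodes as UTF-8 by default
        utf8_failed = False
    except UnicodeDecodeError:
        utf8_failed = True

    # Latin-1 maps every possible byte to a character, so decoding cannot fail.
    df = pd.read_csv(path, encoding="latin-1")
finally:
    os.remove(path)

print(utf8_failed)
print(df)
```

Because Latin-1 accepts every byte, it will also silently produce wrong characters if the file was actually saved in some other encoding, so it is worth checking a few non-ASCII values after loading.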


Vishal Jethwa · Accepted Answer · 2015-08-11 15:04:44Z

0

I had a similar problem and had to use utf-8-sig as the encoding.

I used utf-8-sig because latin-1 won't handle non-Latin characters correctly if you ever get them. There are a few ways around the problem, so choose whichever best suits your needs.

Hope that helps.
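
As a rough illustration of the difference, the sketch below uses only Python's built-in codecs (the same codec names can be passed as encoding= to pd.read_csv). The sample strings are made up. It shows that utf-8-sig strips a leading byte order mark, and that latin-1 decodes UTF-8 bytes without raising but yields the wrong characters.

```python
import codecs

# A UTF-8 header line preceded by a byte order mark (BOM), as some Windows
# tools write it. The content is hypothetical.
bom_bytes = codecs.BOM_UTF8 + "name,city\n".encode("utf-8")

plain = bom_bytes.decode("utf-8")         # BOM survives as '\ufeff'
stripped = bom_bytes.decode("utf-8-sig")  # BOM is removed

# Non-Latin text stored as UTF-8 (no BOM this time).
greek = "Αθήνα".encode("utf-8")

as_latin1 = greek.decode("latin-1")       # no exception, but garbled text
as_utf8_sig = greek.decode("utf-8-sig")   # correct; the BOM is optional

print(repr(plain))
print(repr(stripped))
print(as_latin1 == "Αθήνα", as_utf8_sig == "Αθήνα")
```

Note that utf-8-sig only helps when the file really is UTF-8; a file in a legacy Windows code page will still raise UnicodeDecodeError with it.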


j__carlson · Accepted Answer · 2021-11-24 06:17:44Z

0

If you would like to skip the rows that cause errors and ignore malformed data, use:

pd.read_csv(file_path, encoding="utf8", error_bad_lines=False, encoding_errors="ignore")
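
One caveat on versions: error_bad_lines was deprecated in pandas 1.3 and removed in pandas 2.0, where on_bad_lines="skip" takes its place; encoding_errors was also added in 1.3. The sketch below assumes pandas 1.3 or newer and uses made-up sample data with one row that has an extra field and one row that contains an invalid UTF-8 byte.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data: the "Bob" row has three fields instead of two,
# and the last row contains 0xeb, which is not valid UTF-8 in this position.
raw = b"name,city\nAna,Lisbon\nBob,Oslo,extra\nZo\xeb,Paris\n"

with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    df = pd.read_csv(
        path,
        encoding="utf-8",
        encoding_errors="ignore",  # drop bytes that cannot be decoded
        on_bad_lines="skip",       # skip rows with too many fields
    )
finally:
    os.remove(path)

print(df)
```

Both options discard data silently, so this is best kept for cases where losing a few rows or characters is acceptable; choosing the correct encoding is usually the safer fix.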

answered Nov 23, 2021 at 11:31 by anita.baral; edited Nov 24, 2021 at 6:17 by j__carlson
