My code is:

import pandas as pd


movies = pd.read_csv("movies.csv")

However, when I run the code I get this error:

  File "pandas\_libs\parsers.pyx", line 859, in pandas._libs.parsers.TextReader.read
  File "pandas\_libs\parsers.pyx", line 874, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas\_libs\parsers.pyx", line 951, in pandas._libs.parsers.TextReader._read_rows
  File "pandas\_libs\parsers.pyx", line 1083, in pandas._libs.parsers.TextReader._convert_column_data
  File "pandas\_libs\parsers.pyx", line 1136, in pandas._libs.parsers.TextReader._convert_tokens
  File "pandas\_libs\parsers.pyx", line 1253, in pandas._libs.parsers.TextReader._convert_with_dtype
  File "pandas\_libs\parsers.pyx", line 1268, in pandas._libs.parsers.TextReader._string_convert
  File "pandas\_libs\parsers.pyx", line 1458, in pandas._libs.parsers._string_box_utf8

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbd in position 2: invalid start byte

Does anyone know why my code isn't working and how I could fix it? I'm sorry that I can't include my data set, but it has 4,000,000 lines.

This question's answer does not work for me: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 8

  • It would be helpful if you say what you tried in the answer you linked and what the result was, so that we don't waste both of our time going through things that you've already tried. Commented Mar 25, 2020 at 23:32

1 Answer

I figured it out:

import pandas as pd

# The file is not UTF-8, so tell pandas which encoding to decode it with.
# nrows limits the read to the first 1,000,000 rows.
movies = pd.read_csv("movies.csv", encoding="latin1", nrows=1000000)
print(movies["score"].describe())

Apparently my file was encoded as 'latin1' (ISO-8859-1) rather than UTF-8, which is what read_csv assumes by default.
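One caveat: 'latin1' maps every possible byte to some character, so decoding with it never raises an error even when the file is really in another encoding. If you are not sure what the file is, a rough sketch like the one below tries a few candidates in order and reports the first one that decodes the whole file. The helper name `first_working_encoding` and the candidate list are my own assumptions, not anything pandas provides, and a clean decode does not prove the text is correct, so check a few rows by eye.

```python
def first_working_encoding(path, candidates=("utf-8", "cp1252", "latin1")):
    """Return the first candidate encoding that decodes the whole file.

    Hypothetical helper for illustration; the candidate order is an assumption.
    """
    for enc in candidates:
        try:
            with open(path, encoding=enc) as fh:
                # Iterate line by line so a very large file is not held in memory.
                for _ in fh:
                    pass
            return enc
        except UnicodeDecodeError:
            # This encoding cannot represent some byte in the file; try the next.
            continue
    raise ValueError(f"none of {candidates} could decode {path}")
```

The result can then be passed on, for example `pd.read_csv("movies.csv", encoding=first_working_encoding("movies.csv"))`. Keeping 'latin1' last makes it a fallback rather than the first guess, because it accepts anything.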
