import csv
import pandas as pd
db = input("Enter the dataset name:")
table = db+".csv"
df = pd.read_csv(table)
df = df.sample(frac=1).reset_index(drop=True)
with open(table,'rb') as f:
    data = csv.reader(f)
    for row in data:
        rows = row
        break
print(rows)

I am trying to read all the columns from the CSV file, but I get the following error.

ERROR: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 15: invalid start byte

1 Answer


You need to check the encoding of your CSV file.

To see which encoding Python uses by default when it opens the file, print the file object:

with open('file_name.csv') as f:
    print(f)

The output will be:

<_io.TextIOWrapper name='file_name.csv' mode='r' encoding='utf8'>

Open the CSV with an explicit encoding. Note that the value shown above is only the default Python picks for your platform, not one detected from the file:

with open(fname, "rt", encoding="utf8") as f:
    ...

As mentioned in the comments, your file is encoded as cp1252, so:

with open(fname, "rt", encoding="cp1252") as f:
    ...

And for pd.read_csv:

df = pd.read_csv(table, encoding='cp1252')
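
As a rough sketch of the same idea with the csv module from the question, the helper below tries a list of candidate encodings in order and returns the header row with the first one that decodes. The function name read_header, the candidate list, and the sample file are assumptions made for illustration; they are not part of the original code.

```python
import csv
import os
import tempfile


def read_header(path, encodings=("utf-8", "cp1252")):
    """Return (header_row, encoding_used), trying each candidate encoding in order."""
    for enc in encodings:
        try:
            # Text mode with newline="" is what the csv module expects in Python 3.
            with open(path, "r", encoding=enc, newline="") as f:
                return next(csv.reader(f), []), enc
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
    raise ValueError(f"none of {encodings} could decode {path!r}")


# Hypothetical sample file: an en dash is the single byte 0x96 in cp1252,
# the same byte reported in the traceback above.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")
    with open(path, "wb") as f:
        f.write("id,price \u2013 usd\n1,10\n".encode("cp1252"))
    header, used = read_header(path)

print(header, used)
```

Keep in mind that cp1252 maps almost every byte to some character, so a successful decode does not prove the guess is right. It is worth checking the resulting text by eye.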

5 Comments

Hello! Thanks for responding. It shows the encoding as "cp1252". I then passed encoding='cp1252' when opening the CSV, but it didn't work.
@harshavardhan Open it like this: with open(fname, "rt", encoding="cp1252") as f: If that solves your issue, don't forget to accept the answer.
It is throwing an error at line 5 of the code, df = pd.read_csv(table): Traceback (most recent call last): File "stack.py", line 5, in <module> df = pd.read_csv(table)
@harshavardhan Use df = pd.read_csv(table, encoding='cp1252')
Thank you! It's working.
