import csv
import pandas as pd

db = input("Enter the dataset name:")
table = db+".csv"
df = pd.read_csv(table)
df = df.sample(frac=1).reset_index(drop=True)
with open(table,'rb') as f:
    data = csv.reader(f)
    for row in data:
        rows = row
        break
print(rows)

I am trying to read all the columns from the csv file.

ERROR: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 15: invalid start byte

1 Answer

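You need to check the encoding of your CSV file.

For that you can print the file object, which shows the encoding Python applies when opening the file (the platform default unless you pass one explicitly):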

with open('file_name.csv') as f:
    print(f)

The output will be:

<_io.TextIOWrapper name='file_name.csv' mode='r' encoding='utf8'> 

Open the CSV with the encoding shown in the output above:

with open(fname, "rt", encoding="utf8") as f: 
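If the reported default does not match the file's real encoding, one rough way to narrow it down is to try decoding the raw bytes with a few likely candidates. This is a minimal sketch, not a detector: the helper name `first_working_encoding`, the candidate list, and the sample file are all assumptions made for illustration, and a decode that succeeds does not prove the encoding is the right one.

```python
import os
import tempfile


def first_working_encoding(path, candidates=("utf-8", "cp1252")):
    """Return the first candidate that decodes the whole file, or None.

    Hypothetical helper: the candidate list is an assumption, not a standard.
    """
    with open(path, "rb") as f:
        raw = f.read()
    for enc in candidates:
        try:
            raw.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    return None


# Build a small sample file containing byte 0x96, the byte from the error message.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.csv")
    with open(sample_path, "wb") as f:
        f.write(b"name,range\nwidget,10 \x96 20\n")
    detected = first_working_encoding(sample_path)
    print(detected)
```

Byte 0x96 cannot start a UTF-8 sequence, so the UTF-8 attempt fails and the helper falls through to cp1252, where 0x96 is an en dash.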

As mentioned in the comments, your encoding is cp1252, so:

with open(fname, "rt", encoding="cp1252") as f: ... 

and for pd.read_csv:

df = pd.read_csv(table, encoding='cp1252') 
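Since the goal in the question is to read the column names, here is a minimal sketch of doing that with the csv module alone. It assumes the file really is cp1252; the helper name `read_header` and the sample file are hypothetical. Note that in Python 3 `csv.reader` expects a file opened in text mode, not `'rb'`.

```python
import csv
import os
import tempfile


def read_header(path, encoding="cp1252"):
    """Return the first row of a CSV file as a list of column names.

    Hypothetical helper; the cp1252 default is an assumption taken from this thread.
    """
    # Text mode with an explicit encoding; newline="" is what the csv docs recommend.
    with open(path, "rt", encoding=encoding, newline="") as f:
        return next(csv.reader(f), [])


# Sample file whose header contains byte 0x96 (an en dash in cp1252).
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.csv")
    with open(sample_path, "wb") as f:
        f.write(b"name,price \x96 usd\nwidget,10\n")
    header = read_header(sample_path)
    print(header)
```

If the data is already loaded with pandas as shown above, `list(df.columns)` should give the same names without reopening the file.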

5 Comments

Hello! Thanks for responding. It is showing encoding as "cp1252". Then I placed encoding = 'cp1252' while opening csv, but it didn't work.
@harshavardhan Open like this with open(fname, "rt", encoding="cp1252") as f: If it solved your issue, don't forget to accept.
It is throwing an error at line 5 of the code, df = pd.read_csv(table): Traceback (most recent call last): File "stack.py", line 5, in <module> df = pd.read_csv(table)
@harshavardhan use df = pd.read_csv(table, encoding='cp1252')
Thank you! It's Working.
