Pandas read_csv does not separate values after comma

Question

I am trying to load some .csv data in a Jupyter notebook, but for some reason it does not separate my data and instead puts everything in a single column.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.read_csv(r'C:\Users\leonm\Documents\Fontys\Semester 4\GriefBot_PracticeChallenge\DummyDataGriefbot.csv')
df.head()

My csv data

In this picture there is the data I am using.

I do not understand why my values all end up in a single column and are not separated where the commas are. I have also tried both sep=',' and sep=';', but neither changes anything.

This is what I am getting. I would really appreciate your help.

Looks like a duplicate of this question. Good first question though; the code and output are always appreciated.
– Dbercules, Commented Feb 26, 2021 at 14:56
Post the first few lines of your CSV as text and not as an image. I'm guessing each list is quoted. Even your Excel reads it in as one column.
– user5386938, Commented Feb 26, 2021 at 15:10
"list" should have been "line". I'm guessing each line is quoted.
– user5386938, Commented Feb 26, 2021 at 15:16
Please edit your question to include your CSV file data as text. Use a text editor (not Excel) to open it.
– Martin Evans, Commented Feb 26, 2021 at 15:44

Davis · Accepted Answer · 2021-02-26 15:42:13Z

If that's how your data looks in a CSV reader like Excel, then each row likely looks like one big string in a text editor.

"ID,PERSON,DATE"
"1,A. Molina,1593147221"
"2,A. Moran, 16456"
"3,Action Marquez,15436"

You could of course do "text to columns" within Excel and resave your file, or if you have many of these files, you can use the Pandas split function.
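Another option, not part of the original answer, is to strip the wrapping quotes from each line before pandas parses the text. The sketch below is a minimal illustration under stated assumptions: the `raw` string is made-up sample text mirroring the rows shown above, and it assumes no field in the real file legitimately needs quoting (for example, a name containing a comma).

```python
import io
import pandas as pd

# Assumed sample text mirroring the rows above: every line is wrapped in double quotes.
raw = (
    '"ID,PERSON,DATE"\n'
    '"1,A. Molina,1593147221"\n'
    '"2,A. Moran, 16456"\n'
    '"3,Action Marquez,15436"\n'
)

# Parsed as-is, each quoted line is one field, so pandas finds a single column.
one_col = pd.read_csv(io.StringIO(raw))
print(one_col.shape)  # (3, 1)

# Strip surrounding whitespace and the wrapping quotes from each line, then parse normally.
cleaned = "\n".join(line.strip().strip('"') for line in raw.splitlines())

# skipinitialspace=True ignores the stray space after a comma (as in ", 16456").
df = pd.read_csv(io.StringIO(cleaned), skipinitialspace=True)
print(df)
```

For a file on disk, the same idea should apply by reading the lines with `open(...)` instead of using an in-memory string. This also explains why changing `sep` has no effect: the quotes, not the delimiter, are what make each line a single field.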

# df.columns[0].split(',') splits the header string by comma into the new column names.
# df.iloc[:, 0] selects the first column of data; .str.split(',', expand=True)
# creates a list split by comma and expands each list entry into a new column.
df[df.columns[0].split(',')] = df.iloc[:, 0].str.split(',', expand=True)
df.head()

>            ID,PERSON,DATE ID          PERSON        DATE
> 0  1,A. Molina,1593147221  1       A. Molina  1593147221
> 1       2,A. Moran, 16456  2        A. Moran       16456
> 2  3,Action Marquez,15436  3  Action Marquez       15436

You can then use df.drop to drop that first column if you have no use for it.

df.drop(df.columns[0], axis=1, inplace=True)
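One thing worth noting about the split approach is that every resulting value is still a string, including the numbers. The following is a self-contained sketch of the split-and-drop steps with an explicit type conversion added at the end; the `raw` sample text and the choice of `int64` for `ID` and `DATE` are assumptions for illustration, not something stated in the question.

```python
import io
import pandas as pd

# Assumed sample text mirroring the answer's example rows.
raw = (
    '"ID,PERSON,DATE"\n'
    '"1,A. Molina,1593147221"\n'
    '"2,A. Moran, 16456"\n'
    '"3,Action Marquez,15436"\n'
)

df = pd.read_csv(io.StringIO(raw))   # one column named "ID,PERSON,DATE"
first = df.columns[0]

# Split the header into column names and each row into separate columns.
df[first.split(',')] = df[first].str.split(',', expand=True)

# Drop the original combined column.
df = df.drop(columns=first)

# str.split leaves strings behind, so convert the numeric columns explicitly.
# .str.strip() removes the stray space in " 16456" before converting.
df['ID'] = df['ID'].str.strip().astype('int64')
df['DATE'] = df['DATE'].str.strip().astype('int64')

print(df.dtypes)
```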

Thank you very much. I was able to clean my data with "text to columns" in Excel. My data looked exactly like you showed, where every row has two quotation marks around it.
