I am trying to load some .csv data in the Jupyter notebook but for some reason, it does not separate my data but puts everything in a single column.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.read_csv(r'C:\Users\leonm\Documents\Fontys\Semester 4\GriefBot_PracticeChallenge\DummyDataGriefbot.csv')
df.head()

My CSV data (screenshot): this picture shows the data I am using.

Now I do not understand why my values all end up in a single column instead of being separated at the commas. I have also tried both sep=',' and sep=';', but neither changes anything.

This is what I am getting (screenshot). I would really appreciate your help.

  • Looks like a duplicate of this question. Good first question though; the code and output are always appreciated. Commented Feb 26, 2021 at 14:56
  • Post the first few lines of your CSV as text and not as an image. I'm guessing each list is quoted. Even your Excel reads it in as one column. Commented Feb 26, 2021 at 15:10
  • "list" should have been "line". I'm guessing each line is quoted. Commented Feb 26, 2021 at 15:16
  • I think you need to pass the header parameter. Commented Feb 26, 2021 at 15:18
  • Please edit your question to include your CSV file data as text. Use a text editor (not Excel) to open it. Commented Feb 26, 2021 at 15:44

1 Answer

If that's how your data looks in a CSV reader like Excel, then each row likely looks like one big string in a text editor.

"ID,PERSON,DATE"
"1,A. Molina,1593147221"
"2,A. Moran, 16456"
"3,Action Marquez,15436"

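One way to confirm that guess without Excel is to look at the raw text and ask the standard csv module how many fields it sees per line. The following is only a sketch: the helper names peek_lines and is_single_quoted_row and the temporary demo.csv file are made up for illustration, and the sample rows are the ones shown above.

```python
import csv
import os
import tempfile
from itertools import islice


def peek_lines(path, n=5, encoding="utf-8"):
    """Return the first n raw lines of a file, with no CSV parsing at all."""
    with open(path, encoding=encoding, newline="") as fh:
        return [line.rstrip("\r\n") for line in islice(fh, n)]


def is_single_quoted_row(line):
    """True if the csv module parses the line as ONE field that still contains commas."""
    fields = next(csv.reader([line]))
    return len(fields) == 1 and "," in fields[0]


# Hypothetical demo file that mimics the suspected layout.
sample = '"ID,PERSON,DATE"\n"1,A. Molina,1593147221"\n'
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", encoding="utf-8") as fh:
        fh.write(sample)
    first_lines = peek_lines(demo_path)

for line in first_lines:
    print(repr(line), is_single_quoted_row(line))
```

If every line is reported as a single quoted field, changing sep will not help, because the quotes tell the parser to treat the commas inside them as ordinary text.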
You could of course use "Text to Columns" in Excel and re-save your file, or, if you have many of these files, you can use the pandas str.split method.

# Left side: split the single header string by comma to get the new column names.
# Right side: select the first column of data, split each string by comma,
# and expand each list entry into a new column.
df[df.columns[0].split(',')] = df.iloc[:,0].str.split(',', expand=True)
df.head()

>    ID,PERSON,DATE          ID  PERSON          DATE
> 0  1,A. Molina,1593147221  1   A. Molina       1593147221
> 1  2,A. Moran, 16456       2   A. Moran        16456
> 2  3,Action Marquez,15436  3   Action Marquez  15436

You can then use df.drop (a DataFrame method, not a top-level pd function) to drop that first column if you have no use for it:

df.drop(df.columns[0], axis=1, inplace=True) 
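For anyone who wants to try this without the original file, here is a self-contained sketch that rebuilds the sample rows in memory and applies the same split. It is a variation on the approach above, not a replacement for it: it builds a new frame (called parts here, a name chosen for illustration) instead of assigning into df, and it additionally strips stray spaces and converts ID and DATE to numbers, which is an assumption about what the data should look like.

```python
import io

import pandas as pd

# Sample rows from this answer; each line is one quoted string.
raw = (
    '"ID,PERSON,DATE"\n'
    '"1,A. Molina,1593147221"\n'
    '"2,A. Moran, 16456"\n'
    '"3,Action Marquez,15436"\n'
)

df = pd.read_csv(io.StringIO(raw))
# Because of the quotes, pandas sees a single column named "ID,PERSON,DATE".
combined = df.columns[0]

# Split every row on commas and expand the pieces into separate columns.
parts = df[combined].str.split(",", expand=True)
parts.columns = combined.split(",")

# Optional cleanup (an assumption): trim whitespace, then make ID and DATE numeric.
parts = parts.apply(lambda col: col.str.strip())
parts["ID"] = pd.to_numeric(parts["ID"])
parts["DATE"] = pd.to_numeric(parts["DATE"])

print(parts)
```

Building a separate frame avoids the drop step altogether, since the combined column is never copied into parts.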

1 Comment

Thank you very much! I was able to clean my data with Text to Columns in Excel. My data looked exactly like you showed, where every row has 2 semi-columns around it.
