One column of my pandas DataFrame holds a list in each row, and when I write the DataFrame to CSV, the commas inside those lists are removed.

Code to reproduce:

import numpy as np
import pandas as pd

def to_vector(probs, num_classes):
    vec = np.zeros(num_classes)
    for i in probs:
        vec[i] = 1
    return vec

l1 = [[[1,5]],[[2,4]]]
num = 10
a = pd.DataFrame(l1, columns=['dep'])
a['Y_dept'] = a["dep"].apply(lambda x: to_vector(x, num))
a.to_csv('a_temp.csv', index=False)

But when I read the same file back, the commas inside the Y_dept column are missing:

b = pd.read_csv('a_temp.csv')
b.head()

      dep                           Y_dept
0  [1, 5]  [0. 1. 0. 0. 0. 1. 0. 0. 0. 0.]
1  [2, 4]  [0. 0. 1. 0. 1. 0. 0. 0. 0. 0.]

Expected Output:

      dep                                             Y_dept
0  [1, 5]  [0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...
1  [2, 4]  [0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...

Passing quoting=csv.QUOTE_ALL to to_csv does not help. Version: pandas==0.25.3

1 Answer

If you convert the NumPy array to a list, you will get the desired result. By default, a NumPy array is not displayed with commas: its string form separates the elements with spaces, and that string is what ends up in the CSV cell. The data as stored in memory does not use or need commas; they appear only in the text representation.
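To see the difference in the two text forms directly, here is a minimal sketch. It assumes, as far as I can tell from the output in the question, that to_csv writes the string form of each object cell. The variable names are only for illustration.

```python
import numpy as np

vec = np.zeros(4)
vec[1] = 1

# String form of an ndarray: elements separated by spaces, no commas.
as_array_text = str(vec)

# String form of a plain Python list of floats: comma-separated.
# tolist() yields built-in floats rather than NumPy scalars.
as_list_text = str(vec.tolist())

print(as_array_text)
print(as_list_text)
```

The first line printed has no commas, the second one does, which matches what shows up in the CSV in each case.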

import numpy as np
import pandas as pd

def to_vector(probs, num_classes):
    vec = np.zeros(num_classes)
    for i in probs:
        vec[i] = 1
    # tolist() returns a list of plain Python floats; list(vec) would keep
    # NumPy scalars, which newer NumPy versions print as np.float64(0.0)
    return vec.tolist()

l1 = [[[1,5]],[[2,4]]]
num = 10
a = pd.DataFrame(l1, columns=['dep'])
a['Y_dept'] = a["dep"].apply(lambda x: to_vector(x, num))
a.to_csv('a_temp.csv', index=False)
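Note that after reading the file back, each cell is still a string that merely looks like a list. If you need real lists again, one option is to parse the cells while reading. The sketch below is an assumption about how you might do that, not part of the original code: it uses ast.literal_eval through the converters argument of read_csv, and an in-memory buffer instead of a_temp.csv so that it is self-contained.

```python
import ast
import io

import numpy as np
import pandas as pd

def to_vector(indices, num_classes):
    vec = np.zeros(num_classes)
    for i in indices:
        vec[i] = 1
    return vec.tolist()  # plain Python floats, written with commas

a = pd.DataFrame({"dep": [[1, 5], [2, 4]]})
a["Y_dept"] = a["dep"].apply(lambda x: to_vector(x, 10))

# Write to an in-memory buffer (stands in for the CSV file).
buffer = io.StringIO()
a.to_csv(buffer, index=False)
buffer.seek(0)

# Parse the list-like strings back into Python lists while reading.
b = pd.read_csv(
    buffer,
    converters={"dep": ast.literal_eval, "Y_dept": ast.literal_eval},
)

first = b.loc[0, "Y_dept"]
print(type(first), first)
```

ast.literal_eval only accepts Python literals, so it is a safer choice here than eval. Wrap the result in np.array(...) if you need an array again.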
1 Comment

Thank you, the issue is solved. However, the file saved with lists is almost 3-4 times larger than the one saved with NumPy arrays. I can get on with my work, but it would be helpful if there were a way to save the column as a NumPy array and still have commas when it is reloaded. Thanks!
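One possible way to keep the file smaller, offered only as a sketch: if the vectors contain nothing but 0 and 1, as in the question, each one can be written as a compact comma-separated string of integers and turned back into a NumPy array after loading. encode_vector and decode_vector are hypothetical helper names, not pandas or NumPy functions.

```python
import io

import numpy as np
import pandas as pd

def encode_vector(vec):
    # Assumes every entry is 0 or 1, so casting to int loses nothing.
    return ",".join(str(v) for v in vec.astype(int).tolist())

def decode_vector(text):
    # Rebuild a float array from the "0,1,0,..." string.
    return np.array([float(v) for v in text.split(",")])

vec = np.zeros(10)
vec[[1, 5]] = 1

a = pd.DataFrame({"Y_dept": [encode_vector(vec)]})

# In-memory buffer used here in place of a CSV file on disk.
buffer = io.StringIO()
a.to_csv(buffer, index=False)
buffer.seek(0)

# Read the column as text, then decode each cell into an array.
b = pd.read_csv(buffer, dtype={"Y_dept": str})
b["Y_dept"] = b["Y_dept"].apply(decode_vector)

restored = b.loc[0, "Y_dept"]
print(restored)
```

This writes about two characters per element, compared with roughly three for the array form and five for the list-of-floats form. For large numeric columns a binary format may be a better fit than CSV, but that goes beyond what was asked here.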