
How to prevent integers from being converted to floats when converting a data frame to a list?

I have a .csv file with 5 columns of data. The first four columns have no decimal points, while the last column does.

When I import this data into my script using "pd.read_csv", the data imports correctly, with the first 4 numbers as integers and the last as a float, like this:

1,1,10,0,1.0
1,1,11,0,0.6
1,1,12,0,0.0

But I need to convert this data into a list, and when I do, all the numbers are converted to floats. I do not want this. The first four values need to be integers.

This is my current code. After the data is converted to a list, every number in the list is a float:

import pandas as pd

data_file_name = r'C:\Users\username\Desktop\FileName.csv'
data = pd.read_csv(data_file_name)  # This part works and the dtypes are correct: the first 4 columns are integers
data2 = data.values.tolist()  # Here everything gets converted to float, even columns defined as int in the df

This results in a list with the data formatted like this:

[[1.0, 1.0, 10.0, 0.0, 1.0], [1.0, 1.0, 11.0, 0.0, 0.6], [1.0, 1.0, 12.0, 0.0, 0.0]] 

When I need it to be formatted like this:

[[1, 1, 10, 0, 1.0], [1, 1, 11, 0, 0.6], [1, 1, 12, 0, 0.0]] 

What can I do?

I've tried:

[int(i,10) for i in data]

But this returns this error:

ValueError: invalid literal for int() with base 10: 'Month' 
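
For context on that error, here is a minimal sketch of why the comprehension fails. Iterating over a DataFrame yields its column labels rather than its values, so int() is handed the first header string. The in-memory CSV and every column name except 'Month' (which comes from the error message) are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file; only the 'Month' label is taken
# from the error message, the other column names are invented.
csv_text = "Month,b,c,d,e\n1,1,10,0,1.0\n1,1,11,0,0.6\n1,1,12,0,0.0\n"
df = pd.read_csv(io.StringIO(csv_text))

# Iterating a DataFrame walks over its column labels, not its rows.
labels = [label for label in df]
print(labels)

# So int(i, 10) receives the string 'Month' and raises ValueError.
error_message = None
try:
    [int(label, 10) for label in df]
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

So the attempt never reaches the numeric data at all; it stops at the first header.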
  • Look at the dtype argument in pd.read_csv Commented Sep 27, 2019 at 21:32
  • @Prune How does that duplicate help with read_csv? Commented Sep 27, 2019 at 21:35
  • The conversion logic is the critical part. Commented Sep 27, 2019 at 21:38
  • I've added in a new step I left out - the csv reading is working fine and the datatypes there are correct, but a decimal is added to the list even for the columns that are defined as ints in the df. Commented Sep 27, 2019 at 22:01
  • I found the answer to my question here, and this question also is asking exactly what I intended to ask: stackoverflow.com/questions/34838378/… Commented Sep 30, 2019 at 19:15
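
As a rough sketch of what is going on, and not necessarily the approach taken in the linked question: .values packs the whole frame into a single NumPy array, and a single array has one dtype, so mixed int64 and float64 columns are upcast to float64 before tolist() ever runs. One possible workaround is to cast the frame to object first so each cell keeps its own Python type. The in-memory CSV and the column names below are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file; the column names are invented.
csv_text = "Month,b,c,d,e\n1,1,10,0,1.0\n1,1,11,0,0.6\n1,1,12,0,0.0\n"
df = pd.read_csv(io.StringIO(csv_text))

# Per-column dtypes are correct after reading: four int64 columns, one float64.
print(df.dtypes)

# .values builds one array for the whole frame, so the ints are upcast.
print(df.values.dtype)  # float64

# Casting to object first keeps each cell as its own Python int or float.
rows = df.astype(object).values.tolist()
print(rows)
```

Building the rows from df.itertuples(index=False) should be another way to avoid the shared array, at the cost of a Python-level loop over the rows.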

1 Answer


Use the dtype argument to control the datatypes.

pd.read_csv(data_file_name, dtype={0: "int64", 1: "int64", 2: "int64", 3: "int64", 4: "float64"}) 

3 Comments

Thank you, but this does not solve my issue. I've updated the question to be clearer. My data imports just fine, with the first 4 numbers importing as integers and the last as a float. It's only once this data is converted into a list that all the numbers are converted to floats.
Thanks! I found that answer too, and I added it in the comment above. That answered my question.
