Sorting by column in a CSV and writing to a new CSV file in Python

Question

My code:

    import csv
    import operator

    first_csv_file = open('/Users/jawadmrahman/Downloads/account-cleanup-3 array/example.csv', 'r+')
    csv_sort = csv.reader(first_csv_file, delimiter=',')
    sort = sorted(csv_sort, key=operator.itemgetter(0))
    sorted_csv_file = open('new_sorted2.csv', 'w+', newline='')
    write = csv.writer(sorted_csv_file)
    for eachline in sort:
        print (eachline)
        write.writerows(eachline)

I have an example csv file:

I want to sort by the first column and get the results in this fashion:

    1,9
    2,17
    3,4
    7,10

With the code posted above, this is how I am getting it now:

How do I fix this?

Is , supposed to represent a decimal point in this context?
– Ben Grossmann, Commented Jan 7, 2022 at 18:33
The pandas package is the most comprehensive and well-supported package for manipulating tabular data such as CSVs. Read, sort, and save should be about 3 lines of code in pandas. See stackoverflow.com/questions/37787698/… and stackoverflow.com/questions/14365542/…
– David Parks, Commented Jan 7, 2022 at 18:35
eachline is itself a list, so write.writerows(eachline) produces two rows for every eachline. Try write.writerow(eachline). While you are at it, I encourage you to look at what the with keyword used with open() does for you. It will clean up your code substantially.
– JonSG, Commented Jan 7, 2022 at 19:07
Please do not include images of data. Please edit your question and include your input CSV and desired output CSV as text.
– Zach Young, Commented Jan 7, 2022 at 19:12

Zach Young · Accepted Answer · 2022-01-07 21:36:13Z

As JonSG pointed out in the comments to your original post, you're calling writerows() (plural) on a single row, eachline.

Change that last line to write.writerow(eachline) and you'll be good.
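For reference, here is a minimal sketch of the corrected script, also using the with statement JonSG suggested. The function name sort_csv_by_first_column and the file names in the usage note are placeholders, not from the original post. It sorts the first column as text, exactly as the original itemgetter(0) call does.

```python
import csv
import operator


def sort_csv_by_first_column(src_path, dst_path):
    """Read src_path, sort its rows by the first column, write them to dst_path."""
    # newline='' lets the csv module handle line endings itself.
    with open(src_path, newline='') as src:
        rows = sorted(csv.reader(src), key=operator.itemgetter(0))

    with open(dst_path, 'w', newline='') as dst:
        writer = csv.writer(dst)
        for row in rows:
            writer.writerow(row)  # singular: one call writes one row
    return rows
```

Calling sort_csv_by_first_column('example.csv', 'new_sorted2.csv') would then write the sorted rows, and both files are closed automatically when each with block ends.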

Looking at the problem in depth

writerows() expects "a list of a list of values". The outer list contains the rows, and each inner list holds the cells (column values) for one row:

    sort = [
        ['1', '9'],
        ['2', '17'],
        ['3', '4'],
        ['7', '10'],
    ]
    writer.writerows(sort)

will produce the sorted CSV with two columns and four rows that you expect (and your print statement shows).
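To check this without touching the disk, the same call can be run against an in-memory buffer. This is only a sketch: io.StringIO stands in for the real output file, and the name buffer is an assumption made for illustration.

```python
import csv
import io

sort = [
    ['1', '9'],
    ['2', '17'],
    ['3', '4'],
    ['7', '10'],
]

buffer = io.StringIO()  # in-memory stand-in for the output file
writer = csv.writer(buffer)
writer.writerows(sort)  # plural: one call writes every row

# Four lines, two columns each: 1,9 / 2,17 / 3,4 / 7,10
print(buffer.getvalue())
```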

When you call writerows() with a single row:

    for eachline in sort:
        writer.writerows(eachline)

you get some really weird output:

it interprets eachline as the outer list containing a number of rows, which means...
it interprets each item in eachline as a row having individual columns...
and each item in eachline is a Python sequence (a string), so writerows() iterates over each character in that string, treating each character as its own column...

['1', '9'] is seen as two single-column rows, ['1'] and ['9']:
```
1
9
```
['2', '17'] is seen as the single-column row ['2'] and the double-column row ['1', '7']:
```
2
1,7
```
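The behavior described above can be reproduced with a short sketch. It uses io.StringIO as a stand-in for the output file and only the first two rows of the example data; the names buffer and lines are assumptions made for illustration.

```python
import csv
import io

sort = [['1', '9'], ['2', '17']]

buffer = io.StringIO()  # in-memory stand-in for the output file
writer = csv.writer(buffer)
for eachline in sort:
    # Bug being demonstrated: writerows() treats each string in eachline
    # as a row, and each character of that string as a column.
    writer.writerows(eachline)

lines = buffer.getvalue().splitlines()
print(lines)  # ['1', '9', '2', '1,7']
```

Swapping writerows(eachline) for writerow(eachline) in the loop gives the two expected lines, 1,9 and 2,17.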
