How to compare two csv files?

Question

I have a csv file1 which is like

FLAGSTAFF AZ 50244.67 5.02 KA1_Podium_Garage_S FLAGSTAFF AZ 33752.13 3.38 KA1_Podium_Garage_S FLAGSTAFF AZ 11965.5 1.2 KA1_Podium_Garage_S FLAGSTAFF AZ 3966.48 0.4 KA1_Podium_Garage_S SEATTLE WA 12646.9 1.26 KA1_Podium_Garage_S SEATTLE WA 225053.92 22.51 KA1_Podium_Garage_S SEATTLE WA 23974.3 2.4 KA1_Podium_Garage_S SEATTLE WA 7036.4 0.7 KA1_Podium_Garage_S SEATTLE WA 3021.93 0.3 KA1_Podium_Garage_S

I have a csv file 2 which is like ,

Alabama AL 1 Alaska AK 2 Arizona AZ 4 Arkansas AR 5 California CA 6 Colorado CO 8 Connecticut CT 9 SEATTLE WA 53

Now i have to append the third column value from csv file2 to csv file1, by comparing second column

For example it should look like,

AZ code is 4 WA code is 53 wherever AZ,WA is there in my csv file1 , the code should get appended into a column

My output should look like,

FLAGSTAFF AZ 50244.67 5.02 KA1_Podium_Garage_S 4 FLAGSTAFF AZ 33752.13 3.38 KA1_Podium_Garage_S 4 FLAGSTAFF AZ 11965.5 1.2 KA1_Podium_Garage_S 4 FLAGSTAFF AZ 3966.48 0.4 KA1_Podium_Garage_S 4 SEATTLE WA 12646.9 1.26 KA1_Podium_Garage_S 53 SEATTLE WA 225053.92 22.51 KA1_Podium_Garage_S 53 SEATTLE WA 23974.3 2.4 KA1_Podium_Garage_S 53 SEATTLE WA 7036.4 0.7 KA1_Podium_Garage_S 53 SEATTLE WA 3021.93 0.3 KA1_Podium_Garage_S 53

Here is the code which i tried,

with open("/home/sumit/Desktop/CSV_FILE1.csv", "r") as f: first = {rows[1]: rows[0:] for rows in list(csv.reader(f))} # compare second csv and append Code with open("CSVFILE2.csv", "r") as f1: for row in csv.reader(f1): if row[1] in first: first[row[1]].append(row[2]) # convert dict back to list merged = [(k,) + tuple(v) for k, v in first.items()] # write list to output csv with open("output.csv", "w") as f1: csv.writer(f1).writerows(merged)

Output am getting like,

AZ FLAGSTAFF AZ 44230.4 4.42 KA1_Podium_Garage_S 4 WA SEATTLE WA 45329.3 4.53 KA1_Podium_Garage_S 53

a dict can only have one value for each key; if you try to add a key multiple times with different values, it will only take the latest one — R Nar
– R Nar, Commented Dec 6, 2017 at 0:50
I suggest switching your logic: your second csv file seems to only have one instance of each state. make a dict with that csv where the keys: val pairs are the state: numbers then iterate through the first csv and append the values as necessary. — R Nar
– R Nar, Commented Dec 6, 2017 at 0:51

AJNeufeld · Accepted Answer · 2017-12-06 01:45:26Z

This line is creating a dictionary, based on state abbreviation, from your first file. This looks wrong, since a dictionary can only have each key once, and you have multiple rows with "AZ" for example.

first = { rows[1]: rows[0:] for rows in list(csv.reader(f)) }

Instead, you should make your dictionary based on the second file:

with open("CSVFILE2.csv", "r") as f1: code = { row[1]: row[2] for row in csv.reader(f1) }

Then, loop over your first file, and append the appropriate code to each row.

with open("/home/sumit/Desktop/CSV_FILE1.csv", "r") as f: merged = [ row + [code[row[1]]] for row in csv.reader(f) ]

And write out the merged data to the new file.

Explanation of row + [code[row[1]]]:

If row is ["FLAGSTAFF", "AZ", 50244.67, 5.02, "KA1_Podium_Garage_S"], then

row[1] is "AZ",
code[row[1]] is code["AZ"] which is 4,
[code[row[1]]] is the list [4]
and row + [code[row[1]] is list concatenation, producing the desired new row value: ["FLAGSTAFF", "AZ", 50244.67, 5.02, "KA1_Podium_Garage_S", 4]

Collectives™ on Stack Overflow

How to compare two csv files?

1 Answer 1

Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Related