I have 2 CSV files, each with 3 columns called 'num', 'date' and 'tex'.

File1

    num       date              tex
    20170512  12/05/2017 15:39  1001
    20170512  12/05/2017 15:39  1001
    20170908  08/09/2017 02:42  1001
    20170908  08/09/2017 06:30  1001

File 2

    num            date              tex
    201705332212   12/05/2017 15:39  1001
    20170523212    12/05/2017 15:39  100156
    2017232320908  08/09/2017 02:42  10012
    20170908       08/09/2017 06:30  1001

desired output

diff.csv

    num            date              tex
    201705332212   12/05/2017 15:39  1001
    20170523212    12/05/2017 15:39  100156
    2017232320908  08/09/2017 02:42  10012

I want to match on both columns 'num' and 'tex'. Currently my code only checks for differences across the whole file, not against the columns 'num' and 'tex'. Ideally, if 'num' and 'tex' in a row are different, I would like that row to be written to the out.csv file.

1 Answer

Use the csv module.

Ex:

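    import csv

    # newline="" lets the csv module handle line endings itself (Python 3)
    with open("file1.csv", newline="") as file_0, \
         open("file2.csv", newline="") as file_1, \
         open("out.csv", "w", newline="") as out_file:
        reader_0 = csv.reader(file_0, delimiter=";")
        reader_1 = csv.reader(file_1, delimiter=";")
        next(reader_0)  # Skip the header of file1
        out_file_writer = csv.writer(out_file, delimiter=";")
        out_file_writer.writerow(next(reader_1))  # Write the header from file2
        # Compare rows pairwise: k comes from file1, v from file2
        for k, v in zip(reader_0, reader_1):
            if (k[0] != v[0]) or (k[-1] != v[-1]):  # 'num' or 'tex' differs
                out_file_writer.writerow(v)  # Write the differing row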
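Note that zip compares the two files row by row, so a row is only checked against the row at the same position in the other file. If the rows may be in a different order, one alternative is to collect the ('num', 'tex') pairs of the first file in a set and keep every row of the second file whose pair is not in that set. The following is only a sketch under that assumption; the function name diff_by_key and the inline sample lines (using ";" as the delimiter, as in the answer) are made up for illustration.

```python
import csv


def diff_by_key(lines_1, lines_2, delimiter=";"):
    """Return (header, rows) with the rows of the second input whose
    (num, tex) pair does not occur anywhere in the first input.

    lines_1 and lines_2 can be open text files or any iterable of lines.
    """
    reader_1 = csv.reader(lines_1, delimiter=delimiter)
    reader_2 = csv.reader(lines_2, delimiter=delimiter)
    next(reader_1, None)           # skip the header of the first input
    header = next(reader_2, None)  # keep the header of the second input
    # First column is 'num', last column is 'tex'; blank rows are ignored.
    seen = {(row[0], row[-1]) for row in reader_1 if row}
    rows = [row for row in reader_2 if row and (row[0], row[-1]) not in seen]
    return header, rows


# Hypothetical sample data mirroring the question, with ";" as delimiter.
FILE1_LINES = [
    "num;date;tex",
    "20170512;12/05/2017 15:39;1001",
    "20170512;12/05/2017 15:39;1001",
    "20170908;08/09/2017 02:42;1001",
    "20170908;08/09/2017 06:30;1001",
]
FILE2_LINES = [
    "num;date;tex",
    "201705332212;12/05/2017 15:39;1001",
    "20170523212;12/05/2017 15:39;100156",
    "2017232320908;08/09/2017 02:42;10012",
    "20170908;08/09/2017 06:30;1001",
]

header, rows = diff_by_key(FILE1_LINES, FILE2_LINES)
for row in [header] + rows:
    print(";".join(row))
```

With real files, the two arguments would be file objects opened with newline="", and the returned rows could be passed to csv.writer(...).writerows().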

6 Comments

Yes, (k[0] != v[0]) ==> num & (k[-1] != v[-1]) ==> tex...You can test by printing k[0], v[0] and k[-1], v[-1]
open("out.csv", "wb") as out_file: ?
I used your code with open("out.csv", "wb") as out_file: and got the following error at out_file_writer.writerow(next(file_1)) #Writer Header: TypeError: a bytes-like object is required, not 'str'
In Python 3, use open("out.csv", "w", newline='') as out_file:
Looks like some of the columns are empty...check for emptiness first ...
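A possible way to do that emptiness check, sketched under the assumption that blank lines in either file should simply be skipped (the csv module returns an empty list for a blank line, so k[0] would raise an IndexError). The helper name differing_rows is hypothetical.

```python
def differing_rows(rows_0, rows_1):
    """Yield rows from rows_1 whose first or last field differs from the
    row at the same position in rows_0, skipping pairs with an empty row."""
    for k, v in zip(rows_0, rows_1):
        if not k or not v:
            continue  # one of the rows is empty: nothing to compare
        if k[0] != v[0] or k[-1] != v[-1]:
            yield v


# Small illustrative input: the second pair contains an empty row.
rows_a = [["1", "d", "a"], [], ["3", "d", "c"]]
rows_b = [["1", "d", "a"], ["2", "d", "b"], ["9", "d", "c"]]
result = list(differing_rows(rows_a, rows_b))
print(result)
```

The two arguments can be the csv.reader objects from the answer, after their headers have been consumed.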