How to compare two CSV files by column and save the differences in csv file using pandas python

Question

I have two csv files like the following

File1

x1 10.00 a1
x2 10.00 a2
x3 11.00 a1
x4 10.50 a2
x5 10.00 a3
x6 12.00 a3

File2

x1
x4
x5

I would like to create a new file that contains the values from the first column of File1 that do not appear in File2:

x2
x3
x6

using pandas or plain Python.

jezrael · Accepted Answer · 2019-03-31 01:09:46Z

Use Series.isin with ~ to keep only the values of the first column of df1 that do not exist in df2[0], selecting them with DataFrame.loc and boolean indexing:

    import pandas as pd

    # create DataFrame from first file
    df1 = pd.read_csv(file1, sep=";", header=None)
    print(df1)
        0     1   2
    0  x1  10.0  a1
    1  x2  10.0  a2
    2  x3  11.0  a1
    3  x4  10.5  a2
    4  x5  10.0  a3
    5  x6  12.0  a3

    # create DataFrame from second file
    df2 = pd.read_csv(file2, header=None, sep='|')
    print(df2)
        0
    0  x1
    1  x4
    2  x5

    s = df1.loc[~df1[0].isin(df2[0]), 0]
    print(s)
    1    x2
    2    x3
    5    x6
    Name: 0, dtype: object

    # write to file
    s.to_csv('new.csv', index=False, header=False)
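For anyone who wants to try the approach without creating files first, here is a minimal self-contained sketch. It assumes the first file is whitespace-separated with no header row, as the sample in the question appears to be; the in-memory `file1` and `file2` buffers and the `mask`, `missing` and `csv_text` names are illustrative stand-ins, not part of the original answer.

```python
import io

import pandas as pd

# In-memory stand-ins for the two files (assumption: whitespace-separated, no header row).
file1 = io.StringIO(
    "x1 10.00 a1\n"
    "x2 10.00 a2\n"
    "x3 11.00 a1\n"
    "x4 10.50 a2\n"
    "x5 10.00 a3\n"
    "x6 12.00 a3\n"
)
file2 = io.StringIO("x1\nx4\nx5\n")

df1 = pd.read_csv(file1, sep=r"\s+", header=None)
df2 = pd.read_csv(file2, header=None)

# True for rows of df1 whose first column does NOT appear in df2's first column.
mask = ~df1[0].isin(df2[0])
missing = df1.loc[mask, 0]
print(missing.tolist())  # ['x2', 'x3', 'x6']

# With no path given, to_csv returns the CSV text; pass a filename to write a file instead.
csv_text = missing.to_csv(index=False, header=False)
```

With real data, replace the two buffers with file paths and pass the `sep` that matches each file.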

I get an error tokenizing data: C error, expected 1 field, saw 2. This is what I run:

    import pandas as pd

    df1 = pd.read_csv('SKUS.csv', sep="\s+", header=None)
    df2 = pd.read_csv('retails_csv', header=None)
    s = df1.loc[~df1[0].isin(df2[0]), 0]
    s.to_csv('update.csv', index=False)
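That tokenizing error generally means some row contains more fields than the parser inferred from the first row. Which file and which row trigger it cannot be told from the snippet alone. Since only the first field of each line matters here, one way to sidestep the problem is plain Python. The sketch below assumes whitespace-separated fields; the helper names `first_fields` and `ids_only_in_first` are hypothetical, and the in-memory buffers stand in for the real files.

```python
import io


def first_fields(lines):
    """Yield the first whitespace-separated token of each non-empty line."""
    for line in lines:
        tokens = line.split()
        if tokens:
            yield tokens[0]


def ids_only_in_first(lines1, lines2):
    """Return first-column values of lines1 that never occur in lines2, in order."""
    seen = set(first_fields(lines2))
    return [key for key in first_fields(lines1) if key not in seen]


# Stand-ins for the two files; the second has a blank line and an extra field
# on one row to mimic inconsistent field counts (an assumption about the data).
skus = io.StringIO(
    "x1 10.00 a1\nx2 10.00 a2\nx3 11.00 a1\n"
    "x4 10.50 a2\nx5 10.00 a3\nx6 12.00 a3\n"
)
retail = io.StringIO("x1\nx4 extra\n\nx5\n")

result = ids_only_in_first(skus, retail)
print(result)  # ['x2', 'x3', 'x6']
```

With real files, pass open file objects instead of the buffers and write `result` out one value per line. If the files use another delimiter, such as `;` or `,`, change `line.split()` to split on that character.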
