I am trying to write a script that takes the CSV files under a specific directory, excludes only the rows that are also present in another CSV file, and writes the result to a new CSV file. The second file acts as a kind of exception rule.
For example, given the input and the exception file below:
inDirectory/input.csv:

    Id  Name  Location  Data  Services  Action
    10  John  IN        1234  mail      active
    12  Samy  GR        5678  phone     disable
    28  Doug  UK        9123  phone     active

excDirectory/exception.csv:

    12  Samy  GR        5678  phone     disable

I want the output to be:

outDirectory/output.csv:

    Id  Name  Location  Data  Services  Action
    10  John  IN        1234  mail      active
    28  Doug  UK        9123  phone     active

All I have been able to write so far is the code below, which is incomplete. I am looking for a solution that does this. Any ideas? I am very new to Python scripting.

    import pandas as pd

    inDir = os.listdir('csv_out_tmp')
    excFile = pd.read_csv('exclude/exception.csv', sep=',', index_col=0)

    for csv in inDir:
        inFile = pd.read_csv('csv_out_tmp/' + csv)
        diff = set(inFile)^set(excFile)
        df[diff].to_csv('csv_out/' + csv, index=False)

Here is another version I am writing, following @neotrinity's suggestion:

    inDir = os.listdir('csv_out_tmp')
    excFile = 'exclude/exception.csv'
    for csv in inDir:
        inFile = open('csv_out_tmp/' + csv)
        excRow = set(open(excFile))
        with open('csv_out/' + csv, 'w') as f:
            for row in open(inFile):
                if row not in excRow:
                    f.write(row)

With the above code I get the following error:

        for row in open(inFile):
    TypeError: coercing to Unicode: need string or buffer, file found
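
The TypeError appears to come from calling open() on inFile, which is already an open file object rather than a path string. A minimal sketch of the line-based approach with that fixed is below. The function name filter_rows and its parameters are hypothetical, and it assumes a row counts as an exception only when its whole line matches a line in the exception file (ignoring leading and trailing whitespace).

```python
import os

def filter_rows(in_dir, exc_path, out_dir):
    """Copy every file in in_dir to out_dir, skipping lines found in exc_path."""
    # Assumption: lines are compared as whole strings, stripped of
    # surrounding whitespace, so a missing final newline does not matter.
    with open(exc_path) as exc:
        excluded = set(line.strip() for line in exc if line.strip())

    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)

    for name in os.listdir(in_dir):
        src = os.path.join(in_dir, name)
        dst = os.path.join(out_dir, name)
        # open() takes the path string; the file object is iterated directly.
        with open(src) as fin, open(dst, 'w') as fout:
            for row in fin:
                if row.strip() not in excluded:
                    fout.write(row)
```

With the directory names from the question, this could be called as filter_rows('csv_out_tmp', 'exclude/exception.csv', 'csv_out'). Because the comparison is purely textual, a row that differs only in quoting or spacing between fields would not be treated as a match.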
set(InDir)^set(excFile) (for those who don't know what I shared, since I deleted it earlier; I had never used pandas before, so I didn't think it would be helpful)
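
A note on the pandas attempt: iterating over a DataFrame yields its column labels, so set(inFile)^set(excFile) compares headers rather than rows. One possible row-wise alternative is a left merge with indicator=True, keeping only the rows marked 'left_only'. The sketch below is not a definitive solution; the helper names drop_exceptions and filter_file are hypothetical, and it assumes the files are comma-separated and that exception.csv has no header row, as in the sample.

```python
import pandas as pd

def drop_exceptions(df, exc):
    """Return the rows of df that have no identical row in exc."""
    # With no 'on' argument, merge joins on all columns the frames share.
    # indicator=True adds a '_merge' column saying where each row was found.
    merged = df.merge(exc.drop_duplicates(), how='left', indicator=True)
    kept = merged[merged['_merge'] == 'left_only']
    return kept.drop(columns='_merge')

def filter_file(in_path, exc_path, out_path):
    """Read one input CSV, drop the exception rows, and write the result."""
    df = pd.read_csv(in_path)
    # Assumption: the exception file has no header, so reuse the input's columns.
    exc = pd.read_csv(exc_path, header=None, names=list(df.columns))
    drop_exceptions(df, exc).to_csv(out_path, index=False)
```

Inside the loop over os.listdir('csv_out_tmp'), this could be used as filter_file('csv_out_tmp/' + csv, 'exclude/exception.csv', 'csv_out/' + csv). The merge only matches rows whose values are equal in every shared column, so the column types in both files need to agree.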