
I am comparing two Excel files by looking up a column value from one file in the other; if the value is not present in the other file, the whole row should be written to a text file.

My Excel files are very large: they contain about 290,000 rows.

Here is what I have tried:

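    import sys
    import pandas as pd

    # Redirect print() output to out.txt
    orig_stdout = sys.stdout
    f = open('out.txt', 'w')
    sys.stdout = f

    df0 = pd.ExcelFile('1.xlsx').parse('Sheet1')
    df1 = pd.ExcelFile('v2.xlsx').parse('Sheet1')

    # Rows of the first file whose initial_id does not appear in the second file
    print(df0[~df0['initial_id'].isin(df1['initial_id'])])

    # Restore stdout and close the output file
    sys.stdout = orig_stdout
    f.close()
    print('Done.')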

The goal is to compare each value in the initial_id column and, if it is not present in the second Excel file, print the whole row from the first file to the output file.

Actual Result

    21 EXCLAMATION MARK A1 INVERTED EXCLAMATION MARK
    22 QUOTATION MARK A2 CENT SIGN
    23 NUMBER SIGN A3 POUND SIGN
    24 DOLLAR SIGN A4 CURRENCY SIGN
    25 PERCENT SIGN A5 YEN SIGN
    26 AMPERSAND A6 BROKEN BAR
    27 APOSTROPHE A7 SECTION SIGN
    ... ... ... ...
    3159 DIGIT NINE B9 SUPERSCRIPT ONE
    3160 COLON BA MASCULINE ORDINAL INDICATOR
    3161 SEMICOLON BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
    3162 LESS-THAN SIGN BC VULGAR FRACTION ONE QUARTER
    3163 EQUALS SIGN BD VULGAR FRACTION ONE HALF

Expected Result

The rows missing after 27 (shown as `...` above) should also be written to the file. If holding everything at once consumes too much RAM, splitting the output into part files would also work.
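To make the expected behaviour concrete, here is a minimal sketch of what I am after. It assumes the `...` lines come from `print()` abbreviating a long DataFrame, and that writing with `DataFrame.to_csv` instead would emit every row. The function name `write_missing_rows`, the `_partN.txt` file naming, the tab separator, and the `chunk_size` default are all hypothetical choices for illustration, not something taken from my real setup.

```python
import pandas as pd


def write_missing_rows(df0, df1, key, out_prefix, chunk_size=100_000):
    """Write rows of df0 whose `key` value is absent from df1 to part files.

    Returns the list of file names written (hypothetical naming scheme).
    """
    # Same selection as in the attempt above: keep rows whose key is not in df1
    missing = df0[~df0[key].isin(df1[key])]

    paths = []
    for part, start in enumerate(range(0, len(missing), chunk_size), start=1):
        path = f"{out_prefix}_part{part}.txt"
        # to_csv writes every row of the slice; nothing is elided with "..."
        missing.iloc[start:start + chunk_size].to_csv(path, sep="\t", index=False)
        paths.append(path)
    return paths
```

With the files from my attempt, this would be called roughly as `write_missing_rows(pd.read_excel('1.xlsx', sheet_name='Sheet1'), pd.read_excel('v2.xlsx', sheet_name='Sheet1'), 'initial_id', 'out')`. Note that both sheets are still loaded fully into memory here; only the output is split into parts.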
