I am trying to take a list of expired orders from one spreadsheet (df name = data2) and look them up, VLOOKUP-style, in the new orders spreadsheet (df name = data), delete every row that contains an expired order, and then return a new spreadsheet (df name = results).
I am having trouble mimicking in pandas what I do in Excel with VLOOKUP/sort/delete. Here are the steps as pseudocode:
- Import simple.xls as dataframe called 'data'
- Import wo.xlsm, sheet name "T" as dataframe called 'data2'
- Do a VLOOKUP, using the values in column "A" of 'data' to match against the same values in column "A" of 'data2' (they're both just order IDs).
- For every value that exists in column "A" of both 'data2' and 'data', group (if necessary) and delete the entire row (there are 26 columns) from 'data'. To reiterate: the rows are deleted from the 'data' file for each match found. Save the smaller dataset as 'results'.
    import pandas as pd

    data = pd.read_excel("ors_simple.xlsx", encoding = "ISO-8859-1", dtype=object)
    data2 = pd.read_excel("wos.xlsm", sheet_name = "T")

    results = data.merge(data2,on='Work_Order')

    writer = pd.ExcelWriter('vlookuped.xlsx', engine='xlsxwriter')
    results.to_excel(writer, sheet_name='Sheet1')
    writer.save()
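For comparison, the steps described above could be sketched as a filter rather than a merge. This is only a minimal sketch under assumptions: it assumes the order ID column is named `Work_Order` in both frames (as in the attempt above), it uses small in-memory frames with made-up values in place of the two Excel files, and the helper name `drop_expired` is hypothetical. Casting both key columns to strings is also an assumption, made so that IDs read as text in one file and as integers in the other still match.

```python
import pandas as pd


def drop_expired(data, data2, key="Work_Order"):
    """Return the rows of `data` whose `key` value does not appear in `data2`.

    `drop_expired` is a hypothetical helper, not a pandas function.
    """
    # Normalise both key columns to stripped strings (assumption: the IDs
    # may be stored as text in one sheet and as integers in the other).
    expired_ids = data2[key].dropna().astype(str).str.strip()
    is_expired = data[key].astype(str).str.strip().isin(expired_ids)
    # Keep only the rows that are NOT expired; all other columns are untouched.
    return data.loc[~is_expired].reset_index(drop=True)


# Made-up stand-ins for the two spreadsheets. With the real files this
# would be something like:
#   data = pd.read_excel("ors_simple.xlsx", dtype=object)
#   data2 = pd.read_excel("wos.xlsm", sheet_name="T")
data = pd.DataFrame(
    {
        "Work_Order": ["1001", "1002", "1003", "1004"],
        "Status": ["open", "open", "open", "open"],
    }
)
data2 = pd.DataFrame({"Work_Order": [1002, 1004, 9999]})

results = drop_expired(data, data2)
print(results)

# To save the smaller dataset:
#   results.to_excel("vlookuped.xlsx", sheet_name="Sheet1", index=False)
```

Note that `data.merge(data2, on='Work_Order')` performs an inner join by default, so it keeps only the rows whose order ID appears in both frames, which is the opposite of the deletion described in the steps. Filtering with `isin` leaves the columns of `data` as they are and adds nothing from `data2`.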
Which DataFrame contains the values that you want to be dropped: data or data2? And do you need to keep the columns from the lookup table, or do you just want to use it to filter your orders?