I want to join two dataframes based on a certain condition in Spark Scala. The catch is that once a row in df1 matches any row in df2, it should not be matched against any other row in df2. Below are sample data and the outcome I am trying to get.
DF1
--------------------------------
Emp_id | Emp_Name | Address_id
1      | ABC      | 1
2      | DEF      | 2
3      | PQR      | 3
4      | XYZ      | 1

DF2
-----------------------
Address_id | City
1          | City_1
1          | City_2
2          | City_3
REST       | Some_City

Output DF
----------------------------------------
Emp_id | Emp_Name | Address_id | City
1      | ABC      | 1          | City_1
2      | DEF      | 2          | City_3
3      | PQR      | 3          | Some_City
4      | XYZ      | 1          | City_1

Note:- REST is like a wildcard. Any value can be equal to REST.
So in the above sample, emp_name "ABC" could match City_1, City_2, or Some_City. The output DF contains only City_1 because it is found first.
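One possible approach (a sketch, not a definitive answer): join on `Address_id` equality OR the `REST` wildcard, then use a window function to keep only one match per employee, ranking exact matches ahead of the wildcard. Note that "first" is not deterministic in Spark without an explicit ordering, so the sketch below also orders ties by `City` to get `City_1` before `City_2`; all object and app names here are illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.expressions.Window

object WildcardJoinSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("wildcard-join-sketch")   // illustrative name
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df1 = Seq((1, "ABC", "1"), (2, "DEF", "2"), (3, "PQR", "3"), (4, "XYZ", "1"))
      .toDF("Emp_id", "Emp_Name", "Address_id")
    val df2 = Seq(("1", "City_1"), ("1", "City_2"), ("2", "City_3"), ("REST", "Some_City"))
      .toDF("Address_id", "City")

    // Join on an exact Address_id match OR the REST wildcard row.
    val joined = df1.join(
      df2,
      df1("Address_id") === df2("Address_id") || df2("Address_id") === "REST",
      "left")

    // Rank matches per employee: exact matches (0) before wildcard (1),
    // with City as a deterministic tiebreaker among duplicates.
    val w = Window.partitionBy(df1("Emp_id"))
      .orderBy(when(df2("Address_id") === "REST", 1).otherwise(0), df2("City"))

    val result = joined
      .withColumn("rn", row_number().over(w))
      .filter($"rn" === 1)
      .select(df1("Emp_id"), df1("Emp_Name"), df1("Address_id"), df2("City"))

    result.orderBy("Emp_id").show()
    spark.stop()
  }
}
```

On the sample data this should yield the desired output (City_1, City_3, Some_City, City_1); whether this scales depends on your data, since the wildcard condition forces a non-equi join.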