I have a table with the below structure.
ID Entity UserName UserRole UserStatus UpdatedDate Clustered key is defined on columns Entity, UserName, UserRole.
UserStatus for an Entity-UserName-UserRole could be either Active or Inactive.
I could implement the INSERT and UPDATE statements for a dataframe df as follows:
Insert:
values <- paste("('",df$Entity,"','", df$UserName,"','", ,df$UserRole,"','" ,df$UserStatus,"','" , df$UpdatedDate"')", sep="", collapse=",") sqlStr_Insert<- sqlQuery(con, sprintf(paste("Insert into TBL (Entity, UserName,UserRole,UserStatus,UpdatedDate) values ",values))) Update:
sql_string <- "UPDATE TBL SET UserStatus = '%s', UpdatedDate = '%s' WHERE Entity='%s' AND UserName ='%s' AND UserRole = '%s';" sqlStr_update <-sqlQuery(con, paste(sprintf(sql_string, UserStatus, UpdatedDate,df$Entity,df$UserName,df$UserRole), collapse="")) The code works fine but I would like it to check if for every row in the df for a combination of Entity-UserName-UserRole already exists in the TBL. If found, run the UPDATE statement else run the INSERT statement.
For example - SQL Server table data:
ID Entity UserName UserRole UserStatus UpdatedDate --------------------------------------------------- 1 A Ford Analyst Active 1/1/2020 2 B Acura Analyst Active 1/5/2020 3 A Tesla Engineer Inactive 1/6/2020 4 A Kia Analyst Active 1/1/2020 df to be inserted or updated:
Entity UserName UserRole UserStatus UpdatedDate --------------------------------------------------- A Accord Analyst Active 1/10/2020 B Acura Analyst Active 1/10/2020 C BMW Engineer Active 1/10/2020 What should happen:
Only rows 1st and 3rd should get 'INSERTED' to the database TBL. UPDATE should only happen if the 'UserStatus' changes for an existing Entity-UserName-UserRole Combination.
I guess I could run a 'FOR' loop for every row of the dataframe df but I think that would be slow and not an optimum solution.
MERGE. Or two separate statements: anINSERT...WHERE NOT EXISTSand a joinedUPDATE. You may want to look at passing through a Table-Valued Parameter and do it all in one batch. And your current code is wide open to SQL injection and syntax errors, use parameterized queries insteaddbxpackage supports other DBMSes, not sqlserver, perhaps they'll add it at some point). Waldi's two-step approach is what I've been using for a few years.