0

Background

I have two tables - one is updated each day with new data (failed deadline) and one is continually added and deleted from dependent on the other (live table).

The way it works is that if the persons allocated work appears in the 'failed deadline' table, then add to the 'live' table. If the persons allocated work is then not on the 'failed deadline' table the next day, remove from the 'live' table.

The two biggest issues here are that whilst the ID number of the work is an easy check, it also has to check to see if the work is still allocated to the same employee, so that if it changes hands it doesn't count against them.

The problem I'm facing is that the delete portion of the code seems to run over 3,000 rows and return about 300 just fine, but this SQL statement is taking over 4 minutes and 30 seconds to execute. What is the most efficient way of writing this statement? My gut instinct is telling me that the two statements are different and combining them is the issue.

DELETE FROM [LiveTable] WHERE [WorkID] NOT IN (SELECT [WorkID] FROM [FailedDeadline]) OR [LiveTable].[EmployeeID] <> (SELECT [EmployeeID] FROM [FailedDeadline] as fd WHERE [LiveTable].[WorkID] = fd.[WorkID]); 
5
  • Can you Paste the Plan? A NOT EXISTS and a NOT IN aren't going to perform too differently, however, a NOT IN does have problems with NULLs, so you probably want the latter. For the second subquery, perhaps a LEFT JOIN might be more performant, but it's unlikely to make a big dent. Commented Mar 8, 2019 at 10:39
  • You could replace the WHERE NOT IN with WHERE NOT EXISTS, but SQL Server probably would already have optimized it under the hood. Beyond that, have a look at this canonical DBA question which discusses ways to improve the speed of a delete. Commented Mar 8, 2019 at 10:39
  • My gut instinct is telling me that those NOT and this OR combined with that <>, is the issue. Your query falls into a full table scan because those operators combined are most of the time evil resulting in indexes not being used. Show the explain plan Commented Mar 8, 2019 at 10:41
  • It might be a tad difficult @Larnu just because the amount of cleanup required to not post business-specific information and schema info might be a bit too much. Is there any specific point I could share? The graphical plan shows me that Filter and Table Scan take up 43% and 32% respectively. Commented Mar 8, 2019 at 10:47
  • @Larnu - A NOT IN certainly can perform much worse than NOT EXISTS - especially if the columns being compared are NULL-able on both sides. Because SQL Server then has to add a load of extra operators to the plan to deal with this that can make things significantly more expensive. See for example sqlinthewild.co.za/index.php/2010/02/18/not-exists-vs-not-in (logical reads on SmallerTable 500011 and elapsed time = 825 ms vs logical reads 9 and elapsed time = 228 ms) and another look at it here stackoverflow.com/a/11074428/73226 Commented Mar 8, 2019 at 13:05

2 Answers 2

1

Well I am not sure by how much this will lessen the execution time but rewriting the above query as follows might help.

DELETE LT FROM [LiveTable] LT LEFT JOIN [FailedDeadLine] FD1 ON LT.WorkID=FD1.WorkID LEFT JOIN [FailedDeadline] FD2 ON LT.WorkID=FD2.WorkID WHERE (FD1.WorkID IS NULL OR LT.EmployeeID<>FD2.EmployeeID) 
Sign up to request clarification or add additional context in comments.

4 Comments

Thanks for this - was the FDL alias in the WHERE clause supposed to be FD1? This does seem to work for the Employee ID part of the statement (and far quicker!), but it doesn't seem to remove the WorkID's that are just missing from the other table.
Just to add to this - removing the INNER JOIN line and just executing WHERE FD1.WorkID IS NULL does delete the ones not in the table. Changing this to a LEFT JOIN does delete everything in about 3 seconds! Thank you so much @Sanjay! If you can edit the answer I'll accept it!
Is the query posted in Question working properly? If yes then my answer is just refactored version of your query. Could you elaborate more about the removing WorkID's condition?
Thank you - so only the FD1.WorkID IS NULL portion executes when using LEFT JOIN followed by INNER JOIN. I have changed it to LEFT JOIN followed by another LEFT JOIN and it returns both WHERE clauses.
1

I would phrase this as:

DELETE lt FROM [LiveTable] lt WHERE NOT EXISTS (SELECT 1 FROM FailedDeadline fdl WHERE fdl.WorkID = lt.WorkID AND lt.EmployeeID = fdl.EmployeeID ); 

This one condition captures that the record does not exist in the failed table with the same employee id.

Then, you want an index on FailedDeadline(WorkId, EmployeeID).

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.