I've got a SQL table that needs to be updated daily. There may or may not be queries against that table while the update is happening. It's around 500,000 rows.
We have an issue with locking conflicts when the job to update the table runs at the same time as a query against it.
So I have rewritten the process to update the table as follows:
    ALTER procedure [dbo].[Table_Generate] as
    declare @d datetime = getdate(), @c as int

    --Check temp tables
    IF OBJECT_ID('tempdb..#final') IS NOT NULL DROP TABLE #final
    IF OBJECT_ID('tempdb..#base') IS NOT NULL DROP TABLE #base

    --Get source data from linked server
    select ID, Reference, StartDate, EndDate, Description, SomeCode
    into #base
    from [LinkedServer].[Database].dbo.[A_View]

    --Generate row_hash
    select ID, Reference, StartDate, EndDate, Description, SomeCode,
        hashbytes('SHA2_256', (
            select ID, Reference, StartDate, EndDate, Description, SomeCode
            from #base sub
            where sub.ID = main.ID
            for xml raw)) as row_hash
    into #final
    from #base main

    select @c = count(*) from #final

    if @c > 0
    begin
        merge [The_Table_Staging] as target
        using #final as source
        on source.ID = target.ID

        --New rows
        when not matched by target then
            insert (RunDate, ID, Reference, StartDate, EndDate, Description, SomeCode, Row_Hash)
            values (@d, source.ID, source.Reference, source.StartDate, source.EndDate,
                    source.Description, source.SomeCode, source.row_hash)

        --Existing changed rows
        when matched and source.row_hash != target.row_hash then
            update set target.RunDate = @d
                ,target.Reference = source.Reference
                ,target.StartDate = source.StartDate
                ,target.EndDate = source.EndDate
                ,target.Description = source.Description
                ,target.SomeCode = source.SomeCode
                ,target.row_hash = source.row_hash

        --Deleted rows
        when not matched by source then
            delete;

        --Existing unchanged rows
        update [The_Table_Staging] set RunDate = @d where RunDate != @d

        --Commit changes
        begin transaction
            exec sp_rename 'The_Table_Live', 'The_Table_Staging_T'
            exec sp_rename 'The_Table_Staging', 'The_Table_Live'
            exec sp_rename 'The_Table_Staging_T', 'The_Table_Staging'
        commit transaction
    end

The idea is to reduce unnecessary row updates, and also to minimise the locking of the live table. I don't really like doing a table rename, but doing an update/insert takes 5-10 seconds, whereas the table rename is virtually instantaneous.
So my question is: is this approach OK, and/or could I improve it?
Thanks!
Edit to respond to JD:
Hi JD. Please, no need to apologise - I'm here for constructive criticism.
- I didn't know `MERGE` was problematic. I've never had issues with it myself, but thanks - I may rewrite this part into separate `INSERT`/`UPDATE`/`DELETE` statements
- I normally agree. The reason I did this is that if I did a `TRUNCATE`/`INSERT` from staging at that point, it takes 6-10 seconds, whereas the `sp_rename` takes under a second. So less time locking the table
- This doesn't affect the table locking, as the data goes into the staging table first. I have no choice but to use a linked server or SSIS, and in this case I prefer the linked server to keep all the SQL in one place
- I always use `XML` instead of `CONCAT` because otherwise 'a', 'bc' would hash the same as 'ab', 'c', which is not correct
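If I do split the `MERGE` up, a rough sketch of what the separate statements might look like (assuming the same `#final` temp table, `@d` run date, and `The_Table_Staging` target as in the procedure above - not tested against the real schema):

    --Deleted rows
    delete t
    from [The_Table_Staging] t
    where not exists (select 1 from #final s where s.ID = t.ID)

    --Existing changed rows
    update t
    set t.RunDate = @d
        ,t.Reference = s.Reference
        ,t.StartDate = s.StartDate
        ,t.EndDate = s.EndDate
        ,t.Description = s.Description
        ,t.SomeCode = s.SomeCode
        ,t.row_hash = s.row_hash
    from [The_Table_Staging] t
    join #final s on s.ID = t.ID
    where s.row_hash != t.row_hash

    --New rows
    insert [The_Table_Staging] (RunDate, ID, Reference, StartDate, EndDate, Description, SomeCode, Row_Hash)
    select @d, s.ID, s.Reference, s.StartDate, s.EndDate, s.Description, s.SomeCode, s.row_hash
    from #final s
    where not exists (select 1 from [The_Table_Staging] t where t.ID = s.ID)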
All of the processing up to populating the live table from staging is fine - I just want to minimise the amount of time the final live table is locked for, hence the `sp_rename`.

Reply from JD: #4 wasn't as much of a concern for me anyway, which is why it's near the bottom of my list, but I just wanted to make sure you were aware. There are plenty of alternatives to linked servers that aren't SSIS though, btw, e.g. Replication (multiple options) - this would be my pick for a small number of tables - AlwaysOn Availability Groups, and Log Shipping. #5 I just clarified in my answer that you do need to use a safe column separator to not run the risk of the issue you mentioned, e.g. `||`.
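The collision, and the separator fix, can be seen with literal values (sketch only - real code would use the actual columns, and `||` is only safe if it cannot appear in the data):

    --Without a separator, two different rows produce identical input, so identical hashes:
    select hashbytes('SHA2_256', concat('a', 'bc'))        --hashes 'abc'
    select hashbytes('SHA2_256', concat('ab', 'c'))        --also hashes 'abc'

    --With a separator, the inputs differ, so the hashes differ:
    select hashbytes('SHA2_256', concat('a', '||', 'bc'))  --hashes 'a||bc'
    select hashbytes('SHA2_256', concat('ab', '||', 'c'))  --hashes 'ab||c'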