
I'm using SQL Server and I have a table with an N:1 relationship to another table. I have a process that updates the information in the N-side table by removing all the related rows and inserting them again.

I could search the table and check whether the row I'm going to insert already exists, but that would require comparing every column, and it's a lot easier for me to just delete them all and insert them back.

This causes the primary key to grow and grow, and I'm concerned about performance in the future.

So, the question is: will this slow down queries in the future, making it much better to update the rows, or is this not a problem at all?

Thanks

EDIT: I'm not deleting all the rows every time, only the related rows. So if I have 3 new rows to insert, I look up the current ones, delete them, and insert the new ones.
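To make the pattern concrete, the process boils down to something like this sketch (the `AdPrice`/`AdId` names and the IDENTITY primary key are illustrative assumptions, not the actual schema from the question):

```sql
-- Replace all prices for one ad: delete the related rows, then re-insert.
-- Every run consumes new IDENTITY values for AdPrice's surrogate key,
-- which is why the key value "grows and grows" even though the row count doesn't.
BEGIN TRANSACTION;

DELETE FROM AdPrice
WHERE AdId = @AdId;

INSERT INTO AdPrice (AdId, Price, Currency)
VALUES (@AdId, 10.00, 'EUR'),
       (@AdId, 12.50, 'EUR'),
       (@AdId, 15.00, 'EUR');

COMMIT;
```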

  • Which DBMS are you using? Commented Aug 29, 2013 at 10:34
  • SQL Server, just edited the question Commented Aug 29, 2013 at 10:36
  • What is the table structure and queries? Sometimes an UPDATE gets carried out as a DELETE ... INSERT rather than in place anyway. Commented Aug 29, 2013 at 10:41
  • There is an ads table and a prices table. An ad contains several prices, so the prices table has a foreign key to the ads table. When I update an ad, I delete all the current prices and insert the new ones. This may make the primary key of the prices table grow a lot, and that is my biggest concern. Commented Aug 29, 2013 at 10:50
  • do you have maintenance plans on the database? do you regenerate indexes? If so, the PK growth will not affect performance, since the number of rows will be the same. Commented Aug 29, 2013 at 10:50

3 Answers


When it comes to SQL Server's internal DML primitives, UPDATE is the cheapest DML to perform, DELETE is second, and INSERT is the most expensive.

A combination of INSERT and DELETE is never going to be faster. At best, SQL Server executes the UPDATE as an INSERT+DELETE internally; in that case you might come out equal. If the update is processed natively, though, it will be faster. How much faster or slower depends on the table structure and data, as well as on the query plan.

That performance loss might be OK, though. You can trade performance for productivity.


6 Comments

So, the primary key growing instead of staying "low" won't cause a drop in performance in the future? I'm not concerned about the speed of the process itself, so how it's done is not a problem; how the database holds up in the future is what matters.
What do you mean by the primary key growing? You mean that the number increases? That does not impact performance at all. Row count, row size, page fullness and fragmentation do.
@usr . . . Can you document your first statement? It is not consistent with my experience or knowledge of how databases work. I could never make such a general statement.
@GordonLinoff I understand your suspicion. My statement is targeted at single-B-tree DML, because multi-index DML plans get too complex to make such a general statement. Here is the script I played with to derive this information: pastebin.com/vSCHm5Qz All DML there targets a single CI with 1M rows that fit easily into memory. All computations are simple integer computations (I want to test the infrastructure, not the expression evaluator).
@usr . . . So you understand, an update can require both an insert and a delete, both on the data pages and in the indexes, and it requires logging both the old data and the new data. It can be the least-performant alternative.

As of now you are deleting all the related rows, which means you have a bunch of records related to a record in another table.

Instead of deleting, why don't you update them? You have the primary keys of all those records. It might be more time-consuming while developing, but it will reduce your processing time.

Let me give you an example:

Table1

    id  name
    1   XXXX
    2   YYYY
    3   ZZZZ

Table2

    pm_id  id  address
    1      1   Address1
    2      1   Address2
    3      1   Address3
    4      2   Address4
    5      2   Address5
    6      3   Address6

Now if you want to change the records related to id=2 (Table1), you have 2 records from Table2, i.e. pm_id=4 and pm_id=5.

So you can update those records in place. Let me know if I misunderstood something.
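A minimal sketch of that update-in-place idea, using the example tables above and assuming the incoming data arrives keyed by `pm_id` (the new address values are made up):

```sql
-- Update the existing child rows by their primary key
-- instead of deleting and re-inserting them.
UPDATE Table2
SET address = 'NewAddress4'
WHERE pm_id = 4;

UPDATE Table2
SET address = 'NewAddress5'
WHERE pm_id = 5;
```

Rows whose values didn't actually change are simply overwritten with the same data, so no column-by-column comparison is needed.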

3 Comments

Yes, so the idea is that I will get id=2 with a set of addresses, and I will delete pm_id=4 and pm_id=5 and insert all the new ones. The problem is identifying whether a row was previously inserted or not, because I would have to compare every column, and anything slightly different would tell me it's new. So, in my case, it's probably safer to do what I'm doing to ensure the consistency of the data
but you could update all the rows without checking whether they changed or not. Records that changed get replaced, and the other records get replaced too, but with the same data as before. Did you get my point?
mmm... That's not a bad idea, but imagine that I receive 2 rows and I have 3 in the table; then I would update 2 and delete the other one, no? It makes sense

I might have overlooked your environment, but if you are using SQL Server 2008 you should definitely look at the MERGE command. See SQL MERGE statement to update data for a simple example. MERGE is easier to use than the alternatives, too.
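A hedged sketch of what MERGE could look like for the ad/prices scenario from the question (the `AdPrice` table, the `@NewPrices` table-valued source, and the matching columns are all assumptions about the schema): it updates matching rows, inserts new ones, and deletes leftovers in a single statement, so new surrogate-key values are only consumed for genuinely new rows.

```sql
MERGE INTO AdPrice AS target
USING @NewPrices AS source            -- e.g. a table variable or TVP with the incoming prices
    ON target.AdId = source.AdId
   AND target.Currency = source.Currency
WHEN MATCHED AND target.Price <> source.Price THEN
    UPDATE SET Price = source.Price   -- changed rows are updated in place
WHEN NOT MATCHED BY TARGET THEN
    INSERT (AdId, Price, Currency)    -- only truly new rows get inserted
    VALUES (source.AdId, source.Price, source.Currency)
WHEN NOT MATCHED BY SOURCE AND target.AdId = @AdId THEN
    DELETE;                           -- rows no longer present for this ad are removed
```

Note the extra `target.AdId = @AdId` filter on the `NOT MATCHED BY SOURCE` branch: without it, MERGE would delete prices belonging to every other ad as well.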

