
I'm using SQL Server and I have a table with an N:1 relationship to another table. I have a process that updates the information in the N-side table by removing all the related rows and inserting them again.

I could search the table and check whether the row I'm going to insert already exists, but that would require comparing every column, and it's a lot easier for me to just delete them all and insert them back.

This causes the primary key to grow and grow, and I'm concerned about performance in the future.

So, the question is: will this slow down queries in the future, making it much better to update the rows, or is this not a problem at all?

Thanks

EDIT: I'm not deleting all the rows every time, only the related rows. So if I have 3 new rows to insert, I look up the current ones, delete them, and insert the new ones.
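To make the pattern concrete, the process boils down to something like this sketch (the `AdPrice`/`AdId` names and the IDENTITY primary key are illustrative assumptions, not the actual schema from the question):

```sql
-- Replace all prices for one ad: delete the related rows, then re-insert.
-- Every run consumes new IDENTITY values for AdPrice's surrogate key,
-- which is why the key value "grows and grows" even though the row count doesn't.
BEGIN TRANSACTION;

DELETE FROM AdPrice
WHERE AdId = @AdId;

INSERT INTO AdPrice (AdId, Price, Currency)
VALUES (@AdId, 10.00, 'EUR'),
       (@AdId, 12.50, 'EUR'),
       (@AdId, 15.00, 'EUR');

COMMIT;
```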

  • Which DBMS are you using? Commented Aug 29, 2013 at 10:34
  • SQL Server, just edited the question Commented Aug 29, 2013 at 10:36
  • What is the table structure and queries? Sometimes an UPDATE gets carried out as a DELETE ... INSERT rather than in place anyway. Commented Aug 29, 2013 at 10:41
  • There is an ads table and a prices table. An ad contains several prices, so the prices table has a foreign key to the ads table. When I update an ad, I delete all the current prices and insert the new ones. This may make the primary key of the prices table grow a lot, and that is my biggest concern. Commented Aug 29, 2013 at 10:50
  • do you have maintenance plans on the database? do you regenerate indexes? If so, the PK growth will not affect performance, since the number of rows will be the same. Commented Aug 29, 2013 at 10:50

3 Answers


When it comes to SQL Server's internal DML primitives, UPDATE is the cheapest DML to perform, DELETE is second, and INSERT is the most expensive.

A combination of INSERT and DELETE is never going to be faster. At best, SQL Server executes the UPDATE as an INSERT+DELETE internally; in that case you might come out equal. If the update is processed natively, though, it will be faster. How much faster or slower depends on the table structure and data, as well as on the query plan.

That performance loss might be OK, though. You can trade performance for productivity.


6 Comments

So, the primary key growing instead of staying "low" won't cause a drop in performance in the future? I'm not concerned about the speed of the process itself, so how it's done is not a problem; how the database holds up in the future is what matters.
What do you mean by the primary key growing? You mean that the number increases? That does not impact performance at all. Row count, row size, page fullness and fragmentation do.
@usr . . . Can you document your first statement? It is not consistent with my experience or knowledge of how databases work. I could never make such a general statement.
@GordonLinoff I understand your suspicion. My statement is targeted at single-B-tree DML, because multi-index DML plans get too complex to make such a general statement. Here is the script I played with to derive this information: pastebin.com/vSCHm5Qz All DML there targets a single CI with 1M rows that fit easily into memory. All computations are simple integer computations (I want to test the infrastructure, not the expression evaluator).
@usr . . . So you understand, an update can require both an insert and a delete, both on the data pages and in the indexes, and it requires logging both the old data and the new data. It can be the least-performant alternative.

As of now you are deleting all the related rows, which means you have a bunch of records related to a record in another table.

Instead of deleting, why don't you update them? You have the primary keys of all those records. It might be more time-consuming while developing, but it will reduce your processing time.

Let me give you an example:

Table1

    id  name
    1   XXXX
    2   YYYY
    3   ZZZZ

Table2

    pm_id  id  address
    1      1   Address1
    2      1   Address2
    3      1   Address3
    4      2   Address4
    5      2   Address5
    6      3   Address6

Now if you want to change the records related to id=2 (Table1), you have 2 records from Table2, i.e. pm_id=4 and pm_id=5.

So you can update those records in place. Let me know if I misunderstood something.
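A minimal sketch of that update-in-place idea, using the example tables above and assuming the incoming data arrives keyed by `pm_id` (the new address values are made up):

```sql
-- Update the existing child rows by their primary key
-- instead of deleting and re-inserting them.
UPDATE Table2
SET address = 'NewAddress4'
WHERE pm_id = 4;

UPDATE Table2
SET address = 'NewAddress5'
WHERE pm_id = 5;
```

Rows whose values didn't actually change are simply overwritten with the same data, so no column-by-column comparison is needed.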

3 Comments

Yes, so the idea is that I will get id=2 with a set of addresses, and I will delete pm_id=4 and pm_id=5 and insert all the new ones. The problem is identifying whether a row was previously inserted or not, because I would have to compare every column, and anything slightly different would tell me it's new. So, in my case, it's probably safer to do what I'm doing to ensure the consistency of the data
but you could update all the rows without checking whether they changed or not. Records that changed get replaced, and the other records get replaced too, but with the same data as before. Did you get my point?
mmm... That's not a bad idea, but imagine that I receive 2 rows and I have 3 in the table; then I would update 2 and delete the other one, no? It makes sense

I might have overlooked your environment, but if you are using SQL Server 2008 you should definitely look at the MERGE command. See SQL MERGE statement to update data for a simple example. MERGE is easier to use than the alternatives, too.
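A hedged sketch of what MERGE could look like for the ad/prices scenario from the question (the `AdPrice` table, the `@NewPrices` table-valued source, and the matching columns are all assumptions about the schema): it updates matching rows, inserts new ones, and deletes leftovers in a single statement, so new surrogate-key values are only consumed for genuinely new rows.

```sql
MERGE INTO AdPrice AS target
USING @NewPrices AS source            -- e.g. a table variable or TVP with the incoming prices
    ON target.AdId = source.AdId
   AND target.Currency = source.Currency
WHEN MATCHED AND target.Price <> source.Price THEN
    UPDATE SET Price = source.Price   -- changed rows are updated in place
WHEN NOT MATCHED BY TARGET THEN
    INSERT (AdId, Price, Currency)    -- only truly new rows get inserted
    VALUES (source.AdId, source.Price, source.Currency)
WHEN NOT MATCHED BY SOURCE AND target.AdId = @AdId THEN
    DELETE;                           -- rows no longer present for this ad are removed
```

Note the extra `target.AdId = @AdId` filter on the `NOT MATCHED BY SOURCE` branch: without it, MERGE would delete prices belonging to every other ad as well.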

