
This question is about legacy SQL code for a MySQL database.

It is known that in an INSERT ... ON DUPLICATE KEY UPDATE statement, the VALUES(col_name) function can be used in the UPDATE clause to refer to the column values from the INSERT portion instead of passing the same values again:

INSERT INTO table (a,b,c) VALUES (1,2,3) ON DUPLICATE KEY UPDATE b=VALUES(b), c=VALUES(c) 

My legacy code contains a lot of huge inserts in parametrized style (they are used in batch inserts):

INSERT INTO table (a,b,c, <...dozens of params...>) VALUES (?,?,?,<...dozens of values...>) ON DUPLICATE KEY UPDATE b=?, c=?, <...dozens of params...> 

The question is: would it improve the performance of the batch inserts if I changed all these queries to use the VALUES(col_name) function in the UPDATE portion?

My queries are executed from Java code using the JDBC driver. My guess is that for long text values this should significantly reduce the size of the queries. But what about MySQL itself? Would it really, in general, give me a speed increase?
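For illustration, here is a sketch of the proposed change, using the question's a, b, c columns and a placeholder table name t. Rewriting the UPDATE portion with VALUES(col) removes the duplicated update parameters entirely, so each batched row binds only the insert values:

```java
public class IodkuRewrite {
    // Legacy form: every updated column is bound twice per row
    // (once in the VALUES list, once again in the UPDATE clause).
    static final String LEGACY =
        "INSERT INTO t (a,b,c) VALUES (?,?,?) "
      + "ON DUPLICATE KEY UPDATE b=?, c=?";

    // Rewritten form: VALUES(col) reuses the already-bound insert
    // values, so a long text value is bound and sent only once.
    static final String REWRITTEN =
        "INSERT INTO t (a,b,c) VALUES (?,?,?) "
      + "ON DUPLICATE KEY UPDATE b=VALUES(b), c=VALUES(c)";

    // Count bind parameters in a statement.
    static long placeholders(String sql) {
        return sql.chars().filter(ch -> ch == '?').count();
    }

    public static void main(String[] args) {
        System.out.println(placeholders(LEGACY));    // 5 parameters per row
        System.out.println(placeholders(REWRITTEN)); // 3 parameters per row
    }
}
```

A possible side benefit worth verifying against your Connector/J version: once the UPDATE clause contains no placeholders, the driver's multi-row statement rewriting may be applicable where it otherwise would not be.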

  • Yes, for long values you might get a marginal benefit, but this sort of effort is hardly worth it unless you are inserting a lot of records per minute. Commented Jun 3, 2016 at 10:04
  • I'm inserting a lot of records per minute, indeed. Commented Jun 3, 2016 at 10:33
  • Then I guess it would be worth it, since the change won't take up too much of your time. Commented Jun 3, 2016 at 10:35

1 Answer


Batched inserts may run 10 times as fast as inserting one row at a time. The reason for this is all the per-statement network and parsing overhead.

Another technique is to change a single batched IODKU into two statements: one to insert the new rows, one to do the updates. (I don't know whether that will run any faster.) Here is a discussion of the two steps, in the context of "normalization".
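One possible shape of that split, sketched under the assumption of a staging table holding the incoming rows (the table and column names here are hypothetical, not from the linked discussion):

```java
public class TwoStepUpsert {
    // Step 1: insert only the rows whose unique key does not exist yet.
    // INSERT IGNORE silently skips rows that would violate the key.
    static final String INSERT_NEW =
        "INSERT IGNORE INTO t (a,b,c) "
      + "SELECT a,b,c FROM staging";

    // Step 2: update the rows that already existed, joining the
    // staged data to the target table on the unique key column a.
    static final String UPDATE_EXISTING =
        "UPDATE t JOIN staging USING (a) "
      + "SET t.b = staging.b, t.c = staging.c";

    public static void main(String[] args) {
        System.out.println(INSERT_NEW);
        System.out.println(UPDATE_EXISTING);
    }
}
```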

Another thing to note: if an AUTO_INCREMENT column is involved (even if it is not one of the columns mentioned), then IODKU may "burn" ids in the cases where it does an update. That is, IODKU (and INSERT IGNORE and a few others) grabs all the auto_inc values it might need, then uses the ones it does need and wastes the others.
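A sketch of that burning, assuming a hypothetical table t with an AUTO_INCREMENT id and a unique key on a (the exact behavior depends on the InnoDB innodb_autoinc_lock_mode setting):

```sql
-- t(id INT AUTO_INCREMENT PRIMARY KEY, a INT UNIQUE, b INT)
INSERT INTO t (a,b) VALUES (1,10)
  ON DUPLICATE KEY UPDATE b=VALUES(b);  -- new row: id 1
INSERT INTO t (a,b) VALUES (1,20)
  ON DUPLICATE KEY UPDATE b=VALUES(b);  -- duplicate: updates the row, but id 2 may be reserved and lost
INSERT INTO t (a,b) VALUES (2,30)
  ON DUPLICATE KEY UPDATE b=VALUES(b);  -- new row: id 3, not 2
```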

You get into "diminishing returns" if you try to insert more than a few hundred rows in one batch, and you stress the rollback log.


Comments

You didn't read the question: I'm already using batch insertion. The question was about using the VALUES function in the update part.
Really? Why is it wrong? I'm using Spring JDBC batch inserts, so the syntax is correct in that case.
In MySQL, a batch insert looks like: INSERT INTO table (a,b,c) VALUES (1,2,3), (4,5,6), (11,22,33), .... That is, N column names, plus M sets of N values, to insert M rows into N columns. Are you saying that Spring has a different syntax?
That's a really good question. It doesn't directly relate to my question and doesn't answer it, but it should be elaborated separately.
Well, I've looked into the MySQL PreparedStatement sources, and it has a special method for converting the source SQL into a batched one. So yes, Spring internally calls the MySQL JDBC driver's PreparedStatement implementation, and it uses the correct batch syntax.
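The rewriting the comments describe can be sketched as follows. The helper method here is hypothetical, only mimicking the transformation; Connector/J performs an equivalent expansion internally when rewriteBatchedStatements=true is set on the connection URL:

```java
public class BatchRewrite {
    // Hypothetical helper: expand a one-row parametrized insert on a
    // placeholder table t into one multi-row insert covering `rows`
    // batched parameter sets, as the driver does for batched inserts.
    static String rewrite(String columns, int paramsPerRow, int rows) {
        String tuple = "(" + "?,".repeat(paramsPerRow - 1) + "?)";
        StringBuilder sql = new StringBuilder(
            "INSERT INTO t (" + columns + ") VALUES ");
        for (int i = 0; i < rows; i++) {
            if (i > 0) sql.append(", ");
            sql.append(tuple);
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        // Three batched rows of three columns become one statement:
        System.out.println(rewrite("a,b,c", 3, 3));
        // INSERT INTO t (a,b,c) VALUES (?,?,?), (?,?,?), (?,?,?)
    }
}
```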
