I'm batching updates with JDBC:

    ps = con.prepareStatement("");
    ps.addBatch();
    ps.executeBatch();

But in the background, it seems that the Postgres driver sends the queries to the database one by one.

org.postgresql.core.v3.QueryExecutorImpl:398

    for (int i = 0; i < queries.length; ++i) {
        V3Query query = (V3Query)queries[i];
        V3ParameterList parameters = (V3ParameterList)parameterLists[i];
        if (parameters == null)
            parameters = SimpleQuery.NO_PARAMETERS;
        sendQuery(query, parameters, maxRows, fetchSize, flags, trackingHandler);
        if (trackingHandler.hasErrors())
            break;
    }

Is there a way to make it send 1000 at a time to speed things up?

  • PostgreSQL version? JDBC driver version? What are the queries you're batching? As for the code, it looks like you're talking about execute(Query[] queries, ParameterList[] parameterLists, ...) Commented Sep 28, 2012 at 11:39
  • I'm batching updates; the driver is "9.1-901-1.jdbc4". I found the copy API, but I couldn't find any docs about the format it reads, just "from - a CSV file or such" jdbc.postgresql.org/documentation/publicapi/org/postgresql/copy/… Commented Sep 28, 2012 at 11:51
  • The copy command should be a wrapper around COPY used by Postgres in general. More information here on the format and command: postgresql.org/docs/9.1/static/sql-copy.html Commented Sep 28, 2012 at 16:51
  • Filed: github.com/pgjdbc/pgjdbc/issues/15 Commented Sep 28, 2012 at 23:21

1 Answer

AFAIK there is no server-side batching in the fe/be protocol, so PgJDBC can't use it.

Update: Well, I was wrong. PgJDBC (accurate as of 9.3) does send batches of queries to the server if it doesn't need to fetch generated keys. It queues a bunch of queries up in the send buffer without syncing with the server after each individual query.

See:

Even when generated keys are requested, the extended query protocol is used so that the query text doesn't need to be sent every time, just the parameters.

Frankly, JDBC batching isn't a great solution in any case. It's easy for the app developer to use, but pretty sub-optimal for performance, because the server still has to execute every statement individually. It doesn't have to parse and plan them individually, though, so long as you use prepared statements.

If autocommit is on, performance will be absolutely pathetic because each statement triggers a commit. Even with autocommit off, running lots of little statements won't be particularly fast, even if you could eliminate the round-trip delays.
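
For reference, here is a minimal sketch of what batching with autocommit off and a prepared statement might look like. The table `item`, its columns `id` and `price`, the class name and the `batchSize` parameter are all made up for illustration; adapt them to your schema.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class BatchUpdateSketch {
    // Hypothetical table and columns, used only for illustration.
    static final String SQL = "UPDATE item SET price = ? WHERE id = ?";

    /**
     * Queues the updates on one prepared statement, flushes them in batches of
     * batchSize, and commits once at the end. Returns the number of rows queued.
     */
    static int updatePrices(Connection con, int[] ids, long[] prices, int batchSize)
            throws SQLException {
        if (batchSize <= 0 || ids.length != prices.length) {
            throw new IllegalArgumentException("bad batch size or mismatched arrays");
        }
        boolean oldAutoCommit = con.getAutoCommit();
        con.setAutoCommit(false); // one commit for the whole run, not one per statement
        try (PreparedStatement ps = con.prepareStatement(SQL)) {
            int queued = 0;
            for (int i = 0; i < ids.length; i++) {
                ps.setLong(1, prices[i]);
                ps.setInt(2, ids[i]);
                ps.addBatch();
                if (++queued % batchSize == 0) {
                    ps.executeBatch(); // flush a full batch
                }
            }
            if (queued % batchSize != 0) {
                ps.executeBatch(); // flush the remainder
            }
            con.commit();
            return queued;
        } catch (SQLException e) {
            con.rollback(); // undo the partial run before rethrowing
            throw e;
        } finally {
            con.setAutoCommit(oldAutoCommit);
        }
    }
}
```

Something like `BatchUpdateSketch.updatePrices(con, ids, prices, 1000)` would then flush every 1000 rows. The server still executes each UPDATE individually, so this only removes the per-statement commit and the repeated parse/plan work.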

A better solution for lots of simple UPDATEs can be to:

  • COPY new data into a TEMPORARY or UNLOGGED table; and
  • Use UPDATE ... FROM to UPDATE with a JOIN against the copied table

For COPY, see the PgJDBC docs and the COPY documentation in the server docs.
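
As a rough sketch of the two steps above: the code below builds a payload in COPY's default text format (tab-separated columns, one line per row, `\N` for NULL, backslash escapes), loads it into a temporary table, and then runs one `UPDATE ... FROM`. The table names `item` and `item_update`, the columns, and the `CopyIn` hook are all hypothetical. The hook only exists so the sketch compiles without the PgJDBC jar; it stands in for the driver's copy API.

```java
import java.io.Reader;
import java.io.StringReader;
import java.sql.Connection;
import java.sql.Statement;
import java.util.List;

final class CopyUpdateSketch {
    /** Hypothetical hook standing in for the driver's COPY support. */
    interface CopyIn {
        long copyIn(String sql, Reader data) throws Exception;
    }

    /** Escapes one value for COPY's default text format; null becomes \N. */
    static String escape(String value) {
        if (value == null) {
            return "\\N";
        }
        return value.replace("\\", "\\\\") // backslash first, so later escapes survive
                    .replace("\t", "\\t")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r");
    }

    /** Builds the payload: tab-separated columns, one newline-terminated line per row. */
    static String toCopyText(List<String[]> rows) {
        StringBuilder sb = new StringBuilder();
        for (String[] row : rows) {
            for (int i = 0; i < row.length; i++) {
                if (i > 0) {
                    sb.append('\t');
                }
                sb.append(escape(row[i]));
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    /**
     * Assumes autocommit is off and the caller commits afterwards; with
     * autocommit on, ON COMMIT DROP would remove the table right after CREATE.
     * Returns the number of rows the UPDATE touched.
     */
    static int applyUpdates(Connection con, CopyIn copy, List<String[]> rows) throws Exception {
        try (Statement st = con.createStatement()) {
            st.execute("CREATE TEMPORARY TABLE item_update (id integer, price numeric) ON COMMIT DROP");
            copy.copyIn("COPY item_update (id, price) FROM STDIN", new StringReader(toCopyText(rows)));
            return st.executeUpdate(
                "UPDATE item SET price = u.price FROM item_update u WHERE item.id = u.id");
        }
    }
}
```

With PgJDBC on the classpath, the hook could plausibly be written as `(sql, data) -> con.unwrap(org.postgresql.PGConnection.class).getCopyAPI().copyIn(sql, data)`; check that against the driver version you actually use.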

You'll often find it's possible to tweak things so your app doesn't have to send all those individual UPDATEs at all.
