I have a requirement to perform a clean insert (delete + insert) of a huge number of records (close to 100K) per request. For testing purposes, I'm running my code with 10K records. Even with 10K, the operation runs for 30 seconds, which is not acceptable. I'm already using the batch inserts provided by Spring Data JPA, but the results are not satisfactory.
My code looks like this:
    @Transactional
    public void saveAll(HttpServletRequest httpRequest) throws IOException {
        List<Person> persons = new ArrayList<>();
        try (ServletInputStream sis = httpRequest.getInputStream()) {
            deletePersons(); // deletes all persons based on some criteria
            Person p;
            while ((p = nextPerson(sis)) != null) {
                persons.add(p);
                if (persons.size() % 2000 == 0) {
                    savePersons(persons); // uses Spring repository to perform saveAll() and flush()
                    persons.clear();
                }
            }
            savePersons(persons); // uses Spring repository to perform saveAll() and flush()
            persons.clear();
        }
    }

    @Transactional
    public void savePersons(List<Person> persons) {
        System.out.println(new Date() + " Before save");
        repository.saveAll(persons);
        repository.flush();
        System.out.println(new Date() + " After save");
    }

I have also set the following properties:
    spring.jpa.properties.hibernate.jdbc.batch_size=40
    spring.jpa.properties.hibernate.order_inserts=true
    spring.jpa.properties.hibernate.order_updates=true
    spring.jpa.properties.hibernate.jdbc.batch_versioned_data=true
    spring.jpa.properties.hibernate.id.new_generator_mappings=false

Looking at the logs, I noticed that the insert operation takes around 3 to 4 seconds to save 2000 records, while the iteration itself takes very little time. So I believe that reading through the stream is not the bottleneck; the inserts are. I also checked the logs and confirmed that Spring is sending the inserts in batches of 40, as per the property set.
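For reference, the numbers implied by these settings can be worked out with a quick sketch. It assumes every flush sends full batches of 40 and that all records belong to a single entity type; the class and method names are only for illustration and are not part of my application.

```java
// Illustrative arithmetic only: how the 10K test run splits into chunks and JDBC batches.
class BatchMath {

    /** Number of groups needed to cover `total` items in groups of at most `size` (ceiling division). */
    static int groups(int total, int size) {
        return (total + size - 1) / size;
    }

    public static void main(String[] args) {
        int records = 10_000;   // size of the test run
        int chunk = 2_000;      // records per saveAll() + flush() call
        int jdbcBatch = 40;     // hibernate.jdbc.batch_size

        int chunks = groups(records, chunk);             // saveAll() + flush() calls
        int batchesPerChunk = groups(chunk, jdbcBatch);  // JDBC batches sent per flush

        System.out.println(chunks + " chunks, " + batchesPerChunk + " JDBC batches each, "
                + (chunks * batchesPerChunk) + " batches in total");
    }
}
```

So the 10K test run makes 5 saveAll()/flush() calls of 50 JDBC batches each, and at 3 to 4 seconds per call the inserts account for roughly 15 to 20 seconds of the total.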
I'm trying to see whether I can improve the performance by using multiple threads (say 2) that would read from a blocking queue and call save once they have accumulated, say, 2000 records. In theory, I hope this would give better results. The problem is that, from what I have read, Spring manages transactions at the thread level, and a transaction cannot propagate across threads. But I need the whole operation (delete + insert) to be atomic. I looked at a few posts about Spring transaction management but could not find the right direction.
Is there a way I can achieve this kind of parallelism using Spring transactions? If Spring transactions are not the answer, are there any other techniques that can be used?
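To make the idea concrete, here is a minimal sketch of the queue-based design I have in mind, using only java.util.concurrent. The class name, the POISON_PILL marker and the save() stand-in are hypothetical placeholders, not my real code. In the real version save() would call repository.saveAll() and flush(), and as far as I understand that is exactly where each worker thread would start its own transaction instead of joining the one that did the delete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the producer/consumer idea; records are plain strings to keep it self-contained.
class QueueBatchSketch {

    // Hypothetical marker telling a worker that no more records will arrive.
    private static final String POISON_PILL = "__END__";

    // Stand-in for savePersons(): it only counts the records it was handed.
    static void save(List<String> batch, AtomicInteger saved) {
        saved.addAndGet(batch.size());
    }

    /** Pushes totalRecords fake records through the queue and returns how many the workers "saved". */
    static int run(int totalRecords, int workerCount, int batchSize) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(batchSize * 2);
        AtomicInteger saved = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workerCount);
        List<Future<?>> workers = new ArrayList<>();
        try {
            for (int w = 0; w < workerCount; w++) {
                workers.add(pool.submit(() -> {
                    List<String> batch = new ArrayList<>(batchSize);
                    while (true) {
                        String record = queue.take();       // blocks until a record is available
                        if (record.equals(POISON_PILL)) {
                            break;
                        }
                        batch.add(record);
                        if (batch.size() == batchSize) {
                            save(batch, saved);
                            batch.clear();
                        }
                    }
                    save(batch, saved);                     // whatever is left over
                    return null;
                }));
            }
            // Producer side: this is where the request stream would be read.
            for (int i = 0; i < totalRecords; i++) {
                queue.put("person-" + i);
            }
            for (int w = 0; w < workerCount; w++) {
                queue.put(POISON_PILL);                     // one pill per worker
            }
            for (Future<?> worker : workers) {
                worker.get();                               // wait for workers, surface failures
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
        return saved.get();
    }

    public static void main(String[] args) {
        System.out.println("saved " + run(10_000, 2, 2_000) + " records");
    }
}
```

The sketch does no error handling beyond rethrowing, and it says nothing about rollback: if one worker fails after another has already saved its batch, nothing here undoes that work, which is the atomicity problem I am asking about.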
Thanks