I use Spring Data, Spring Boot, and Hibernate as the JPA provider, and I want to improve bulk insert performance.
I followed this guide to set up batch processing:
http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch15.html
This is my code and my application.properties for the insert batching experiment.
My service:

    @Value("${spring.jpa.properties.hibernate.jdbc.batch_size}")
    private int batchSize;

    @PersistenceContext
    private EntityManager em;

    @Override
    @Transactional(propagation = Propagation.REQUIRED)
    public SampleInfoJson getSampleInfoByCode(String code) {
        // SampleInfo newSampleInfo = new SampleInfo();
        // newSampleInfo.setId(5L);
        // newSampleInfo.setCode("SMP-5");
        // newSampleInfo.setSerialNumber(10L);
        // sampleInfoDao.save(newSampleInfo);

        log.info("starting... inserting...");

        for (int i = 1; i <= 5000; i++) {
            SampleInfo newSampleInfo = new SampleInfo();
            // Long id = (long)i + 4;
            // newSampleInfo.setId(id);
            newSampleInfo.setCode("SMPN-" + i);
            newSampleInfo.setSerialNumber(10L + i);
            // sampleInfoDao.save(newSampleInfo);
            em.persist(newSampleInfo);

            if(i%batchSize == 0){
                log.info("flushing...");
                em.flush();
                em.clear();
            }
        }

The part of application.properties related to batching:

    spring.jpa.properties.hibernate.jdbc.batch_size=100
    spring.jpa.properties.hibernate.cache.use_second_level_cache=false
    spring.jpa.properties.hibernate.order_inserts=true
    spring.jpa.properties.hibernate.order_updates=true

Entity class:

    @Entity
    @Table(name = "sample_info")
    public class SampleInfo implements Serializable{

        private Long id;
        private String code;
        private Long serialNumber;

        @Id
        @GeneratedValue(
            strategy = GenerationType.SEQUENCE,
            generator = "sample_info_seq_gen"
        )
        @SequenceGenerator(
            name = "sample_info_seq_gen",
            sequenceName = "sample_info_seq",
            allocationSize = 1
        )
        @Column(name = "id")
        public Long getId() {
            return id;
        }

        public void setId(Long id) {
            this.id = id;
        }

        @Column(name = "code", nullable = false)
        public String getCode() {
            return code;
        }

        public void setCode(String code) {
            this.code = code;
        }

        @Column(name = "serial_number")
        public Long getSerialNumber() {
            return serialNumber;
        }

        public void setSerialNumber(Long serialNumber) {
            this.serialNumber = serialNumber;
        }
    }

Running the service above, batch inserting 5000 rows took 30 to 35 seconds to complete, but if I comment out these lines:

    if(i%batchSize == 0){
        log.info("flushing...");
        em.flush();
        em.clear();
    }

inserting 5000 rows took only 5 to 7 seconds, faster than batch mode.
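To make the flush cadence concrete: with 5000 rows and a batch size of 100, the loop in my service calls flush() and clear() 50 times, and nothing is left over for the commit. Below is a minimal standalone sketch of that arithmetic only. It does not touch JPA or the database, and the names FlushCadence, countFlushes, and rowsLeftForCommit are hypothetical, made up for this illustration.

```java
// Standalone sketch (no JPA): counts how often the service loop would call flush().
// FlushCadence, countFlushes and rowsLeftForCommit are hypothetical names for illustration.
final class FlushCadence {

    /** Mirrors the service loop: one flush whenever i is a multiple of batchSize. */
    static int countFlushes(int rows, int batchSize) {
        int flushes = 0;
        for (int i = 1; i <= rows; i++) {
            if (i % batchSize == 0) {
                flushes++;
            }
        }
        return flushes;
    }

    /** Rows persisted after the last manual flush; they wait for the flush at commit. */
    static int rowsLeftForCommit(int rows, int batchSize) {
        return rows % batchSize;
    }

    public static void main(String[] args) {
        System.out.println("manual flushes: " + countFlushes(5000, 100));
        System.out.println("rows left for commit: " + rowsLeftForCommit(5000, 100));
    }
}
```

So the slow variant differs from the fast one by 50 explicit flush/clear cycles, while the number of persisted rows is the same in both runs.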
Why is it slower when using batch mode?