4

I have a DAO that uses JPA/Hibernate to fetch data from SQL Server 2008 R2. In my specific usecase, i am doing a simple SELECT query that returns about 100000 records, but takes more than 35 minutes to do so.

I created a basic JDBC connection by manually loading the sql server driver and the same query returned 100k records in 15 seconds. So something is very wrong in my configuration.

Here is my springDataContext.xml:

<?xml version="1.0" encoding="UTF-8"?> <beans xmlns="http://www.springframework.org/schema/beans" xmlns:jpa="http://www.springframework.org/schema/data/jpa" xmlns:context="http://www.springframework.org/schema/context" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:tx="http://www.springframework.org/schema/tx" xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd http://www.springframework.org/schema/data/jpa http://www.springframework.org/schema/data/jpa/spring-jpa-1.3.xsd http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-3.1.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.1.xsd"> <bean id="jpaVendorAdapter" class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter"/> <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean"> <property name="dataSource" ref="myDataSource"/> <property name="jpaVendorAdapter" ref="jpaVendorAdapter"/> <property name="persistenceXmlLocation" value="classpath:persistence.xml"/> <property name="jpaProperties"> <props> <prop key="hibernate.dialect">org.hibernate.dialect.SQLServer2008Dialect</prop> <prop key="hibernate.ejb.naming_strategy">org.hibernate.cfg.ImprovedNamingStrategy</prop> <prop key="hibernate.format_sql">true</prop> <prop key="hibernate.hbm2ddl.auto">none</prop> <prop key="hibernate.use_sql_comments">false</prop> <prop key="hibernate.show_sql">${hibernate.show_sql:false}</prop> <prop key="jadira.usertype.autoRegisterUserTypes">true</prop> <prop key="jadira.usertype.javaZone">America/Chicago</prop> <prop key="jadira.usertype.databaseZone">America/Chicago</prop> <prop key="jadira.usertype.useJdbc42Apis">false</prop> </props> </property> </bean> <bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager"> <property name="entityManagerFactory" ref="entityManagerFactory"/> </bean> <tx:annotation-driven/> <jpa:repositories base-package="com.mycompany.foo"/> <context:component-scan base-package="com.mycompany.foo.impl" /> </beans> 

The bean myDataSource is provided by whatever app consumes the DAO, which is defined like so:

<bean id="myDataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource"> <property name="driverClassName" value="com.microsoft.sqlserver.jdbc.SQLServerDriver"/> <property name="url" value="jdbc:sqlserver://mysqlhost.mycompany.com:1433;database=MYDB"/> <property name="username" value="username"/> <property name="password" value="chucknorris"/> </bean> 

I have a complex query wherein I am setting fetchSize:

package com.mycompany.foo.impl; import com.mycompany.foo.RecurringLoanPaymentAccountsDao; import org.hibernate.Query; import org.hibernate.ScrollMode; import org.hibernate.ScrollableResults; import org.hibernate.Session; import org.joda.time.DateTime; import javax.inject.Named; import javax.persistence.EntityManager; import javax.persistence.PersistenceContext; @Named public class FooDaoImpl implements FooDao { @PersistenceContext private EntityManager entityManager; public ScrollableResults getData(int chunkSize, DateTime tomorrow, DateTime anotherDate) { Session session = entityManager.unwrap(Session.class); Query query = session.createQuery( "from MyAccountTable as account " + // bunch of left joins (I am using @JoinColumns in my models) "where account.some_date >= :tomorrow " + "and account.some_other_date < :anotherDate" // <snip snip> ); query.setParameter("tomorrow", tomorrow) .setParameter("anotherDate", anotherDate) .setFetchSize(chunkSize); return query.scroll(ScrollMode.FORWARD_ONLY); } } 

I also switched to vanilla JPA and did jpaQuery.getResultList(), but that was equally slow.

I can provide other relevant code if required. Any clue on what I'm doing wrong here?

Update 1: Add schema details

Unfortunately, I work for a bank, so I cannot share the exact code. But let me try to represent the relevant bits.

Here is my main table that I am querying:

@Entity @Table(name = "MY_TABLE") public class MyTable { @EmbeddedId private Id id = new Id(); @Column(name = "SOME_COL") private String someColumn; // other columns @OneToMany(fetch = FetchType.LAZY) @JoinColumns({ @JoinColumn(name = "ACCT_NBR", referencedColumnName = "ACCT_NBR", insertable = false, updatable = false), @JoinColumn(name = "CUST_NBR", referencedColumnName = "CUST_NBR", insertable = false, updatable = false) }) private List<ForeignTable> foreignTable; // getter and setter for properties public static class Id implements Serializable { @Column(name = "ACCT_NBR") private short accountNumber; @Column(name = "CUST_NBR") private short customerNumber; } } 

Basically, every table has ACCT_NBR and CUST_NBR columns (including the foreign table) which are unique when clustered together. So my join condition includes both of these.

The model for the foreign table looks exactly the same (with and embedded ID like the main table above), of course with its own set of columns in addition to the ACCT_NBR and CUST_NBR.

Now i only care about 2 other columns in the foreign table apart from the ID columns mentioned above: TYPE_ID and ACCT_BALANCE.

TYPE_ID is what I wish to use in my LEFT JOIN condition, so that the final SQL looks like this:

LEFT JOIN FOREIGN_TABLE FRN ON MAIN.ACCT_NBR = FRN.ACCT_NBR AND MAIN.CUST_NBR = FRN.CUST_NBR AND FRN.TYPE_ID = <some_static_id> 

I want the LEFT JOIN to yield NULL data when there is no match for this specific TYPE_ID.

And in my SELECT claus, I select FOREIGN_TABLE.ACCT_BALANCE, which of course will be null if the left join above has no matching row.

This method is invoked from a spring batch application's Tasklet, and I have a feeling that it gave a null pointer exception once the tasklet finished processing and therefore, the readonly transaction was closed.

8
  • Can you make sure your statement is readonly ? Hibernate is very bad at handling a lots of object in its session. Also, check if your JDBC connection has cursors enabled, by default JDBC copies everything in memory instead of using the database's cursor Commented Dec 30, 2015 at 4:02
  • You should maybe consider JTDS as an alternative driver, too. The one provided by Microsoft has a few violent bugs, and is a lot slower then JTDS. Commented Dec 30, 2015 at 4:08
  • I have a @Transactional(readOnly = true) annotation on the method that calls this DAO method. How can I mark this statement as read-only? Do I need to mark the entity as read only by removing the setters? Commented Dec 30, 2015 at 4:08
  • @Transactional(readOnly = true) is correct. Hibernate shouldn't do dirty checks on your session objects. That will speed-up the query a lot. Commented Dec 30, 2015 at 4:09
  • Well, unfortunately, I already have that in there. And it still is painfully slow. Let me look at JTDS. Being a .net developer, all this is very daunting for me :) Commented Dec 30, 2015 at 4:10

1 Answer 1

6

In the comments you point out that your @Joincolumn are or FetchType.EAGER. This is very aggressive and your problem.

Eager means Hibernate will JOIN FETCH all said columns in your query, parse and load all the records to dispatch the data in new instances of your entities. You may know that, if you join table A and B, the result will be a huge A.*, B.* with A x B records and A repeated many times. This is what's going on with Eager. That can escalate very quickly, especially with many Eager columns.

Switching to Lazy tells Hibernate to not load the data. It just prepares a Proxy object which will call a separate SELECT only when required (if your transaction is still open).

You can always force a FETCH manually in your HQL with JOIN FETCH table. This is the preferable way.

Sign up to request clarification or add additional context in comments.

7 Comments

Now i have a different issue, I get a NullPointerException when I try to access the joined entity. And i cannot use JOIN FETCH either as I have a "with" claus (since I have to join the foreign table with one of the columns = some static value). Any ideas?
It really depends on your schema. I am blind without seeing the actual code. Accessing a Lazy getter, if the transaction is open, should call a SELECT. A NullPointerException is unlikely to be the direct cause. Can you post more code in your question as an addendum?
I've added some details. I apologize that I cannot post the exact code as my company has rules against it. I've tried my best to explain the schema, but please let me know if I missed out something.
Not really to square one, you identified the problem why your query is slow. You just don't have the right solution yet.
I have eliminated the join altogether. Thanks a lot for your help on this issue! Have a great new year ahead :)
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.