Tech Talk - JPA and Query Optimization - publish

No more than a few milliseconds Behind JPA and SQL Query Optimizations Gleydson Lima gleydson@esig.com.br https://www.linkedin.com/in/gleydsonlima

Agenda • What is Fast? • Common problems with JPA/Hibernate • JPA/Hibernate Cache Levels • Indexes • Query Explain • App Example • Real App Examples

What is fast for the user? Everything you do not realize loading and feels instantaneous or close to it.

N + 1 SELECT Problem ● Main cases: ○ ManyToOne Associations ○ OneToOne Associations ○ OneToMany Associations

N + 1 SELECT??? 1? 2? Both? 1 2

Demo application https://github.com/gleydsonlima/jpa-and-query-optimizer

Answer ● findAll: N + 1 select happened in this case. The reason is that JPQL by default does not consider the fetch strategy. ● findById: In this case, fetch EAGER is considered to load the entity and its associations with just ONE select.

N + 1 SELECT for OneToMany Use Join Fetch Only one collection per query.

Performance Tips - Use EAGER in main association relationships; - Use LAZY in ManyToOne less used for exhibition; - Pay attention in your console with show_sql = true. Be careful with too many selects. - JQL query does not use FetchType definition!!!

JPA Projection Hibernate: select aluno0_.nome as col_0_0_, curso1_.id as col_1_0_, curso1_.id as id1_1_, curso1_.nome as nome2_1_ from opt.aluno aluno0_ left outer join opt.curso curso1_ on aluno0_.curso_id=curso1_.id where aluno0_.nome like ? limit ? offset ?

The Explain Command PostgreSQL devises a query plan for each query it receives. Choosing the right plan to match the query structure and the properties of the data is absolutely critical for good performance, so the system includes a complex planner that tries to choose good plans. You can use the EXPLAIN command to see what query plan the planner creates for any query

Explain - The simplest example startup cost - in general, for sorting. complete phase cost number of rows fetched bytes recovered. The size of information

Index Scan Read the index tree and gets the heap blocks pointed by the index.

The index_only_scan Read the index tree and returns the data without acessing the heap page.

Bitmap_index/heap scan read the index sequentially generating a bitmap used to recheck on heap pages.

Very Important!! Hibernate table creation does not create INDEX for Foreign Keys!!!

Example in Demo Application Two schemas nopt: no index Opt: with index

Case 01 "Nested Loop (cost=0.29..3821.09 rows=10 width=65)" "Hash Join (cost=7.21..23.64 rows=10 width=57)"

Case 02 "Sort (cost=736.07..736.08 rows=2 width=520)" "Sort (cost=704.07..704.08 rows=2 width=520)"

Case 02 There is no filter. The couting function, in this case, requires go through all table. The planner choose SEQ SCAN even if you have indexes.

Case 03 "Sort (cost=42790786.05..42790786.06 rows=2 width=520)" "Sort (cost=91154.05..91154.06 rows=2 width=520)" 469x more computation units.

Case 04 "HashAggregate (cost=4588.41..4588.44 rows=3 width=12)" "HashAggregate (cost=4144.45..4144.48 rows=3 width=12)"

Case 05 "Nested Loop (cost=535.00..4339.55 rows=500 width=531)" "Nested Loop (cost=523.17..1817.23 rows=498 width=531)"

Example in Real ESIG Applications (internal use only)

Tips • Always use index for relevant foreign keys • Use native query for complex query. • Explain your query • Execute your query in real environment. The average time must be less than 500ms! • Create multi-column index for relevant filters. • Be careful: index uses disk space!

Future talks (Suggestion) Join Operations Nested Loops Joins two tables by fetching the result from one table and querying the other table for each row from the first. Hash Join / Hash The hash join loads the candidate records from one side of the join into a hash table (marked with Hash in the plan) which is then probed for each record from the other side of the join. Merge Join The (sort) merge join combines two sorted lists like a zipper. Both sides of the join must be presorted

Sorting and Grouping Sort / Sort Key Sorts the set on the columns mentioned in Sort Key. The Sortoperation needs large amounts of memory to materialize the intermediate result (not pipelined). GroupAggregate Aggregates a presorted set according to the group by clause. This operation does not buffer large amounts of data (pipelined). HashAggregate Uses a temporary hash table to group records. The HashAggregateoperation does not require a presorted data set, instead it uses large amounts of memory to materialize the intermediate result (not pipelined). The output is not ordered in any meaningful way.

+55 84 3034-9310 : : esig.com.br

Tech Talk - JPA and Query Optimization - publish

More Related Content

What's hot

Similar to Tech Talk - JPA and Query Optimization - publish

Recently uploaded

Tech Talk - JPA and Query Optimization - publish