No more than a few milliseconds Behind JPA and SQL Query Optimizations Gleydson Lima gleydson@esig.com.br https://www.linkedin.com/in/gleydsonlima
Agenda • What is Fast? • Common problems with JPA/Hibernate • JPA/Hibernate Cache Levels • Indexes • Query Explain • App Example • Real App Examples
What is fast for the user? Everything you do not realize loading and feels instantaneous or close to it.
N + 1 SELECT Problem ● Main cases: ○ ManyToOne Associations ○ OneToOne Associations ○ OneToMany Associations
ManyToOne relationship
Example
N + 1 SELECT??? 1? 2? Both? 1 2
Demo application https://github.com/gleydsonlima/jpa-and-query-optimizer
Answer ● findAll: N + 1 select happened in this case. The reason is that JPQL by default does not consider the fetch strategy. ● findById: In this case, fetch EAGER is considered to load the entity and its associations with just ONE select.
Example
Demo application
N + 1 SELECT for OneToMany Use Join Fetch Only one collection per query.
Performance Tips - Use EAGER in main association relationships; - Use LAZY in ManyToOne less used for exhibition; - Pay attention in your console with show_sql = true. Be careful with too many selects. - JQL query does not use FetchType definition!!!
Demo application
JPA Projection Hibernate: select aluno0_.nome as col_0_0_, curso1_.id as col_1_0_, curso1_.id as id1_1_, curso1_.nome as nome2_1_ from opt.aluno aluno0_ left outer join opt.curso curso1_ on aluno0_.curso_id=curso1_.id where aluno0_.nome like ? limit ? offset ?
Demo application
Query Execution - PostgreSQL
The Explain Command PostgreSQL devises a query plan for each query it receives. Choosing the right plan to match the query structure and the properties of the data is absolutely critical for good performance, so the system includes a complex planner that tries to choose good plans. You can use the EXPLAIN command to see what query plan the planner creates for any query
Explain - The simplest example startup cost - in general, for sorting. complete phase cost number of rows fetched bytes recovered. The size of information
SEQ_SCAN
Index Scan Read the index tree and gets the heap blocks pointed by the index.
The index_only_scan Read the index tree and returns the data without acessing the heap page.
Bitmap_index/heap scan read the index sequentially generating a bitmap used to recheck on heap pages.
Index B-Tree
Very Important!! Hibernate table creation does not create INDEX for Foreign Keys!!!
Example in Demo Application Two schemas nopt: no index Opt: with index
Case 01 "Nested Loop (cost=0.29..3821.09 rows=10 width=65)" "Hash Join (cost=7.21..23.64 rows=10 width=57)"
Case 02 "Sort (cost=736.07..736.08 rows=2 width=520)" "Sort (cost=704.07..704.08 rows=2 width=520)"
Case 02 There is no filter. The couting function, in this case, requires go through all table. The planner choose SEQ SCAN even if you have indexes.
Case 03 "Sort (cost=42790786.05..42790786.06 rows=2 width=520)" "Sort (cost=91154.05..91154.06 rows=2 width=520)" 469x more computation units.
Case 04 "HashAggregate (cost=4588.41..4588.44 rows=3 width=12)" "HashAggregate (cost=4144.45..4144.48 rows=3 width=12)"
Case 05 "Nested Loop (cost=535.00..4339.55 rows=500 width=531)" "Nested Loop (cost=523.17..1817.23 rows=498 width=531)"
Example in Real ESIG Applications (internal use only)
Tips • Always use index for relevant foreign keys • Use native query for complex query. • Explain your query • Execute your query in real environment. The average time must be less than 500ms! • Create multi-column index for relevant filters. • Be careful: index uses disk space!
Future talks (Suggestion) Join Operations Nested Loops Joins two tables by fetching the result from one table and querying the other table for each row from the first. Hash Join / Hash The hash join loads the candidate records from one side of the join into a hash table (marked with Hash in the plan) which is then probed for each record from the other side of the join. Merge Join The (sort) merge join combines two sorted lists like a zipper. Both sides of the join must be presorted
Sorting and Grouping Sort / Sort Key Sorts the set on the columns mentioned in Sort Key. The Sortoperation needs large amounts of memory to materialize the intermediate result (not pipelined). GroupAggregate Aggregates a presorted set according to the group by clause. This operation does not buffer large amounts of data (pipelined). HashAggregate Uses a temporary hash table to group records. The HashAggregateoperation does not require a presorted data set, instead it uses large amounts of memory to materialize the intermediate result (not pipelined). The output is not ordered in any meaningful way.
Thanks!
+55 84 3034-9310 : : esig.com.br

Tech Talk - JPA and Query Optimization - publish

  • 1.
    No more thana few milliseconds Behind JPA and SQL Query Optimizations Gleydson Lima gleydson@esig.com.br https://www.linkedin.com/in/gleydsonlima
  • 2.
    Agenda • What isFast? • Common problems with JPA/Hibernate • JPA/Hibernate Cache Levels • Indexes • Query Explain • App Example • Real App Examples
  • 3.
    What is fastfor the user? Everything you do not realize loading and feels instantaneous or close to it.
  • 4.
    N + 1SELECT Problem ● Main cases: ○ ManyToOne Associations ○ OneToOne Associations ○ OneToMany Associations
  • 5.
  • 6.
  • 7.
    N + 1SELECT??? 1? 2? Both? 1 2
  • 8.
  • 9.
    Answer ● findAll: N+ 1 select happened in this case. The reason is that JPQL by default does not consider the fetch strategy. ● findById: In this case, fetch EAGER is considered to load the entity and its associations with just ONE select.
  • 10.
  • 11.
  • 12.
    N + 1SELECT for OneToMany Use Join Fetch Only one collection per query.
  • 13.
    Performance Tips - UseEAGER in main association relationships; - Use LAZY in ManyToOne less used for exhibition; - Pay attention in your console with show_sql = true. Be careful with too many selects. - JQL query does not use FetchType definition!!!
  • 16.
  • 17.
    JPA Projection Hibernate: selectaluno0_.nome as col_0_0_, curso1_.id as col_1_0_, curso1_.id as id1_1_, curso1_.nome as nome2_1_ from opt.aluno aluno0_ left outer join opt.curso curso1_ on aluno0_.curso_id=curso1_.id where aluno0_.nome like ? limit ? offset ?
  • 18.
  • 19.
  • 20.
    The Explain Command PostgreSQLdevises a query plan for each query it receives. Choosing the right plan to match the query structure and the properties of the data is absolutely critical for good performance, so the system includes a complex planner that tries to choose good plans. You can use the EXPLAIN command to see what query plan the planner creates for any query
  • 21.
    Explain - Thesimplest example startup cost - in general, for sorting. complete phase cost number of rows fetched bytes recovered. The size of information
  • 22.
  • 23.
    Index Scan Read theindex tree and gets the heap blocks pointed by the index.
  • 24.
    The index_only_scan Read the indextree and returns the data without acessing the heap page.
  • 25.
    Bitmap_index/heap scan read the indexsequentially generating a bitmap used to recheck on heap pages.
  • 26.
  • 29.
    Very Important!! Hibernate tablecreation does not create INDEX for Foreign Keys!!!
  • 30.
    Example in DemoApplication Two schemas nopt: no index Opt: with index
  • 31.
    Case 01 "Nested Loop(cost=0.29..3821.09 rows=10 width=65)" "Hash Join (cost=7.21..23.64 rows=10 width=57)"
  • 32.
    Case 02 "Sort (cost=736.07..736.08rows=2 width=520)" "Sort (cost=704.07..704.08 rows=2 width=520)"
  • 33.
    Case 02 There isno filter. The couting function, in this case, requires go through all table. The planner choose SEQ SCAN even if you have indexes.
  • 34.
    Case 03 "Sort (cost=42790786.05..42790786.06 rows=2width=520)" "Sort (cost=91154.05..91154.06 rows=2 width=520)" 469x more computation units.
  • 35.
    Case 04 "HashAggregate (cost=4588.41..4588.44 rows=3width=12)" "HashAggregate (cost=4144.45..4144.48 rows=3 width=12)"
  • 36.
    Case 05 "Nested Loop(cost=535.00..4339.55 rows=500 width=531)" "Nested Loop (cost=523.17..1817.23 rows=498 width=531)"
  • 37.
    Example in RealESIG Applications (internal use only)
  • 38.
    Tips • Always useindex for relevant foreign keys • Use native query for complex query. • Explain your query • Execute your query in real environment. The average time must be less than 500ms! • Create multi-column index for relevant filters. • Be careful: index uses disk space!
  • 39.
    Future talks (Suggestion) JoinOperations Nested Loops Joins two tables by fetching the result from one table and querying the other table for each row from the first. Hash Join / Hash The hash join loads the candidate records from one side of the join into a hash table (marked with Hash in the plan) which is then probed for each record from the other side of the join. Merge Join The (sort) merge join combines two sorted lists like a zipper. Both sides of the join must be presorted
  • 40.
    Sorting and Grouping Sort/ Sort Key Sorts the set on the columns mentioned in Sort Key. The Sortoperation needs large amounts of memory to materialize the intermediate result (not pipelined). GroupAggregate Aggregates a presorted set according to the group by clause. This operation does not buffer large amounts of data (pipelined). HashAggregate Uses a temporary hash table to group records. The HashAggregateoperation does not require a presorted data set, instead it uses large amounts of memory to materialize the intermediate result (not pipelined). The output is not ordered in any meaningful way.
  • 41.
  • 42.
    +55 84 3034-9310: : esig.com.br