Uncovering SQL Server Query Problems with Execution Plans Tony Davis tony.davis@red-gate.com @tonytheeditor #SQLintheCityUK
About Me Books SQL Server transaction log SQL Server Source Control Speaker #SQLintheCityUK
SELECT ProductID , ProductNumber , dbo.LineItemTotal(ProductID) AS SumTotal FROM Production.Product p GO SELECT bth.ProductID , bth.TransactionDate , bth.Quantity , bth.ActualCost FROM dbo.bigTransactionHistory bth WHERE bth.TransactionDate >= '20100701' AND bth.TransactionDate <= CURRENT_TIMESTAMP ORDER BY bth.TransactionDate DESC; GO Why Bother Learning Execution Plans? WITH Totals AS ( SELECT DATEADD(m, u.theMonth, 0) AS TheMonth , SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) AS PeopleJoined , SUM(CASE WHEN u.theCol = 'DateLeft' THEN u.Registrations ELSE 0 END) AS PeopleLeft FROM ( SELECT DATEDIFF(MONTH, 0, DateJoined) AS DateJoined , DATEDIFF(MONTH, 0, DateLeft) AS DateLeft , COUNT(*) AS Registrations FROM dbo.Registrations2 GROUP BY DATEDIFF(MONTH, 0, DateJoined) , DATEDIFF(MONTH, 0, DateLeft) ) AS d UNPIVOT ( theMonth FOR theCol IN ( d.DateJoined, d.DateLeft ) ) AS u GROUP BY u.theMonth HAVING SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) > 0 ) SELECT TheMonth , PeopleJoined , PeopleLeft , SUM(PeopleJoined - PeopleLeft) OVER ( ORDER BY TheMonth ROWS UNBOUNDED PRECEDING ) AS CurrentSubscribers FROM Totals;
From Query to Execution plan SELECT bth.ProductID , bth.TransactionDate , bth.Quantity , bth.ActualCost FROM dbo.bigTransactionHistory bth WHERE bth.TransactionDate >= '20100701' AND bth.TransactionDate <= CURRENT_TIMESTAMP ORDER BY bth.TransactionDate DESC; GO #SQLintheCityUK
Plan 1 Plan 2 Plan 3 Plan 4 Plan 57 Plan n? Algebrizer (Query Binding) Syntax Checking (Query Parsing) Query Optimization Plan Query Execution Engine Plan Cache Metadata Table/index structures SELECT bth.ProductID , bth.TransactionDate , bth.Quantity , bth.ActualCost FROM dbo.bigTransactionHistory bth WHERE bth.TransactionDate >= '20100701' AND bth.TransactionDate <= CURRENT_TIMESTAMP ORDER BY bth.TransactionDate DESC; GO Index/Column Statistics Volume/distribution of data #SQLintheCityUK
Getting the Execution Plan .NET Code Profiler - e.g. ANTS Performance Profiler Get plans for previously executed queries from the plan cache - e.g. SQL Trace or Extended Events - Using sys.dm_exec_cached_plans Plans in cache contain no runtime information  #SQLintheCityUK
#SQLintheCityUK
Getting the Execution Plan .NET Code Profiler - e.g. ANTS Performance Profiler Get plans for previously executed queries from the plan cache - e.g. SQL Trace or Extended Events - Using sys.dm_exec_cached_plans Plans in cache contain no runtime information  SET During testing, request the plan for a query • “estimated” plan – no runtime • “actual” plan - with runtime There is only 1 plan!  #SQLintheCityUK
#SQLintheCityUK
Execution Plans for Developers • Don’t examine plan for every query • Gather stats, focus on critical, frequent, resource- intensive queries • Sometimes the code logic is just wrong – rip it up and start again! • Sometimes the basic logic is fine – But some subtler problem causes poor execution performance – This is where execution plans can help! #SQLintheCityUK
• With SQL Server: • We submit SQL describing the data set we want • The Optimizer decides how to execute it • Some developers bring imperative approach line- by-line control to SQL Server… Problem: Row-by-row strategy #SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK
Solution: set-based approach – Take the list to the supermarket ;) – Define single statement (if possible) to return required data set • Make as few ‘passes’ through the base tables as possible • Aggregate early • Reduce the working set to as few rows as possible #SQLintheCityUK
WITH Totals AS ( SELECT DATEADD(m, u.theMonth, 0) AS TheMonth , SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) AS PeopleJoined , SUM(CASE WHEN u.theCol = 'DateLeft' THEN u.Registrations ELSE 0 END) AS PeopleLeft FROM ( SELECT DATEDIFF(MONTH, 0, DateJoined) AS DateJoined , DATEDIFF(MONTH, 0, DateLeft) AS DateLeft , COUNT(*) AS Registrations FROM dbo.Registrations2 GROUP BY DATEDIFF(MONTH, 0, DateJoined) , DATEDIFF(MONTH, 0, DateLeft) ) AS d UNPIVOT ( theMonth FOR theCol IN ( d.DateJoined, d.DateLeft ) ) AS u GROUP BY u.theMonth HAVING SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) > 0 ) SELECT TheMonth , PeopleJoined , PeopleLeft , SUM(PeopleJoined - PeopleLeft) OVER ( ORDER BY TheMonth ROWS UNBOUNDED PRECEDING ) AS CurrentSubscribers FROM Totals; Classic Running Total Problem A Registrations table containing a list of subscribers, with the dates that the subscribers joined and left #SQLintheCityUK
WITH Totals AS ( SELECT DATEADD(m, u.theMonth, 0) AS TheMonth , SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) AS PeopleJoined , SUM(CASE WHEN u.theCol = 'DateLeft' THEN u.Registrations ELSE 0 END) AS PeopleLeft FROM ( SELECT DATEDIFF(MONTH, 0, DateJoined) AS DateJoined , DATEDIFF(MONTH, 0, DateLeft) AS DateLeft , COUNT(*) AS Registrations FROM dbo.Registrations2 GROUP BY DATEDIFF(MONTH, 0, DateJoined) , DATEDIFF(MONTH, 0, DateLeft) ) AS d UNPIVOT ( theMonth FOR theCol IN ( d.DateJoined, d.DateLeft ) ) AS u GROUP BY u.theMonth HAVING SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) > 0 ) SELECT TheMonth , PeopleJoined , PeopleLeft , SUM(PeopleJoined - PeopleLeft) OVER ( ORDER BY TheMonth ROWS UNBOUNDED PRECEDING ) AS CurrentSubscribers FROM Totals; A single ‘pass’ of the base table Early Aggregation 10000001690 Execution Plan #SQLintheCityUK
Problem: Excessive logical reads • “Read 1000 pages to return 100 rows” – 1 page = 1 logical read – Goal: return data with few logical reads as possible – Sometimes we force SQL Server to read more pages than necessary (excessive IO) Inefficient data access paths (indexes) Inefficient SQL forcing SQL Server to do excessive work  #SQLintheCityUK
Name (Index Key) A - Z + LocationID Name (Index Key) A - E + LocationID Name (Index Key) F - I + LocationID Name (Index Key) J - M + LocationID Name (Index Key) N - R + LocationID Name (Index Key) S - V + LocationID Name (Index Key) W - Z + LocationID Root Node Intermediate Level Leaf Level Name (Index Key) A - M + LocationID Name (Index Key) N - Z + LocationID WHERE l.Name ='Paint';Non-clustered Index #SQLintheCityUK
Name (Index Key) A - Z + LocationID Name (Index Key) A - E + LocationID Name (Index Key) F - I + LocationID Name (Index Key) J - M + LocationID Name (Index Key) N - R + LocationID Name (Index Key) S - V + LocationID Name (Index Key) W - Z + LocationID Root Node Intermediate Level Leaf Level Name (Index Key) A - M + LocationID Name (Index Key) N - Z + LocationID WHERE l.Name ='Paint';Non-clustered Index #SQLintheCityUK
LocationID = 6; (ProductID, LocationID) (Clustering Key) 1 to 12000 (ProductID, LocationID) 1-6000 (ProductID, LocationID) 6001-12000 Data rows for ProductID 1-2000 Data rows for ProductID 2001-4000 Data rows for ProductID 4001-6000 Data rows for ProductID 6001-8000 Data rows for ProductID 80001-10000 Data rows for ProductID 10001-12000 Root Node Intermediate Level Leaf Level(502,6) (4325,6) (10324,6)(7865,6) Clustered Index #SQLintheCityUK
LocationID = 6; (ProductID, LocationID) (Clustering Key) 1 to 12000 (ProductID, LocationID) 1-6000 (ProductID, LocationID) 6001-12000 Data rows for ProductID 1-2000 Data rows for ProductID 2001-4000 Data rows for ProductID 4001-6000 Data rows for ProductID 6001-8000 Data rows for ProductID 80001-10000 Data rows for ProductID 10001-12000 Root Node Intermediate Level Leaf Level Clustered Index #SQLintheCityUK
LocationID = 6; (ProductID, LocationID) (Clustering Key) 1 to 12000 (ProductID, LocationID) 1-6000 (ProductID, LocationID) 6001-12000 Data rows for ProductID 1-2000 Data rows for ProductID 2001-4000 Data rows for ProductID 4001-6000 Data rows for ProductID 6001-8000 Data rows for ProductID 80001-10000 Data rows for ProductID 10001-12000 Root Node Intermediate Level Leaf Level #SQLintheCityUK
#SQLintheCityUK
ProductID (Index Key) A - Z + SalesOrderID ProductID (Index Key) 1 - 150 + SalesOrderID ProductID (Index Key) 151 - 300 + SalesOrderID ProductID (Index Key) 301 - 450 + SalesOrderID ProductID (Index Key) 451 - 600 + SalesOrderID ProductID (Index Key) 601 - 750 + SalesOrderID ProductID (Index Key) 751 - 900 + SalesOrderID Root Node Intermediate Level Leaf Level ProductID (Index Key) A - M + SalesOrderID ProductID (Index Key) N - Z + SalesOrderID Lookup CarrierTrackingNumber in clustered index for each row Key Lookup SELECT ProductID , SalesOrderID , CarrierTrackingNumber FROM Sales.SalesOrderDetail WHERE ProductID = 709; #SQLintheCityUK
#SQLintheCityUK
#SQLintheCityUK • For small tables, scanning an index is very efficient operation • BUT…for critical and frequently-executed queries: – Provide covering non-clustered indexes that the queries can seek – Minimize logical reads (IO) required to gather the data Solution: optimize indexes for workloadSolution: optimize indexes for workload
#SQLintheCityUK –Worst Strategy: No indexes. Query performance will be terrible –Second worst: Index on every column; same column participating in numerous indexes Data modification performance will be awful –Don’t do “SELECT *” Makes effective indexing VERY hard! Solution: optimize indexes for workloadSolution: optimize indexes for workload
#SQLintheCityUK • Don’t index query by query • Goal: small set of indexes to help most important and frequent queries • Analyze the workload as a whole – Profiler and Database Engine Tuning Advisor • Good start point… • Not necessarily good end point – Refine recommendations using Missing Index (and other) DMVs • What is the write : read ratio (>1?) for the index? • How many plans currently associated with index? • How often are those plans used? • How many reads might the missing index have helped? Don’t create if this number is small… – See Performance Tuning with SQL Server Dynamic Management Views (free eBook) • http://bit.ly/1jVG3IW Solution: optimize indexes for workload
“The explicit data type conversion has rendered the predicate non-SARGable!” Problem: Non-SARGable Predicates #SQLintheCityUK
Beware of SARG-ability • Non-SARGable means that the optimizer can’t use the expression in a seek operation • Cause: use of function directly on column in WHERE or JOIN – Various incarnations – All lead to excessive IO (scans when seeks should be viable) #SQLintheCityUK
SARG-able predicate …WHERE FirstName = 'bob';  SARGable FirstName LastName Aaron Bertrand Aaron Darrenovsky Amber Smith Apple Paltrow Bob Duffy Bob Carolgees Chris Christofferson …etc... #SQLintheCityUK
Non SARGable predicate …WHERE REVERSE(FirstName) = 'bob'; FirstName LastName Aaron Bertrand Aaron Darrenovsky Amber Smith Apple Paltrow Bob Duffy Bob Carolgees Chris Christofferson …etc... REVERSE ('Aaron') = 'bob' ? #SQLintheCityUK
Non SARG-able predicate …WHERE REVERSE(FirstName) = 'bob'; FirstName LastName Aaron Bertrand Aaron Darrenovsky Amber Smith Apple Paltrow Bob Duffy Bob Carolgees Chris Christofferson …etc... REVERSE ('Aaron') = 'bob' ? #SQLintheCityUK
Non SARGable predicate …WHERE REVERSE(FirstName) = 'bob'; FirstName LastName Aaron Bertrand Aaron Darrenovsky Amber Smith Apple Paltrow Bob Duffy Bob Carolgees Chris Christofferson …etc... REVERSE ('Amber') = 'bob' ?  Non-SARGable #SQLintheCityUK
#SQLintheCityUK
Other common explicit conversions -- Using functions like LTRIM, RTRIM etc. -- often happens where people don't trust consistency of data inputs SELECT bp.ProductID , bp.Name , bp.ProductNumber FROM dbo.bigProduct bp WHERE LTRIM(bp.ProductNumber) LIKE 'AR%'; GO SELECT bp.ProductID , bp.Name , bp.ProductNumber FROM dbo.bigProduct bp WHERE bp.ProductNumber LIKE N'%1000'; GO #SQLintheCityUK
Other causes of Non-SARG • Implicit Data type conversions – Variable data type doesn’t match column type – SQL Server uses CONVERT_IMPLICIT – Problem if column is lower precedence type • Data type precedence order: http://bit.ly/1ENOwjZ • Misuse of User Defined Functions (UDFs) #SQLintheCityUK
• Rewrite the query to avoid direct use of function on column • Use helper functions • Avoid data type mismatches • Be careful when using UDFs! – Convert scalar UDFs to inline TVFs – http://bit.ly/1oN9ysK Solutions to Non-SARGability #SQLintheCityUK
• One query size fits all • Leads to wildly inaccurate estimations and inappropriate execution plans Problem: Overly Generic SQL #SQLintheCityUK
#SQLintheCityUK
• Write SQL and stored procedures for a specific, defined purpose • Various ‘workarounds’ though none without drawbacks – Recompile on each execution – Dynamic SQL – See Gail Shaw’s “How to Confuse the Query Optimizer” (http://bit.ly/1Mb8Rnp) Solutions to Generic SQL #SQLintheCityUK
Other issues that cause poor estimations • Stale statistics • Problems with parameter sniffing #SQLintheCityUK
Very wild estimation! #SQLintheCityUK
Conclusions • Plans tell us exactly how SQL Server executed your query – Focus on critical and frequently-executed queries – Use plans to: • spot common mistakes in the code • Uncover problems with inefficient indexing – Work with rather than against SQL Server #SQLintheCityUK
Thank you to… • Grant Fritchey - @GFritchey • Hugo Kornelis - @Hugo_Kornelis • Gail Shaw - @SQLintheWild • Rodney Landrum - @SQLBeat • Phil Factor - @Phil_Factor #SQLintheCityUK

Uncovering SQL Server query problems with execution plans - Tony Davis

  • 1.
    Uncovering SQL ServerQuery Problems with Execution Plans Tony Davis tony.davis@red-gate.com @tonytheeditor #SQLintheCityUK
  • 2.
    About Me Books SQL Servertransaction log SQL Server Source Control Speaker #SQLintheCityUK
  • 3.
    SELECT ProductID , ProductNumber, dbo.LineItemTotal(ProductID) AS SumTotal FROM Production.Product p GO SELECT bth.ProductID , bth.TransactionDate , bth.Quantity , bth.ActualCost FROM dbo.bigTransactionHistory bth WHERE bth.TransactionDate >= '20100701' AND bth.TransactionDate <= CURRENT_TIMESTAMP ORDER BY bth.TransactionDate DESC; GO Why Bother Learning Execution Plans? WITH Totals AS ( SELECT DATEADD(m, u.theMonth, 0) AS TheMonth , SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) AS PeopleJoined , SUM(CASE WHEN u.theCol = 'DateLeft' THEN u.Registrations ELSE 0 END) AS PeopleLeft FROM ( SELECT DATEDIFF(MONTH, 0, DateJoined) AS DateJoined , DATEDIFF(MONTH, 0, DateLeft) AS DateLeft , COUNT(*) AS Registrations FROM dbo.Registrations2 GROUP BY DATEDIFF(MONTH, 0, DateJoined) , DATEDIFF(MONTH, 0, DateLeft) ) AS d UNPIVOT ( theMonth FOR theCol IN ( d.DateJoined, d.DateLeft ) ) AS u GROUP BY u.theMonth HAVING SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) > 0 ) SELECT TheMonth , PeopleJoined , PeopleLeft , SUM(PeopleJoined - PeopleLeft) OVER ( ORDER BY TheMonth ROWS UNBOUNDED PRECEDING ) AS CurrentSubscribers FROM Totals;
  • 4.
    From Query toExecution plan SELECT bth.ProductID , bth.TransactionDate , bth.Quantity , bth.ActualCost FROM dbo.bigTransactionHistory bth WHERE bth.TransactionDate >= '20100701' AND bth.TransactionDate <= CURRENT_TIMESTAMP ORDER BY bth.TransactionDate DESC; GO #SQLintheCityUK
  • 5.
    Plan 1 Plan 2 Plan 3 Plan 4 Plan 57 Plan n? Algebrizer (Query Binding) Syntax Checking (QueryParsing) Query Optimization Plan Query Execution Engine Plan Cache Metadata Table/index structures SELECT bth.ProductID , bth.TransactionDate , bth.Quantity , bth.ActualCost FROM dbo.bigTransactionHistory bth WHERE bth.TransactionDate >= '20100701' AND bth.TransactionDate <= CURRENT_TIMESTAMP ORDER BY bth.TransactionDate DESC; GO Index/Column Statistics Volume/distribution of data #SQLintheCityUK
  • 6.
    Getting the ExecutionPlan .NET Code Profiler - e.g. ANTS Performance Profiler Get plans for previously executed queries from the plan cache - e.g. SQL Trace or Extended Events - Using sys.dm_exec_cached_plans Plans in cache contain no runtime information  #SQLintheCityUK
  • 7.
  • 8.
    Getting the ExecutionPlan .NET Code Profiler - e.g. ANTS Performance Profiler Get plans for previously executed queries from the plan cache - e.g. SQL Trace or Extended Events - Using sys.dm_exec_cached_plans Plans in cache contain no runtime information  SET During testing, request the plan for a query • “estimated” plan – no runtime • “actual” plan - with runtime There is only 1 plan!  #SQLintheCityUK
  • 9.
  • 10.
    Execution Plans forDevelopers • Don’t examine plan for every query • Gather stats, focus on critical, frequent, resource- intensive queries • Sometimes the code logic is just wrong – rip it up and start again! • Sometimes the basic logic is fine – But some subtler problem causes poor execution performance – This is where execution plans can help! #SQLintheCityUK
  • 11.
    • With SQLServer: • We submit SQL describing the data set we want • The Optimizer decides how to execute it • Some developers bring imperative approach line- by-line control to SQL Server… Problem: Row-by-row strategy #SQLintheCityUK
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
    Solution: set-based approach –Take the list to the supermarket ;) – Define single statement (if possible) to return required data set • Make as few ‘passes’ through the base tables as possible • Aggregate early • Reduce the working set to as few rows as possible #SQLintheCityUK
  • 33.
    WITH Totals AS (SELECT DATEADD(m, u.theMonth, 0) AS TheMonth , SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) AS PeopleJoined , SUM(CASE WHEN u.theCol = 'DateLeft' THEN u.Registrations ELSE 0 END) AS PeopleLeft FROM ( SELECT DATEDIFF(MONTH, 0, DateJoined) AS DateJoined , DATEDIFF(MONTH, 0, DateLeft) AS DateLeft , COUNT(*) AS Registrations FROM dbo.Registrations2 GROUP BY DATEDIFF(MONTH, 0, DateJoined) , DATEDIFF(MONTH, 0, DateLeft) ) AS d UNPIVOT ( theMonth FOR theCol IN ( d.DateJoined, d.DateLeft ) ) AS u GROUP BY u.theMonth HAVING SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) > 0 ) SELECT TheMonth , PeopleJoined , PeopleLeft , SUM(PeopleJoined - PeopleLeft) OVER ( ORDER BY TheMonth ROWS UNBOUNDED PRECEDING ) AS CurrentSubscribers FROM Totals; Classic Running Total Problem A Registrations table containing a list of subscribers, with the dates that the subscribers joined and left #SQLintheCityUK
  • 34.
    WITH Totals AS (SELECT DATEADD(m, u.theMonth, 0) AS TheMonth , SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) AS PeopleJoined , SUM(CASE WHEN u.theCol = 'DateLeft' THEN u.Registrations ELSE 0 END) AS PeopleLeft FROM ( SELECT DATEDIFF(MONTH, 0, DateJoined) AS DateJoined , DATEDIFF(MONTH, 0, DateLeft) AS DateLeft , COUNT(*) AS Registrations FROM dbo.Registrations2 GROUP BY DATEDIFF(MONTH, 0, DateJoined) , DATEDIFF(MONTH, 0, DateLeft) ) AS d UNPIVOT ( theMonth FOR theCol IN ( d.DateJoined, d.DateLeft ) ) AS u GROUP BY u.theMonth HAVING SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) > 0 ) SELECT TheMonth , PeopleJoined , PeopleLeft , SUM(PeopleJoined - PeopleLeft) OVER ( ORDER BY TheMonth ROWS UNBOUNDED PRECEDING ) AS CurrentSubscribers FROM Totals; A single ‘pass’ of the base table Early Aggregation 10000001690 Execution Plan #SQLintheCityUK
  • 35.
    Problem: Excessive logicalreads • “Read 1000 pages to return 100 rows” – 1 page = 1 logical read – Goal: return data with few logical reads as possible – Sometimes we force SQL Server to read more pages than necessary (excessive IO) Inefficient data access paths (indexes) Inefficient SQL forcing SQL Server to do excessive work  #SQLintheCityUK
  • 36.
    Name (Index Key) A -Z + LocationID Name (Index Key) A - E + LocationID Name (Index Key) F - I + LocationID Name (Index Key) J - M + LocationID Name (Index Key) N - R + LocationID Name (Index Key) S - V + LocationID Name (Index Key) W - Z + LocationID Root Node Intermediate Level Leaf Level Name (Index Key) A - M + LocationID Name (Index Key) N - Z + LocationID WHERE l.Name ='Paint';Non-clustered Index #SQLintheCityUK
  • 37.
    Name (Index Key) A -Z + LocationID Name (Index Key) A - E + LocationID Name (Index Key) F - I + LocationID Name (Index Key) J - M + LocationID Name (Index Key) N - R + LocationID Name (Index Key) S - V + LocationID Name (Index Key) W - Z + LocationID Root Node Intermediate Level Leaf Level Name (Index Key) A - M + LocationID Name (Index Key) N - Z + LocationID WHERE l.Name ='Paint';Non-clustered Index #SQLintheCityUK
  • 38.
    LocationID = 6; (ProductID, LocationID) (ClusteringKey) 1 to 12000 (ProductID, LocationID) 1-6000 (ProductID, LocationID) 6001-12000 Data rows for ProductID 1-2000 Data rows for ProductID 2001-4000 Data rows for ProductID 4001-6000 Data rows for ProductID 6001-8000 Data rows for ProductID 80001-10000 Data rows for ProductID 10001-12000 Root Node Intermediate Level Leaf Level(502,6) (4325,6) (10324,6)(7865,6) Clustered Index #SQLintheCityUK
  • 39.
    LocationID = 6; (ProductID, LocationID) (ClusteringKey) 1 to 12000 (ProductID, LocationID) 1-6000 (ProductID, LocationID) 6001-12000 Data rows for ProductID 1-2000 Data rows for ProductID 2001-4000 Data rows for ProductID 4001-6000 Data rows for ProductID 6001-8000 Data rows for ProductID 80001-10000 Data rows for ProductID 10001-12000 Root Node Intermediate Level Leaf Level Clustered Index #SQLintheCityUK
  • 40.
    LocationID = 6; (ProductID, LocationID) (ClusteringKey) 1 to 12000 (ProductID, LocationID) 1-6000 (ProductID, LocationID) 6001-12000 Data rows for ProductID 1-2000 Data rows for ProductID 2001-4000 Data rows for ProductID 4001-6000 Data rows for ProductID 6001-8000 Data rows for ProductID 80001-10000 Data rows for ProductID 10001-12000 Root Node Intermediate Level Leaf Level #SQLintheCityUK
  • 41.
  • 42.
    ProductID (Index Key) A -Z + SalesOrderID ProductID (Index Key) 1 - 150 + SalesOrderID ProductID (Index Key) 151 - 300 + SalesOrderID ProductID (Index Key) 301 - 450 + SalesOrderID ProductID (Index Key) 451 - 600 + SalesOrderID ProductID (Index Key) 601 - 750 + SalesOrderID ProductID (Index Key) 751 - 900 + SalesOrderID Root Node Intermediate Level Leaf Level ProductID (Index Key) A - M + SalesOrderID ProductID (Index Key) N - Z + SalesOrderID Lookup CarrierTrackingNumber in clustered index for each row Key Lookup SELECT ProductID , SalesOrderID , CarrierTrackingNumber FROM Sales.SalesOrderDetail WHERE ProductID = 709; #SQLintheCityUK
  • 43.
  • 44.
    #SQLintheCityUK • For smalltables, scanning an index is very efficient operation • BUT…for critical and frequently-executed queries: – Provide covering non-clustered indexes that the queries can seek – Minimize logical reads (IO) required to gather the data Solution: optimize indexes for workloadSolution: optimize indexes for workload
  • 45.
    #SQLintheCityUK –Worst Strategy: Noindexes. Query performance will be terrible –Second worst: Index on every column; same column participating in numerous indexes Data modification performance will be awful –Don’t do “SELECT *” Makes effective indexing VERY hard! Solution: optimize indexes for workloadSolution: optimize indexes for workload
  • 46.
    #SQLintheCityUK • Don’t indexquery by query • Goal: small set of indexes to help most important and frequent queries • Analyze the workload as a whole – Profiler and Database Engine Tuning Advisor • Good start point… • Not necessarily good end point – Refine recommendations using Missing Index (and other) DMVs • What is the write : read ratio (>1?) for the index? • How many plans currently associated with index? • How often are those plans used? • How many reads might the missing index have helped? Don’t create if this number is small… – See Performance Tuning with SQL Server Dynamic Management Views (free eBook) • http://bit.ly/1jVG3IW Solution: optimize indexes for workload
  • 47.
    “The explicit datatype conversion has rendered the predicate non-SARGable!” Problem: Non-SARGable Predicates #SQLintheCityUK
  • 48.
    Beware of SARG-ability •Non-SARGable means that the optimizer can’t use the expression in a seek operation • Cause: use of function directly on column in WHERE or JOIN – Various incarnations – All lead to excessive IO (scans when seeks should be viable) #SQLintheCityUK
  • 49.
    SARG-able predicate …WHERE FirstName= 'bob';  SARGable FirstName LastName Aaron Bertrand Aaron Darrenovsky Amber Smith Apple Paltrow Bob Duffy Bob Carolgees Chris Christofferson …etc... #SQLintheCityUK
  • 50.
    Non SARGable predicate …WHEREREVERSE(FirstName) = 'bob'; FirstName LastName Aaron Bertrand Aaron Darrenovsky Amber Smith Apple Paltrow Bob Duffy Bob Carolgees Chris Christofferson …etc... REVERSE ('Aaron') = 'bob' ? #SQLintheCityUK
  • 51.
    Non SARG-able predicate …WHEREREVERSE(FirstName) = 'bob'; FirstName LastName Aaron Bertrand Aaron Darrenovsky Amber Smith Apple Paltrow Bob Duffy Bob Carolgees Chris Christofferson …etc... REVERSE ('Aaron') = 'bob' ? #SQLintheCityUK
  • 52.
    Non SARGable predicate …WHEREREVERSE(FirstName) = 'bob'; FirstName LastName Aaron Bertrand Aaron Darrenovsky Amber Smith Apple Paltrow Bob Duffy Bob Carolgees Chris Christofferson …etc... REVERSE ('Amber') = 'bob' ?  Non-SARGable #SQLintheCityUK
  • 53.
  • 54.
    Other common explicitconversions -- Using functions like LTRIM, RTRIM etc. -- often happens where people don't trust consistency of data inputs SELECT bp.ProductID , bp.Name , bp.ProductNumber FROM dbo.bigProduct bp WHERE LTRIM(bp.ProductNumber) LIKE 'AR%'; GO SELECT bp.ProductID , bp.Name , bp.ProductNumber FROM dbo.bigProduct bp WHERE bp.ProductNumber LIKE N'%1000'; GO #SQLintheCityUK
  • 55.
    Other causes ofNon-SARG • Implicit Data type conversions – Variable data type doesn’t match column type – SQL Server uses CONVERT_IMPLICIT – Problem if column is lower precedence type • Data type precedence order: http://bit.ly/1ENOwjZ • Misuse of User Defined Functions (UDFs) #SQLintheCityUK
  • 56.
    • Rewrite thequery to avoid direct use of function on column • Use helper functions • Avoid data type mismatches • Be careful when using UDFs! – Convert scalar UDFs to inline TVFs – http://bit.ly/1oN9ysK Solutions to Non-SARGability #SQLintheCityUK
  • 57.
    • One querysize fits all • Leads to wildly inaccurate estimations and inappropriate execution plans Problem: Overly Generic SQL #SQLintheCityUK
  • 58.
  • 59.
    • Write SQLand stored procedures for a specific, defined purpose • Various ‘workarounds’ though none without drawbacks – Recompile on each execution – Dynamic SQL – See Gail Shaw’s “How to Confuse the Query Optimizer” (http://bit.ly/1Mb8Rnp) Solutions to Generic SQL #SQLintheCityUK
  • 60.
    Other issues thatcause poor estimations • Stale statistics • Problems with parameter sniffing #SQLintheCityUK
  • 61.
  • 62.
    Conclusions • Plans tellus exactly how SQL Server executed your query – Focus on critical and frequently-executed queries – Use plans to: • spot common mistakes in the code • Uncover problems with inefficient indexing – Work with rather than against SQL Server #SQLintheCityUK
  • 63.
    Thank you to… •Grant Fritchey - @GFritchey • Hugo Kornelis - @Hugo_Kornelis • Gail Shaw - @SQLintheWild • Rodney Landrum - @SQLBeat • Phil Factor - @Phil_Factor #SQLintheCityUK