Uncovering SQL Server query problems with execution plans - Tony Davis

Uncovering SQL Server Query Problems with Execution Plans Tony Davis tony.davis@red-gate.com @tonytheeditor #SQLintheCityUK

About Me Books SQL Server transaction log SQL Server Source Control Speaker #SQLintheCityUK

SELECT ProductID , ProductNumber , dbo.LineItemTotal(ProductID) AS SumTotal FROM Production.Product p GO SELECT bth.ProductID , bth.TransactionDate , bth.Quantity , bth.ActualCost FROM dbo.bigTransactionHistory bth WHERE bth.TransactionDate >= '20100701' AND bth.TransactionDate <= CURRENT_TIMESTAMP ORDER BY bth.TransactionDate DESC; GO Why Bother Learning Execution Plans? WITH Totals AS ( SELECT DATEADD(m, u.theMonth, 0) AS TheMonth , SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) AS PeopleJoined , SUM(CASE WHEN u.theCol = 'DateLeft' THEN u.Registrations ELSE 0 END) AS PeopleLeft FROM ( SELECT DATEDIFF(MONTH, 0, DateJoined) AS DateJoined , DATEDIFF(MONTH, 0, DateLeft) AS DateLeft , COUNT(*) AS Registrations FROM dbo.Registrations2 GROUP BY DATEDIFF(MONTH, 0, DateJoined) , DATEDIFF(MONTH, 0, DateLeft) ) AS d UNPIVOT ( theMonth FOR theCol IN ( d.DateJoined, d.DateLeft ) ) AS u GROUP BY u.theMonth HAVING SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) > 0 ) SELECT TheMonth , PeopleJoined , PeopleLeft , SUM(PeopleJoined - PeopleLeft) OVER ( ORDER BY TheMonth ROWS UNBOUNDED PRECEDING ) AS CurrentSubscribers FROM Totals;

From Query to Execution plan SELECT bth.ProductID , bth.TransactionDate , bth.Quantity , bth.ActualCost FROM dbo.bigTransactionHistory bth WHERE bth.TransactionDate >= '20100701' AND bth.TransactionDate <= CURRENT_TIMESTAMP ORDER BY bth.TransactionDate DESC; GO #SQLintheCityUK

Plan 1 Plan 2 Plan 3 Plan 4 Plan 57 Plan n? Algebrizer (Query Binding) Syntax Checking (Query Parsing) Query Optimization Plan Query Execution Engine Plan Cache Metadata Table/index structures SELECT bth.ProductID , bth.TransactionDate , bth.Quantity , bth.ActualCost FROM dbo.bigTransactionHistory bth WHERE bth.TransactionDate >= '20100701' AND bth.TransactionDate <= CURRENT_TIMESTAMP ORDER BY bth.TransactionDate DESC; GO Index/Column Statistics Volume/distribution of data #SQLintheCityUK

Getting the Execution Plan .NET Code Profiler - e.g. ANTS Performance Profiler Get plans for previously executed queries from the plan cache - e.g. SQL Trace or Extended Events - Using sys.dm_exec_cached_plans Plans in cache contain no runtime information  #SQLintheCityUK

Getting the Execution Plan .NET Code Profiler - e.g. ANTS Performance Profiler Get plans for previously executed queries from the plan cache - e.g. SQL Trace or Extended Events - Using sys.dm_exec_cached_plans Plans in cache contain no runtime information  SET During testing, request the plan for a query • “estimated” plan – no runtime • “actual” plan - with runtime There is only 1 plan!  #SQLintheCityUK

Execution Plans for Developers • Don’t examine plan for every query • Gather stats, focus on critical, frequent, resource- intensive queries • Sometimes the code logic is just wrong – rip it up and start again! • Sometimes the basic logic is fine – But some subtler problem causes poor execution performance – This is where execution plans can help! #SQLintheCityUK

• With SQL Server: • We submit SQL describing the data set we want • The Optimizer decides how to execute it • Some developers bring imperative approach line- by-line control to SQL Server… Problem: Row-by-row strategy #SQLintheCityUK

Solution: set-based approach – Take the list to the supermarket ;) – Define single statement (if possible) to return required data set • Make as few ‘passes’ through the base tables as possible • Aggregate early • Reduce the working set to as few rows as possible #SQLintheCityUK

WITH Totals AS ( SELECT DATEADD(m, u.theMonth, 0) AS TheMonth , SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) AS PeopleJoined , SUM(CASE WHEN u.theCol = 'DateLeft' THEN u.Registrations ELSE 0 END) AS PeopleLeft FROM ( SELECT DATEDIFF(MONTH, 0, DateJoined) AS DateJoined , DATEDIFF(MONTH, 0, DateLeft) AS DateLeft , COUNT(*) AS Registrations FROM dbo.Registrations2 GROUP BY DATEDIFF(MONTH, 0, DateJoined) , DATEDIFF(MONTH, 0, DateLeft) ) AS d UNPIVOT ( theMonth FOR theCol IN ( d.DateJoined, d.DateLeft ) ) AS u GROUP BY u.theMonth HAVING SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) > 0 ) SELECT TheMonth , PeopleJoined , PeopleLeft , SUM(PeopleJoined - PeopleLeft) OVER ( ORDER BY TheMonth ROWS UNBOUNDED PRECEDING ) AS CurrentSubscribers FROM Totals; Classic Running Total Problem A Registrations table containing a list of subscribers, with the dates that the subscribers joined and left #SQLintheCityUK

WITH Totals AS ( SELECT DATEADD(m, u.theMonth, 0) AS TheMonth , SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) AS PeopleJoined , SUM(CASE WHEN u.theCol = 'DateLeft' THEN u.Registrations ELSE 0 END) AS PeopleLeft FROM ( SELECT DATEDIFF(MONTH, 0, DateJoined) AS DateJoined , DATEDIFF(MONTH, 0, DateLeft) AS DateLeft , COUNT(*) AS Registrations FROM dbo.Registrations2 GROUP BY DATEDIFF(MONTH, 0, DateJoined) , DATEDIFF(MONTH, 0, DateLeft) ) AS d UNPIVOT ( theMonth FOR theCol IN ( d.DateJoined, d.DateLeft ) ) AS u GROUP BY u.theMonth HAVING SUM(CASE WHEN u.theCol = 'DateJoined' THEN u.Registrations ELSE 0 END) > 0 ) SELECT TheMonth , PeopleJoined , PeopleLeft , SUM(PeopleJoined - PeopleLeft) OVER ( ORDER BY TheMonth ROWS UNBOUNDED PRECEDING ) AS CurrentSubscribers FROM Totals; A single ‘pass’ of the base table Early Aggregation 10000001690 Execution Plan #SQLintheCityUK

Problem: Excessive logical reads • “Read 1000 pages to return 100 rows” – 1 page = 1 logical read – Goal: return data with few logical reads as possible – Sometimes we force SQL Server to read more pages than necessary (excessive IO) Inefficient data access paths (indexes) Inefficient SQL forcing SQL Server to do excessive work  #SQLintheCityUK

Name (Index Key) A - Z + LocationID Name (Index Key) A - E + LocationID Name (Index Key) F - I + LocationID Name (Index Key) J - M + LocationID Name (Index Key) N - R + LocationID Name (Index Key) S - V + LocationID Name (Index Key) W - Z + LocationID Root Node Intermediate Level Leaf Level Name (Index Key) A - M + LocationID Name (Index Key) N - Z + LocationID WHERE l.Name ='Paint';Non-clustered Index #SQLintheCityUK

LocationID = 6; (ProductID, LocationID) (Clustering Key) 1 to 12000 (ProductID, LocationID) 1-6000 (ProductID, LocationID) 6001-12000 Data rows for ProductID 1-2000 Data rows for ProductID 2001-4000 Data rows for ProductID 4001-6000 Data rows for ProductID 6001-8000 Data rows for ProductID 80001-10000 Data rows for ProductID 10001-12000 Root Node Intermediate Level Leaf Level(502,6) (4325,6) (10324,6)(7865,6) Clustered Index #SQLintheCityUK

LocationID = 6; (ProductID, LocationID) (Clustering Key) 1 to 12000 (ProductID, LocationID) 1-6000 (ProductID, LocationID) 6001-12000 Data rows for ProductID 1-2000 Data rows for ProductID 2001-4000 Data rows for ProductID 4001-6000 Data rows for ProductID 6001-8000 Data rows for ProductID 80001-10000 Data rows for ProductID 10001-12000 Root Node Intermediate Level Leaf Level Clustered Index #SQLintheCityUK

LocationID = 6; (ProductID, LocationID) (Clustering Key) 1 to 12000 (ProductID, LocationID) 1-6000 (ProductID, LocationID) 6001-12000 Data rows for ProductID 1-2000 Data rows for ProductID 2001-4000 Data rows for ProductID 4001-6000 Data rows for ProductID 6001-8000 Data rows for ProductID 80001-10000 Data rows for ProductID 10001-12000 Root Node Intermediate Level Leaf Level #SQLintheCityUK

ProductID (Index Key) A - Z + SalesOrderID ProductID (Index Key) 1 - 150 + SalesOrderID ProductID (Index Key) 151 - 300 + SalesOrderID ProductID (Index Key) 301 - 450 + SalesOrderID ProductID (Index Key) 451 - 600 + SalesOrderID ProductID (Index Key) 601 - 750 + SalesOrderID ProductID (Index Key) 751 - 900 + SalesOrderID Root Node Intermediate Level Leaf Level ProductID (Index Key) A - M + SalesOrderID ProductID (Index Key) N - Z + SalesOrderID Lookup CarrierTrackingNumber in clustered index for each row Key Lookup SELECT ProductID , SalesOrderID , CarrierTrackingNumber FROM Sales.SalesOrderDetail WHERE ProductID = 709; #SQLintheCityUK

#SQLintheCityUK • For small tables, scanning an index is very efficient operation • BUT…for critical and frequently-executed queries: – Provide covering non-clustered indexes that the queries can seek – Minimize logical reads (IO) required to gather the data Solution: optimize indexes for workloadSolution: optimize indexes for workload

#SQLintheCityUK –Worst Strategy: No indexes. Query performance will be terrible –Second worst: Index on every column; same column participating in numerous indexes Data modification performance will be awful –Don’t do “SELECT *” Makes effective indexing VERY hard! Solution: optimize indexes for workloadSolution: optimize indexes for workload

#SQLintheCityUK • Don’t index query by query • Goal: small set of indexes to help most important and frequent queries • Analyze the workload as a whole – Profiler and Database Engine Tuning Advisor • Good start point… • Not necessarily good end point – Refine recommendations using Missing Index (and other) DMVs • What is the write : read ratio (>1?) for the index? • How many plans currently associated with index? • How often are those plans used? • How many reads might the missing index have helped? Don’t create if this number is small… – See Performance Tuning with SQL Server Dynamic Management Views (free eBook) • http://bit.ly/1jVG3IW Solution: optimize indexes for workload

“The explicit data type conversion has rendered the predicate non-SARGable!” Problem: Non-SARGable Predicates #SQLintheCityUK

Beware of SARG-ability • Non-SARGable means that the optimizer can’t use the expression in a seek operation • Cause: use of function directly on column in WHERE or JOIN – Various incarnations – All lead to excessive IO (scans when seeks should be viable) #SQLintheCityUK

SARG-able predicate …WHERE FirstName = 'bob';  SARGable FirstName LastName Aaron Bertrand Aaron Darrenovsky Amber Smith Apple Paltrow Bob Duffy Bob Carolgees Chris Christofferson …etc... #SQLintheCityUK

Non SARGable predicate …WHERE REVERSE(FirstName) = 'bob'; FirstName LastName Aaron Bertrand Aaron Darrenovsky Amber Smith Apple Paltrow Bob Duffy Bob Carolgees Chris Christofferson …etc... REVERSE ('Aaron') = 'bob' ? #SQLintheCityUK

Non SARG-able predicate …WHERE REVERSE(FirstName) = 'bob'; FirstName LastName Aaron Bertrand Aaron Darrenovsky Amber Smith Apple Paltrow Bob Duffy Bob Carolgees Chris Christofferson …etc... REVERSE ('Aaron') = 'bob' ? #SQLintheCityUK

Non SARGable predicate …WHERE REVERSE(FirstName) = 'bob'; FirstName LastName Aaron Bertrand Aaron Darrenovsky Amber Smith Apple Paltrow Bob Duffy Bob Carolgees Chris Christofferson …etc... REVERSE ('Amber') = 'bob' ?  Non-SARGable #SQLintheCityUK

Other common explicit conversions -- Using functions like LTRIM, RTRIM etc. -- often happens where people don't trust consistency of data inputs SELECT bp.ProductID , bp.Name , bp.ProductNumber FROM dbo.bigProduct bp WHERE LTRIM(bp.ProductNumber) LIKE 'AR%'; GO SELECT bp.ProductID , bp.Name , bp.ProductNumber FROM dbo.bigProduct bp WHERE bp.ProductNumber LIKE N'%1000'; GO #SQLintheCityUK

Other causes of Non-SARG • Implicit Data type conversions – Variable data type doesn’t match column type – SQL Server uses CONVERT_IMPLICIT – Problem if column is lower precedence type • Data type precedence order: http://bit.ly/1ENOwjZ • Misuse of User Defined Functions (UDFs) #SQLintheCityUK

• Rewrite the query to avoid direct use of function on column • Use helper functions • Avoid data type mismatches • Be careful when using UDFs! – Convert scalar UDFs to inline TVFs – http://bit.ly/1oN9ysK Solutions to Non-SARGability #SQLintheCityUK

• One query size fits all • Leads to wildly inaccurate estimations and inappropriate execution plans Problem: Overly Generic SQL #SQLintheCityUK

• Write SQL and stored procedures for a specific, defined purpose • Various ‘workarounds’ though none without drawbacks – Recompile on each execution – Dynamic SQL – See Gail Shaw’s “How to Confuse the Query Optimizer” (http://bit.ly/1Mb8Rnp) Solutions to Generic SQL #SQLintheCityUK

Other issues that cause poor estimations • Stale statistics • Problems with parameter sniffing #SQLintheCityUK

Very wild estimation! #SQLintheCityUK

Conclusions • Plans tell us exactly how SQL Server executed your query – Focus on critical and frequently-executed queries – Use plans to: • spot common mistakes in the code • Uncover problems with inefficient indexing – Work with rather than against SQL Server #SQLintheCityUK

Thank you to… • Grant Fritchey - @GFritchey • Hugo Kornelis - @Hugo_Kornelis • Gail Shaw - @SQLintheWild • Rodney Landrum - @SQLBeat • Phil Factor - @Phil_Factor #SQLintheCityUK

Uncovering SQL Server query problems with execution plans - Tony Davis

More Related Content

What's hot

Similar to Uncovering SQL Server query problems with execution plans - Tony Davis

More from Red Gate Software

Recently uploaded

Uncovering SQL Server query problems with execution plans - Tony Davis