9

I am currently facing an issue with parameterized queries in SQL Server whose root cause I do not understand.

I broke it down to a simple example:

Let's assume a table that holds data about some child entity, including a parent_id, with a corresponding index on parent_id. The data is accessed by parent_id, but through a view that, in addition to the table data, exposes a column calculating a row_number over all entries partitioned by parent_id.

Reproducible setup

Create the table, index and view as follows:

    CREATE TABLE dbo.test (
        id BIGINT IDENTITY(1,1),
        text NVARCHAR(255),
        parent_id BIGINT
    );
    GO

    CREATE NONCLUSTERED INDEX idx_test_parent_id ON dbo.test (parent_id);
    GO

    CREATE VIEW dbo.test_view AS
    SELECT *, ROW_NUMBER() OVER (PARTITION BY parent_id ORDER BY id) AS row_num
    FROM dbo.test;
    GO

Now get some data into the table:

    DECLARE @i BIGINT = 0;
    WHILE @i < 200000
    BEGIN
        SET @i = @i + 1;
        INSERT INTO dbo.test (text, parent_id)
        VALUES ('test 1', @i), ('test 2', @i), ('test 3', @i);
    END

The issue

When the data is accessed through the view with a parameterized query, SQL Server does a full scan of the table.

    DECLARE @parent_id BIGINT = 123456;
    SELECT * FROM dbo.test_view WHERE parent_id = @parent_id;

query plan full table scan

When the data is accessed with a literal value instead (without a parameter), we get the expected index seek.

SELECT * FROM dbo.test_view WHERE parent_id = 123456 

query plan index seek

What I have tried

After searching different forums, I still do not really understand what is happening here. I have found similar issues where the parameter had the wrong data type and performance was bad as a result, but that is not the case here. I also read about issues with parameter sniffing, but I do not think that is the problem either, as I do not access the data through stored procedures or functions.

Also, when I access the data directly from the table with a parameterized query, the issue does not occur. An index seek is done even with the parameter.

The same is true when I add OPTION (RECOMPILE) to the parameterized query against the view: SQL Server ends up doing an index seek.
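For reference, a sketch of the two variants just described, using the same table, view, and value as in the setup above. The GO between them is only there so that the variable can be declared again in a new batch.

```sql
-- Variant 1: parameterized query directly against the table
-- (reported above to produce an index seek)
DECLARE @parent_id BIGINT = 123456;
SELECT * FROM dbo.test WHERE parent_id = @parent_id;
GO

-- Variant 2: parameterized query against the view, recompiled at execution
-- so the optimizer sees the actual value (also reported to produce a seek)
DECLARE @parent_id BIGINT = 123456;
SELECT * FROM dbo.test_view WHERE parent_id = @parent_id
OPTION (RECOMPILE);
```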

Question

Can someone explain what the issue is here? Why is this a problem for the view but not for the table itself? Do I really need to get rid of the view and calculate this row_number differently, for example during inserts/deletes?

Setup

  • SQL Server 2022 v16.0.4165 running in a docker container
  • Docker image: mcr.microsoft.com/mssql/server:2022-latest

The real table has a primary key, of course. It also has many more columns than just the text column. Including all of these columns in the index would be a possibility. However, the issue does not occur when selecting from the table itself, so it does not seem to be an index problem to me.

I was not aware that I am running the database in a compatibility mode. In the production environment I am even getting CardinalityEstimationModelVersion="140". I do not think that I set it up anywhere on purpose.

Execution plans

1
  • A question: which tool are these screenshots from? Commented Mar 4 at 19:33

3 Answers

13

Your execution plan shows that you are running at CardinalityEstimationModelVersion 150 (SQL Server 2019 equivalent).

And you say your production plan is using 140 (SQL Server 2017).

I can also reproduce this on SQL Server 2022 by setting database compatibility level of 140 or 150.

It looks like you are hitting the SelOnSeqPrj issue described in this Stack Overflow answer and The Problem with Window Functions and Views, both by Paul White.

The issue goes away at compatibility level (CL) 140 and 150 after running ALTER DATABASE SCOPED CONFIGURATION SET QUERY_OPTIMIZER_HOTFIXES = ON, which makes sense as it was fixed in CU30 for SQL Server 2017 and CU17 for SQL Server 2019. It does not repro at CL 160, irrespective of that configuration option.
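A minimal sketch of checking the current level and applying either option, assuming it is run in the context of the affected database with sufficient permissions. Options A and B are alternatives, and either should be tested before being applied in production.

```sql
-- Check the compatibility level of the current database
SELECT name, compatibility_level
FROM sys.databases
WHERE name = DB_NAME();

-- Option A: stay at CL 140/150 but enable optimizer hotfixes for this database
ALTER DATABASE SCOPED CONFIGURATION SET QUERY_OPTIMIZER_HOTFIXES = ON;

-- Option B: move the database to the SQL Server 2022 level
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 160;
```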


If you are on version 2022 and are only using an older CL unintentionally then you should consider changing this to get the latest and greatest functionality and the fix for this issue as default - without needing to enable QUERY_OPTIMIZER_HOTFIXES.

Changing the compatibility level does need testing, though, as there are sometimes breaking behavioural changes between levels, and different cardinality estimation models can affect execution plans either positively or negatively. You can use Query Store to help mitigate the risk for this second issue.
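As a rough sketch of the Query Store approach: make sure it is collecting data before the change, then compare plans afterwards and force a previous plan if a query regresses. The query_id and plan_id values below are placeholders, not real ids; look them up in the Query Store views first.

```sql
-- Make sure Query Store is enabled and collecting before changing the level
ALTER DATABASE CURRENT SET QUERY_STORE = ON (OPERATION_MODE = READ_WRITE);

-- After the change: list captured plans for queries touching the view
SELECT q.query_id, p.plan_id, p.compatibility_level, qt.query_sql_text
FROM sys.query_store_query AS q
JOIN sys.query_store_query_text AS qt ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p ON p.query_id = q.query_id
WHERE qt.query_sql_text LIKE N'%test_view%';

-- If a query regressed, force the earlier plan
-- (42 and 7 are placeholder ids for illustration only)
EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 7;
```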

2
  • 1
    140 is SQL Server 2017. This is probably because the CL of the database is still at the 2017 level. Typically this happens when the DB was originally created on that product version and the CL was never changed following a version upgrade (or after restoring/attaching to a higher version). See learn.microsoft.com/en-us/sql/t-sql/statements/… for more about this, and this article on managing the risk of changing it: learn.microsoft.com/en-us/sql/relational-databases/performance/… Commented Feb 6 at 9:19
  • 2
    Thanks for the explanation and the links. I will have a look into pros and cons of upgrading. Commented Feb 6 at 12:06
3

You're using a local variable. A local variable, unlike a hard coded value, or a parameter in a parameterized query (prepared statement or stored procedure), doesn't use the value provided to look at the statistics when compiling the plan. Instead, it uses an average of the values in the statistics to come up with the plan. EXCEPT, when there's a recompile situation. Then, the value in the local variable can be used to look at the specifics of the statistics, not averages, to arrive at a different row count.

That pretty much describes all the behaviors you're seeing. Test it by creating a stored procedure out of your query and then passing in the same value.
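One way that test could look, as a sketch. The procedure name dbo.get_test_by_parent is made up for illustration; the sp_executesql call at the end is an alternative way to send a true parameterized statement without creating a procedure.

```sql
-- Hypothetical procedure wrapping the query from the question
CREATE OR ALTER PROCEDURE dbo.get_test_by_parent
    @parent_id BIGINT
AS
BEGIN
    SELECT * FROM dbo.test_view WHERE parent_id = @parent_id;
END;
GO

-- Here the value is a real parameter, so it can be sniffed at compile time
EXEC dbo.get_test_by_parent @parent_id = 123456;
GO

-- Alternative: a parameterized statement without a procedure
EXEC sys.sp_executesql
    N'SELECT * FROM dbo.test_view WHERE parent_id = @parent_id',
    N'@parent_id BIGINT',
    @parent_id = 123456;
```

Compare the actual execution plans of these calls with the plan of the local-variable version.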

Now, that said, the converse of this is that sampling a specific value, also known as parameter sniffing, can get you a more accurate row count and therefore a better execution plan. That holds until you find that some parameters return more (or fewer) rows than the one that was used to create the plan. Then performance can stink, because you either need a plan based on an average of the rows, or you need specific plans for each possible value. This is where you'll find yourself dealing with the query hints OPTIMIZE FOR or OPTIMIZE FOR UNKNOWN, or adding RECOMPILE hints to get specific plans every time. SQL Server 2022 and up even has what is called parameter sensitive plan optimization to help deal with this scenario.
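For illustration, here is roughly how those three hints attach to the query from the question. Which of them (if any) is appropriate depends on the data distribution, so treat this as a sketch of the syntax rather than a recommendation.

```sql
DECLARE @parent_id BIGINT = 123456;

-- Compile the plan as if the variable held a specific, representative value
SELECT * FROM dbo.test_view WHERE parent_id = @parent_id
OPTION (OPTIMIZE FOR (@parent_id = 123456));

-- Compile the plan from average density instead of any specific value
SELECT * FROM dbo.test_view WHERE parent_id = @parent_id
OPTION (OPTIMIZE FOR UNKNOWN);

-- Compile a fresh plan on every execution using the current value
SELECT * FROM dbo.test_view WHERE parent_id = @parent_id
OPTION (RECOMPILE);
```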

In short, this all gets difficult.

0
1

While you have identified the parameter sniffing issue, the real solution probably isn't to add hints to your query. Parameter sniffing problems happen when you give the server a choice between a bad plan and an even worse plan, and it doesn't choose the right one. Instead you need to give it such a good choice that it always chooses the correct plan.

So you need to improve the indexing. First, you are missing a Primary Key, which is a big no-no in a properly normalized database.

ALTER TABLE dbo.test ADD PRIMARY KEY (id); 

Then, to fix your actual issue, you need to "cover" the query columns with the index. So you need INCLUDE columns.

CREATE NONCLUSTERED INDEX idx_test_parent_id ON dbo.test (parent_id, id) INCLUDE (text) WITH (DROP_EXISTING = ON); 

Now your query plan is nice and neat, even with the local variable.

query plan

db<>fiddle

1
  • 2
    I tried it directly in my small example, and adding the INCLUDE did not change anything for me. The query still does a full table scan. I also see that in your fiddle, even before the index change, SQL Server already did an index seek plus an additional RID lookup, which would already be good enough in my case. But I end up with a full table scan instead. Commented Feb 5 at 16:19
