I was able to reproduce a query performance issue that I would describe as unexpected. I'm looking for an answer that's focused on internals.
On my machine, the following query does a clustered index scan and takes about 6.8 seconds of CPU time:

    SELECT ID1, ID2
    FROM two_col_key_test WITH (FORCESCAN)
    WHERE ID1 NOT IN
    (
        N'1', N'2', N'3', N'4', N'5',
        N'6', N'7', N'8', N'9', N'10',
        N'11', N'12', N'13', N'14', N'15',
        N'16', N'17', N'18', N'19', N'20'
    )
    AND (ID1 = N'FILLER TEXT' AND ID2 >= N'' OR (ID1 > N'FILLER TEXT'))
    ORDER BY ID1, ID2 OFFSET 12000000 ROWS FETCH FIRST 1 ROW ONLY
    OPTION (MAXDOP 1);

The following query does a clustered index seek (the only difference is the removal of the FORCESCAN hint) but takes about 18.2 seconds of CPU time:

    SELECT ID1, ID2
    FROM two_col_key_test
    WHERE ID1 NOT IN
    (
        N'1', N'2', N'3', N'4', N'5',
        N'6', N'7', N'8', N'9', N'10',
        N'11', N'12', N'13', N'14', N'15',
        N'16', N'17', N'18', N'19', N'20'
    )
    AND (ID1 = N'FILLER TEXT' AND ID2 >= N'' OR (ID1 > N'FILLER TEXT'))
    ORDER BY ID1, ID2 OFFSET 12000000 ROWS FETCH FIRST 1 ROW ONLY
    OPTION (MAXDOP 1);

The query plans are pretty similar. For both queries, 12000001 rows are read from the clustered index (the 12000000 rows skipped by OFFSET plus the one row returned).
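For anyone who wants to compare the CPU times on their own machine, one possible way is `SET STATISTICS TIME ON`, which reports CPU and elapsed time per statement in the messages output. This is only a sketch of one way to measure. It assumes the table has already been created with the script further down, and it shows only the FORCESCAN variant; removing the hint gives the seek variant.

```sql
-- Report CPU time and elapsed time for each statement in the messages output.
SET STATISTICS TIME ON;

-- Scan variant; remove WITH (FORCESCAN) to measure the seek variant instead.
SELECT ID1, ID2
FROM dbo.two_col_key_test WITH (FORCESCAN)
WHERE ID1 NOT IN
(
    N'1', N'2', N'3', N'4', N'5',
    N'6', N'7', N'8', N'9', N'10',
    N'11', N'12', N'13', N'14', N'15',
    N'16', N'17', N'18', N'19', N'20'
)
AND (ID1 = N'FILLER TEXT' AND ID2 >= N'' OR (ID1 > N'FILLER TEXT'))
ORDER BY ID1, ID2 OFFSET 12000000 ROWS FETCH FIRST 1 ROW ONLY
OPTION (MAXDOP 1);

SET STATISTICS TIME OFF;
```

Absolute numbers will vary by hardware, so the ratio between the two variants is probably the more useful thing to compare.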
I am on SQL Server 2017 CU 10. Here is code to create and populate the two_col_key_test table:

    DROP TABLE IF EXISTS dbo.two_col_key_test;

    CREATE TABLE dbo.two_col_key_test (
        ID1 NVARCHAR(50) NOT NULL,
        ID2 NVARCHAR(50) NOT NULL,
        FILLER NVARCHAR(50),
        PRIMARY KEY (ID1, ID2)
    );

    DROP TABLE IF EXISTS #t;

    SELECT TOP (4000) 0 ID INTO #t
    FROM master..spt_values t1
    CROSS JOIN master..spt_values t2
    OPTION (MAXDOP 1);

    INSERT INTO dbo.two_col_key_test WITH (TABLOCK)
    SELECT N'FILLER TEXT' + CASE WHEN ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) > 8000000 THEN N' 2' ELSE N'' END
    , ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    , NULL
    FROM #t t1
    CROSS JOIN #t t2;

I am hoping for an answer that does more than call stack reporting. For example, I can see that sqlmin!TCValSSInRowExprFilter<231,0,0>::GetDataX takes significantly more CPU cycles in the slow query compared to the fast one.
Instead of stopping there, I'd like to understand what that is and why there's such a large difference between the two queries.
Why is there a large difference in CPU time for these two queries?





