21

I was researching something else when I came across this thing. I was generating test tables with some data in it and running different queries to find out how different ways to write queries affects execution plan. Here is the script that I used to generate random test data:

IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID('t') AND type in (N'U')) DROP TABLE t GO CREATE TABLE t ( c1 int IDENTITY(1,1) NOT NULL ,c2 int NULL ) GO insert into t select top 1000000 a from (select t1.number*2048 + t2.number a, newid() b from [master]..spt_values t1 cross join [master]..spt_values t2 where t1.[type] = 'P' and t2.[type] = 'P') a order by b GO update t set c2 = null where c2 < 2048 * 2048 / 10 GO CREATE CLUSTERED INDEX pk ON [t] (c1) GO CREATE NONCLUSTERED INDEX i ON t (c2) GO 

Now, given this data, I invoked the following query:

select * from t where c2 < 1048576 or c2 is null ; 

To my great surprise, the execution plan that was generated for this query, was this. (Sorry for the external link, it's too large to fit here).

Can someone explain to me what's up with all these "Constant Scans" and "Compute Scalars"? What's happening?

Plan

 |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1010], [Expr1011], [Expr1012])) |--Merge Interval | |--Sort(TOP 2, ORDER BY:([Expr1013] DESC, [Expr1014] ASC, [Expr1010] ASC, [Expr1015] DESC)) | |--Compute Scalar(DEFINE:([Expr1013]=((4)&[Expr1012]) = (4) AND NULL = [Expr1010], [Expr1014]=(4)&[Expr1012], [Expr1015]=(16)&[Expr1012])) | |--Concatenation | |--Compute Scalar(DEFINE:([Expr1005]=NULL, [Expr1006]=NULL, [Expr1004]=(60))) | | |--Constant Scan | |--Compute Scalar(DEFINE:([Expr1008]=NULL, [Expr1009]=(1048576), [Expr1007]=(10))) | |--Constant Scan |--Index Seek(OBJECT:([t].[i]), SEEK:([t].[c2] > [Expr1010] AND [t].[c2] < [Expr1011]) ORDERED FORWARD) 
0

2 Answers 2

31

The constant scans each produce a single in-memory row with no columns. The top compute scalar outputs a single row with 3 columns

Expr1005 Expr1006 Expr1004 ----------- ----------- ----------- NULL NULL 60 

The bottom compute scalar outputs a single row with 3 columns

Expr1008 Expr1009 Expr1007 ----------- ----------- ----------- NULL 1048576 10 

The concatenation operator Unions these 2 rows together and outputs the 3 columns but they are now renamed

Expr1010 Expr1011 Expr1012 ----------- ----------- ----------- NULL NULL 60 NULL 1048576 10 

The Expr1012 column is a set of flags used internally to define certain seek properties for the Storage Engine.

The next compute scalar along outputs 2 rows

Expr1010 Expr1011 Expr1012 Expr1013 Expr1014 Expr1015 ----------- ----------- ----------- ----------- ----------- ----------- NULL NULL 60 True 4 16 NULL 1048576 10 False 0 0 

The last three columns are defined as follows and are just used for sorting purposes prior to presenting to the Merge Interval Operator

[Expr1013] = Scalar Operator(((4)&[Expr1012]) = (4) AND NULL = [Expr1010]), [Expr1014] = Scalar Operator((4)&[Expr1012]), [Expr1015] = Scalar Operator((16)&[Expr1012]) 

Expr1014 and Expr1015 just test whether certain bits are on in the flag. Expr1013 appears to return a boolean column true if both the bit for 4 is on and Expr1010 is NULL.

From trying other comparison operators in the query I get these results

+----------+----------+----------+-------------+----+----+---+---+---+---+ | Operator | Expr1010 | Expr1011 | Flags (Dec) | Flags (Bin) | | | | | | 32 | 16 | 8 | 4 | 2 | 1 | +----------+----------+----------+-------------+----+----+---+---+---+---+ | > | 1048576 | NULL | 6 | 0 | 0 | 0 | 1 | 1 | 0 | | >= | 1048576 | NULL | 22 | 0 | 1 | 0 | 1 | 1 | 0 | | <= | NULL | 1048576 | 42 | 1 | 0 | 1 | 0 | 1 | 0 | | < | NULL | 1048576 | 10 | 0 | 0 | 1 | 0 | 1 | 0 | | = | 1048576 | 1048576 | 62 | 1 | 1 | 1 | 1 | 1 | 0 | | IS NULL | NULL | NULL | 60 | 1 | 1 | 1 | 1 | 0 | 0 | +----------+----------+----------+-------------+----+----+---+---+---+---+ 

From which I infer that Bit 4 means "Has start of range" (as opposed to being unbounded) and Bit 16 means the start of the range is inclusive.

This 6 column result set is emitted from the SORT operator sorted by Expr1013 DESC, Expr1014 ASC, Expr1010 ASC, Expr1015 DESC. Assuming True is represented by 1 and False by 0 the previously represented resultset is already in that order.

Based on my previous assumptions the net effect of this sort is to present the ranges to the merge interval in the following order

 ORDER BY HasStartOfRangeAndItIsNullFirst, HasUnboundedStartOfRangeFirst, StartOfRange, StartOfRangeIsInclusiveFirst 

The merge interval operator outputs 2 rows

Expr1010 Expr1011 Expr1012 ----------- ----------- ----------- NULL NULL 60 NULL 1048576 10 

For each row emitted a range seek is performed

Seek Keys[1]: Start:[dbo].[t].c2 > Scalar Operator([Expr1010]), End: [dbo].[t].c2 < Scalar Operator([Expr1011]) 

So it would appear as though two seeks are performed. One apparently > NULL AND < NULL and one > NULL AND < 1048576. However the flags that get passed in appear to modify this to IS NULL and < 1048576 respectively. Hopefully @sqlkiwi can clarify this and correct any inaccuracies!

If you change the query slightly to

select * from t where c2 > 1048576 or c2 = 0 ; 

Then the plan looks much simpler with an index seek with multiple seek predicates.

The plan shows Seek Keys

Start: c2 >= 0, End: c2 <= 0, Start: c2 > 1048576 

The explanation for why this simpler plan cannot be used for the case in the OP is given by SQLKiwi in the comments to the earlier linked blog post.

An index seek with multiple predicates cannot mix different types of comparison predicate (ie. Is and Eq in the case in the OP). This is just a current limitation of the product (and is presumably the reason why the equality test in the last query c2 = 0 is implemented using >= and <= rather than just the straightforward equality seek you get for the query c2 = 0 OR c2 = 1048576.

4
  • I can't spot anything in Paul's article that explains the difference in the flags for [Expr1012]. Can you deduce what the 60/10 signifies here? Commented Mar 12, 2012 at 12:39
  • @MarkStorey-Smith - he says 62 is for an equality comparison. I guess 60 must mean that instead of > AND < as shown in the plan you in fact get >= AND <= unless it is an explicit IS NULL flag maybe(?) or maybe the bit 2 indicates something else unrelated and 60 is still equality as when I do set ansi_nulls off and change it to c2 = null it still stays at 60 Commented Mar 12, 2012 at 12:50
  • 2
    @MartinSmith 60 is indeed for a comparison with NULL. The range boundary expressions use NULL to represent 'unbounded' at either end. The seek is always exclusive i.e. seek Start: > Expr & End: < Expr rather than inclusive using >= and <=. Thanks for the blog comment, I'll post an answer or a longer comment in reply in the morning (too late to do it justice right now). Commented Mar 12, 2012 at 14:59
  • @SQLKiwi - Thanks. That makes sense. Hopefully I will have figured out some of the missing bits before then. Commented Mar 12, 2012 at 15:40
13

The constant scans are a way for SQL Server to create a bucket into which it's going to place something later in the execution plan. I've posted a more thorough explanation of it here. To understand what the constant scan is for, you have to look further into the plan. In this case, it's the Compute Scalar operators that are being used to populate the space created by the constant scan.

The Compute Scalar operators are being loaded up with NULL and the value 1045876, so they're clearly going to be used with the Loop Join in an effort to filter the data.

The really cool part is that this plan is Trivial. It means that it went through a minimal optimization process. All the operations are leading up to the Merge Interval. This is used to create a minimal set of comparison operators for an index seek (details on that here).

The whole idea is to get rid of overlapping values so that it can then pull the data out with minimal passes. Although it's still using a loop operation, you'll note that the loop executes exactly once, meaning, it's effectively a scan.

ADDENDUM: That last sentence is off. There were two seeks. I misread the plan. The rest of the concepts are the same and the goal, minimal passes, is the same.

0

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.