Understanding how primary key columns are included in a non-clustered index

Question

Assume I have a table called 'demo' with 4 columns; 'a', 'b', 'c' and 'd'. The primary key clustered index for the 'demo' table contains columns 'a' and 'b' in that order.

The 'Actual Execution Plan' from a query referencing table 'demo' has suggested that a new non-unique non-clustered index is required for column 'b' and should include column 'a'.

If I create a non-unique non-clustered index on column 'b' do I need to include column 'a' or will it already be part of the non-clustered index because it is in the primary key?

If primary key column 'a' is already part of the non-clustered index, is column 'a' stored as an include column or is it part of the non-clustered key?

The column will be included -- not because it's in the primary key, but because it's in the clustered index key. The index supporting the primary key constraint just happens to also be the clustered index in your case. (Which, admittedly, is the most common case, but it's worth making explicit.) For the rest, see here.
– Jeroen Mostert, Commented Sep 12, 2017 at 13:23

Community · Accepted Answer · 2020-06-20 09:12:55Z

The 'Actual Execution Plan' from a query referencing table 'demo' has suggested that a new non-unique non-clustered index is required for column 'b' and should include column 'a'.

...

If primary key column 'a' is already part of the non-clustered index, is column 'a' stored as an include column or is it part of the non-clustered key?

In your case, column a will be present at all levels of the non-clustered index as part of the clustered index key. The index suggested to you is non-unique, so it needs a uniquifier, and the clustered index key is used for this purpose.

If the suggested index were unique, column a would be stored only at the leaf level of the index, as part of the row locator, which for a clustered table is the clustered index key.

Column a will not be stored twice if you list it explicitly as an included column of your index, so I advise you to include it. It makes a difference if someone later turns your clustered table into a heap (by dropping the clustered index). In that case, if you did not include column a explicitly in your non-clustered index, it is lost and no longer contained in the index.
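A minimal T-SQL sketch of that advice. The column types and the names PK_demo and IX_demo_b are assumptions for illustration, not taken from the question:

```sql
-- Hypothetical schema: column types and constraint/index names are assumed.
CREATE TABLE dbo.demo
(
    a int NOT NULL,
    b int NOT NULL,
    c int NULL,
    d int NULL,
    CONSTRAINT PK_demo PRIMARY KEY CLUSTERED (a, b)
);

-- Column a is listed explicitly, so the index definition itself guarantees
-- that a is available, independent of the clustered index.
CREATE NONCLUSTERED INDEX IX_demo_b
    ON dbo.demo (b)
    INCLUDE (a);

-- Dropping the primary key constraint drops the clustered index and turns
-- the table into a heap. IX_demo_b still carries column a because it was
-- declared in the INCLUDE list.
ALTER TABLE dbo.demo DROP CONSTRAINT PK_demo;
```

Without the INCLUDE (a) clause, the same index would depend on the clustered key to supply column a, and would no longer cover a query that selects a once the table becomes a heap.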

Radim Bača · Accepted Answer · 2017-09-12 13:26:24Z

Including column a in the non-clustered index is unnecessary, since it is part of the clustered index key and is therefore already part of the data in the leaf pages of the non-clustered index. For a query like this:

SELECT a FROM tab WHERE b = <value>

the value of a can be read directly from the leaf data of the non-clustered index.

Since the NCI on (b) is not unique, a will be a key column in the index, so not only present on the leaf pages. See sqlblog.com/blogs/kalen_delaney/archive/2010/03/07/…

Eli · Accepted Answer · 2017-09-12 13:26:46Z

The PK fields are always part of the key of the index, not part of the included columns.

What I'm thinking here is perhaps it wants to seek by column B; that's something that it can only do if column B is the first key in the index. If you define an index with column B first, followed by column A, perhaps it'll be able to do just that. It seems that it'll be happy as long as both keys are in the index, as you have a compound PK, though they may currently be in the wrong order (first A, then B) thereby preventing a seek.

Reference on PK fields automatically showing up in indexes: https://www.brentozar.com/archive/2013/07/how-to-find-secret-columns-in-nonclustered-indexes/

"The PK fields are always part of the key of the index, not part of the included columns" Not quite, rather "Clustered Index columns are stored as key columns for non-unique NCIs, and included columns for unique NCIs". See sqlblog.com/blogs/kalen_delaney/archive/2010/03/07/…
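The distinction drawn in that comment can be set up side by side. This is only a sketch: the table name, index names, and int column types are assumptions, and the comments restate the comment's claim rather than anything the script itself proves:

```sql
-- Hypothetical table: names and column types are assumptions for illustration.
CREATE TABLE dbo.demo_nci
(
    a int NOT NULL,
    b int NOT NULL,
    c int NOT NULL,
    d int NULL,
    CONSTRAINT PK_demo_nci PRIMARY KEY CLUSTERED (a, b)
);

-- Non-unique NCI on (b): per the comment, the clustered key column a is
-- appended to the index key, so it appears at every level of the index.
CREATE NONCLUSTERED INDEX IX_demo_nci_b
    ON dbo.demo_nci (b);

-- Unique NCI on (c): per the comment, the clustered key (a, b) is carried
-- only at the leaf level, like included columns.
CREATE UNIQUE NONCLUSTERED INDEX UX_demo_nci_c
    ON dbo.demo_nci (c);
```

Neither index definition mentions the clustered key columns it does not index; the difference lies only in where SQL Server stores them.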

etsa · Accepted Answer · 2017-09-12 13:30:53Z

Try this and look at the execution plan. You can see that the database uses only the index. So, as far as I know, you don't need to include column A in your index (because, as you said, the clustered index key is already included).

CREATE TABLE DEMO (COLA VARCHAR(10) NOT NULL, COLB VARCHAR(10) NOT NULL, COLC VARCHAR(10), COLD VARCHAR(10));
ALTER TABLE DEMO ADD CONSTRAINT DEMO_PK PRIMARY KEY (COLA, COLB);
CREATE INDEX DEMO_IX1 ON DEMO (COLB);
INSERT INTO DEMO VALUES ('A','B','C','D');
INSERT INTO DEMO VALUES ('A1','B1','C1','D1');
INSERT INTO DEMO VALUES ('A2','B2','C2','D2');
SELECT COLA, COLB FROM DEMO WHERE COLB = 'B1';
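One way to look at the plan without a graphical tool is SET SHOWPLAN_TEXT. The sketch below assumes the DEMO table and DEMO_IX1 index from the script above already exist, and that it is run from SSMS or sqlcmd, where GO acts as a batch separator:

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch,
-- hence the GO separators (a client-tool command, not T-SQL).
SET SHOWPLAN_TEXT ON;
GO

-- With SHOWPLAN_TEXT ON the query is not executed; the estimated plan
-- is returned as text. Look for which index the plan references.
SELECT COLA, COLB FROM DEMO WHERE COLB = 'B1';
GO

SET SHOWPLAN_TEXT OFF;
GO
```

If the plan references only DEMO_IX1 and shows no lookup against DEMO_PK, the non-clustered index supplied COLA by itself. With only three rows the optimizer's choice may differ, so treat the result as indicative.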

Rigerta · Accepted Answer · 2017-09-12 13:31:08Z

Non-clustered indexes automatically include the clustered index key. The documentation covers this in detail, and this part in particular explains it:

Nonclustered Index Architecture

The leaf layer of a nonclustered index is made up of index pages instead of data pages. The row locators in nonclustered index rows are either a pointer to a row or are a clustered index key for a row.

If your table is a heap, the row locator points directly to the data row that contains the key value. If your table is not a heap (which is your case, because the table already has a clustered index), the row locator is the clustered index key.

Take a look at "Clustered and Nonclustered Indexes Described" as well.

This thread discusses the same: Necessary to include clustered index columns in non-clustered indexes?
