Do non-clustered (non-PK) indexes need to include clustered (PK) columns?

Question

For example, here are 2 indexes on FOO table:

ALTER TABLE [dbo].[FOO] ADD CONSTRAINT [PK_FOO] PRIMARY KEY CLUSTERED ([id] ASC) CREATE NONCLUSTERED INDEX [IX_FOO] ON [dbo].[FOO] ([id] ASC, [a] ASC, [b] ASC)

Many queries are using IX_FOO when filtering with a column. And it seems id column in IX_FOO is redundant since PK_FOO is indexing it. So I'm thinking to remove id column from IX_FOO like this:

CREATE NONCLUSTERED INDEX [IX_FOO2] ON [dbo].[FOO] ([a] ASC, [b] ASC) INCLUDE ([id])

But I'm not sure myself. Do indexes need to include PK columns?

@ryanwebjackson I'm thinking to remove id column from IX_FOO since PK_FOO already indexing it. But I'm not sure if this is a rational idea. — user2652379
– user2652379, Commented Jun 15, 2018 at 2:59

Vladimir Baranov · Accepted Answer · 2018-06-15 04:28:01Z

The order of columns in index is important.

Index on (ID, a, b) is very different to index on (a, b) include (id).

The first index on (ID, a, b) can be used when searching by:

ID, a, b: WHERE ID = 1 AND a = 2 AND b = 3
ID, a, range of b: WHERE ID = 1 AND a = 2 AND b > 3 AND b < 5
ID, a: WHERE ID = 1 AND a = 2
ID, range of a: WHERE ID = 1 AND a > 2 AND a < 5
ID: WHERE ID = 1
range of ID: WHERE ID > 1 AND ID < 5

This index will not be used when searching just for a or just for b: WHERE a = 2; WHERE b = 3.

Selected columns can be any combination of ID, a, b - if there are more columns, index would be not enough and engine would have to read them from the table.

The second index on (a, b) include (id) can be used when searching by:

a, b: WHERE a = 2 AND b = 3
a, range of b: WHERE a = 2 AND b > 3 AND b < 5
a: WHERE a = 2
range of a: WHERE a > 2 AND a < 5

This index will not be used when searching just for ID or just for b: WHERE ID = 1; WHERE b = 3.

Selected columns also can be any combination of ID, a, b - if there are more columns, index would be not enough and engine would have to read them from the table.

If there is a clustered index on ID, then there is no point adding ID as INCLUDE to non-clustered index, because the clustered column is included in each non-clustered index implicitly. This is why it is generally recommended to have narrow clustered index, usually 4-byte int. The wider the clustered index, the more space is needed for each non-clustered index.

So, index on (a, b) include (id) is the same as (a, b) in your case.

I'm not sure if engine is smart enough to not waste disk space if you INCLUDE ID explicitly. It is easy to check.

Question about WHERE ID = 1 AND a > 2 AND a < 5: Is index like IX_FOO necessary for this search? I thought PK_FOO would be sufficient since id is unique and checking a range of a is not expensive. Maybe there is a caveat?
@user2652379, You are right, I wasn't thinking about PK_FOO, I was focusing on the difference between two variants on the non-clustered index. With ID being unique it doesn't matter which index to use PK_FOO or IX_FOO. Taking into account uniqueness of ID it doesn't make sense to have IX_FOO defined as it is. It is pretty much useless. All searches listed above that can be done efficiently with IX_FOO can be done with the same efficiency using PK_FOO when ID is unique.

Chamika Goonetilaka · Accepted Answer · 2018-06-15 03:05:26Z

No you don't. Non Clustered indexes will be pointing to the primary key of the main table hence there is no need to add it (Primary Key) to any non clustered index.

In other words all non clustered indexes implicitly includes the clustered index of the table.

Collectives™ on Stack Overflow

Do non-clustered (non-PK) indexes need to include clustered (PK) columns?

2 Answers 2

2 Comments

Comments

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

2 Comments

Comments

Related