
I have two tables:

Account:
    AccountUID   bigint (Primary Key)
    AccountID    bigint
    Version      smallint
    CustomerName varchar(50)

Note: AccountID and Version are also part of a unique key on the Account table.

AccountStatus:
    AccountID      bigint (Primary Key)
    CurrentVersion smallint
    Filter1        varchar(10)
    Filter2        varchar(10)

To keep things simple, I currently have 100 rows in each table, since each AccountID has only one version. I have values in the Filter1 and Filter2 columns such that the following query returns 10 records:

    SELECT *
    FROM AccountStatus acs
    WHERE acs.Filter1 = SomeValue1
      AND acs.Filter2 = SomeValue2

Because I have an index covering Filter1 and Filter2, the actual execution plan shows an Index Seek with only the 10 selected rows in the Actual Rows value.

When I join in the Account table as follows, I get the same 10 records:

    SELECT acs.*
    FROM AccountStatus acs
    INNER JOIN Account a
        ON acs.AccountID = a.AccountID
       AND acs.CurrentVersion = a.Version
    WHERE acs.Filter1 = SomeValue1
      AND acs.Filter2 = SomeValue2

However, in the Actual Execution Plan I still see the Index Seek on the AccountStatus table with 10 Actual Rows, as before. Above it, though, there is an Index Scan on the Account index covering AccountID and Version, and that operator shows 100 in Actual Rows.

Here is the detail of the indexes involved:

    CREATE NONCLUSTERED INDEX [IX_Find] ON [dbo].[AccountStatus]
    (
        [Filter1] ASC,
        [Filter2] ASC
    )
    INCLUDE ([AccountID], [CurrentVersion])
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
          IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    CREATE UNIQUE NONCLUSTERED INDEX [IX_Account_Status] ON [dbo].[Account]
    (
        [AccountID] ASC,
        [Version] ASC
    )
    INCLUDE ([AccountUID])
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
          IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

While this is not much of a performance hit with 100 rows, I am concerned about what happens when I reach a million or several million rows; scanning the whole index will become very inefficient.

Is there any way to make this type of query avoid scanning all of the rows in the index on the Account table?

Any help will be most appreciated.

  • acs.CurrentVersion=a.Version are these two fields indexed? Commented Aug 10, 2013 at 5:23
  • 5
    ...and What RDBMS are you using? Commented Aug 10, 2013 at 6:06
  • 1
    Show the full definition of the index where AccountID and CurrentVersion are also part of a key that is unique. What part they are of that index matters quite a lot. Commented Aug 10, 2013 at 15:27
  • I edited the question to include the detail about the indexes. Also, I have noticed that if I limit my row count on the query with the inner join to say 50 it works as expected. Anything over that produces the results described in the question. Commented Aug 23, 2013 at 5:56
  • I am using Microsoft SQL Server 2008 R2 as the RDBMS. Commented Aug 23, 2013 at 5:57

2 Answers


Execution plans are based on the statistics of the tables and their indexed columns. With only 100 rows, the optimizer will choose a different plan than it would with 10 million rows. If you are worried about it not using the index, increase the amount of data and update your statistics before looking at the execution plan again. The engine is fairly smart and will usually choose the fastest plan.

I assume Filter1 matches a fair number of rows; regardless, with this few rows a scan is no slower than a seek, so the engine simply uses the scan.
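If you want to confirm the scan is just a small-table artifact rather than a plan problem, one option is to force a seek on the Account index and compare the two plans. A sketch, reusing the table and column names from the question (the FORCESEEK hint is available from SQL Server 2008 onward; the WITH FULLSCAN option is optional):

```sql
-- Force a seek on Account's index to compare against the scan plan.
SELECT acs.*
FROM AccountStatus acs
INNER JOIN Account a WITH (FORCESEEK)
    ON acs.AccountID = a.AccountID
   AND acs.CurrentVersion = a.Version
WHERE acs.Filter1 = SomeValue1
  AND acs.Filter2 = SomeValue2;

-- After loading a realistic data volume, refresh statistics
-- before re-checking the actual execution plan:
UPDATE STATISTICS dbo.Account WITH FULLSCAN;
UPDATE STATISTICS dbo.AccountStatus WITH FULLSCAN;
```

If the forced-seek plan is slower than the scan at 100 rows but faster at millions, that supports the point above: the optimizer was choosing the cheaper plan for the data it had.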




I determine whether an index is needed in one of two ways. The first is to execute the query using Display Estimated Execution Plan. If the plan reports a missing index (shown in green), right-click the missing-index message and select Missing Index Details... This opens a new window with a script to create the missing index that could help this query. Simply uncomment the lower section of the script, replace the placeholder with a name for the new index, and execute it.

You may need to repeat this process a few times for any other missing indexes that could help this query.

The second method is more of a long term approach to database performance.

Pinal Dave has published a series of scripts on his blog at http://blog.sqlauthority.com/ that help identify missing indexes as well as unused and duplicate ones. I run these regularly to determine which indexes will help and which are actually hurting performance.

Add the missing indexes and remove any unused or duplicate indexes.
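Scripts of this kind are typically built on SQL Server's missing-index DMVs, which you can also query directly. A minimal sketch (the ordering expression is an illustrative heuristic, not an official metric):

```sql
-- List missing-index suggestions recorded by the optimizer,
-- roughly ordered by estimated benefit.
SELECT  d.statement        AS table_name,
        d.equality_columns,
        d.inequality_columns,
        d.included_columns,
        s.user_seeks,
        s.avg_user_impact
FROM sys.dm_db_missing_index_details d
JOIN sys.dm_db_missing_index_groups g
    ON d.index_handle = g.index_handle
JOIN sys.dm_db_missing_index_group_stats s
    ON g.index_group_handle = s.group_handle
ORDER BY s.user_seeks * s.avg_user_impact DESC;
```

Note that these DMVs are reset on instance restart, so they only reflect the workload since the server last came up.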

1 Comment

Regarding the first part of your answer, I'd be cautious about adding more indexes to a table just because the query optimiser suggests it for the query you are executing now, with the data you have now. Adding indexes should take into account the type of table (e.g. more reads than writes, or vice versa) and needs to be monitored as your database grows. If it were that simple, you might as well index every field in every table.
