I have a query which involves a full-text search like this:
SELECT TOP 30 PersonId, PersonParentId, PersonName, PersonPostCode FROM dbo.People WHERE PersonDeletionDate IS NULL AND PersonCustomerId = 24 AND CONTAINS(ContactFullText, '"mr" AND "ch*"') AND PersonGroupId IN(197, 206, 186, 198) ORDER BY PersonParentId, PersonName;
This generates two main plans, one is very fast in all cases, the other is very slow in most cases.
I have experimented with this query such that the FT search is not included and what I found is that the row estimates are always way lower than they should be.
If I run update statistics...with fullscan I still see extremely inaccurate row estimates from NC index seek operations in the execution plan.
When the row estimates are low enough, a loop join is selected, which is normally very slow (30+ seconds). Higher estimates seem to produce a good plan involving a merge join instead of a loop join.
Why is SQL Server still not estimating the rowcounts despite still having up to date statistics?
The plan: https://www.brentozar.com/pastetheplan/?id=rkXtE0jzX
When I remove the CONTAINS part, thereby omitting the FullText search, the query is fast, but the row estimate for the index seek is still 1 estimated, 2195 actual.
On @Kin's advice, I used CONTAINSTABLE, which ran instantly and produced the following plan: https://www.brentozar.com/pastetheplan/?id=S1hKainzQ Interesting that there is no Full Text search operator.
Containstable requires RANK to produce the same result set in this case I've used AND RANK > 0 in the WHERE to produce the results I want, which produces this plan: https://www.brentozar.com/pastetheplan/?id=B1U7AA2zm
My only question now is about why row estimates are still inaccurate but I care less now that my FT queries seem significantly faster and more reliable. Very pleased! https://www.brentozar.com/pastetheplan/?id=B1U7AA2zm
@EvanCarroll stats histogram here: https://pastebin.com/p7s0NvX5