19

Can anyone explain the below situation, where two seemingly equal values are not reduced by DISTINCT?

screenshot showing two values for 'SBS_UCS_DISPATCH'

The query above is SELECT DISTINCT name FROM master.sys.dm_os_spinlock_stats where name = 'SBS_UCS_DISPATCH';

The equivalent SELECT name FROM master.sys.dm_os_spinlock_stats where name = 'SBS_UCS_DISPATCH' GROUP BY name; also does the same, and adding HAVING COUNT(1) > 1 does not yield the rows.

@@VERSION is Microsoft SQL Server 2019 (RTM-CU13) (KB5005679) - 15.0.4178.1 (X64) Sep 23 2021 16:47:49 Copyright (C) 2019 Microsoft Corporation Enterprise Edition: Core-based Licensing (64-bit) on Windows Server 2016 Standard 10.0 (Build 14393: )

2
  • 2
    I suspect @MartinSmith is right about bad internal meta-data so this looks to be a bug. FWIW, the same occurs with SQL 2022 and Azure SQL Database. The underlying internal view query is SELECT * FROM OpenRowSet(TABLE DM_OS_SPINLOCKSTATS). Commented Aug 3, 2023 at 15:41
  • Replicated on Microsoft SQL Server 2019 (RTM-CU21) (KB5025808) - 15.0.4316.3 (X64) Jun 1 2023 16:32:31 Copyright (C) 2019 Microsoft Corporation Developer Edition (64-bit) on Windows Server 2019 Datacenter 10.0 <X64> (Build 17763: ) (Hypervisor) . Commented Aug 4, 2023 at 11:21

1 Answer 1

16

It looks like the DISTINCT gets optimised out during simplification.

select distinct name FROM master.sys.dm_os_spinlock_stats OPTION (QUERYTRACEON 8606,QUERYTRACEON 3604) 

Gives

*** Input Tree: *** LogOp_GbAgg OUT(QCOL: DM_OS_SPINLOCKSTATS.name,) BY(QCOL: DM_OS_SPINLOCKSTATS.name,) LogOp_Project LogOp_Project QCOL: DM_OS_SPINLOCKSTATS.name LogOp_Project LogOp_StreamingTabUdfRESULT(QCOL: DM_OS_SPINLOCKSTATS.name QCOL: DM_OS_SPINLOCKSTATS.collisions QCOL: DM_OS_SPINLOCKSTATS.spins QCOL: DM_OS_SPINLOCKSTATS.spins_per_collision QCOL: DM_OS_SPINLOCKSTATS.sleep_time QCOL: DM_OS_SPINLOCKSTATS.backoffs) AncOp_PrjList AncOp_PrjList AncOp_PrjList AncOp_PrjList ******************* 

Then the GbAgg disappears in the Simplified Tree

*** Simplified Tree: *** LogOp_StreamingTabUdfRESULT(QCOL: DM_OS_SPINLOCKSTATS.name QCOL: DM_OS_SPINLOCKSTATS.collisions QCOL: DM_OS_SPINLOCKSTATS.spins QCOL: DM_OS_SPINLOCKSTATS.spins_per_collision QCOL: DM_OS_SPINLOCKSTATS.sleep_time QCOL: DM_OS_SPINLOCKSTATS.backoffs) 

enter image description here

I assume that there is some metadata indicating name is unique despite the query results to the contrary but I can't see how to prove that.

One way of working around this is the below.

SELECT DISTINCT name COLLATE Latin1_General_100_CI_AS FROM master.sys.dm_os_spinlock_stats WHERE name COLLATE Latin1_General_100_CI_AS = 'SBS_UCS_DISPATCH' 

In general even if name was unique under one collation then it could still have duplicates with different collate semantics. The original column collation for me is Latin1_General_CI_AS so this doesn't change the case insensitive accent sensitive nature but it is enough to preserve the GbAgg in the plan.

enter image description here

SELECT ca.name, COUNT(*) AS Count, SERVERPROPERTY('ProductVersion') AS [version] FROM master.sys.dm_os_spinlock_stats CROSS APPLY (VALUES(name COLLATE Latin1_General_100_BIN)) ca(name) GROUP BY ca.name HAVING COUNT(*) > 1 

Returns the following on my dev instance

name Count version
LOGPOOL_FREEBUFMGR 2 16.0.1050.5
SBS_UCS_DISPATCH 2 16.0.1050.5

So SBS_UCS_DISPATCH is not alone in being duplicated.

For me the values in the other columns are all 0 for these. So I'm unclear if the entire rows will be duplicated or not in the event that these spin locks are encountered - or whether they would need to be aggregated.

The affected types aren't ones that are documented other than "for internal use" anyway though.

3
  • So I'm unclear if the entire rows will be duplicated or not in the event that these spin locks are encountered -- I encountered it originally when I was trying to INSERT/SELECT into a history table where I had made name part of the PK. I figured I would aggregate with SUM/GROUP BY name and it still happened, so I reduced it to the most straightforward example (i.e. SELECT DISTINCT) Commented Aug 4, 2023 at 18:20
  • The affected types aren't ones that are documented other than "for internal use" anyway though. -- INSERT/SELECT cares not for your "internal use" :-) Commented Aug 4, 2023 at 18:22
  • In any case, binary-collating the column is more than sufficient for this case -- I was more surprised that there was a property of equality that was being applied in WHERE that was not being applied in DISTINCT, GROUP BY, etc. Commented Aug 4, 2023 at 18:30

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.