Is this expected / documented behaviour?
It's only expected if you're familiar with the rules, but they've never been fully documented.
This is particularly the case for the legacy datetime and smalldatetime types, which have a number of quirks maintained for backward compatibility. The more modern types like datetime2 do not allow weirdness like adding numbers to dates or implicitly converting integers. Many of the quirks are the result of past questionable decisions.
If you want to avoid surprises, always be explicit about data types.
Type sloppiness is rife in the original code. Strings are not date/time types, they're strings. CAST cannot accept a style parameter (unlike CONVERT), which can result in ambiguity, non-deterministic results, or even runtime errors. Using DATEDIFF with a mixture of data types is just asking for trouble.
Example
Let's take Jonathan Fite's example because it is simpler than the original:
SELECT v.dt, UsingZero = DATEDIFF(DAY, 0, v.dt), ConvertToDateTime = DATEDIFF(DAY, CONVERT(datetime, 0), v.dt), MishMash = DATEDIFF(DAY, CONVERT(datetime, 0), CONVERT(datetime2(3), v.dt)) FROM ( VALUES ('2024-08-17 23:59:59.997'), ('2024-08-17 23:59:59.998'), ('2024-08-17 23:59:59.999'), ('2024-08-18 00:00:00.000') ) AS v (dt);
The items in the VALUES clause are strings. They might look like dates and times to a human, but they're not.
The expressions in the SELECT clause are at the mercy of the arcane rules for data type conversion performed by the DATEDIFF implementations (yes, there are several).
1. UsingZero = DATEDIFF(DAY, 0, v.dt)
Here we have DATEDIFF called with an integer and a varchar string as parameters.
SQL Server converts both the zero and the dt string to datetime, as you can see in the execution plan:
Expr1005 = datediff(day,'1900-01-01 00:00:00.000', CONVERT_IMPLICIT(datetime,[Union1004],0))
Union1004 is the label given to the varchar strings from the VALUES clause. The literal '1900-01-01 00:00:00.000' is the constant-folded result of converting zero to datetime. Notice the format of the displayed string is datetime-specific (it has three fractional decimals).
The implicit conversion of your strings to datetime is the important part to explain your rounding concern:
SELECT v.dt, AsDateTime = CONVERT(datetime, v.dt, 121) FROM ( VALUES -- These are strings ('2024-08-17 23:59:59.997'), ('2024-08-17 23:59:59.998'), ('2024-08-17 23:59:59.999'), ('2024-08-18 00:00:00.000') ) AS v (dt);
| dt | AsDateTime |
| 2024-08-17 23:59:59.997 | 2024-08-17 23:59:59.997 |
| 2024-08-17 23:59:59.998 | 2024-08-17 23:59:59.997 |
| 2024-08-17 23:59:59.999 | 2024-08-18 00:00:00.000 |
| 2024-08-18 00:00:00.000 | 2024-08-18 00:00:00.000 |
Notice the rounding due to the limited accuracy of datetime (1/300s).
It's not the zero that causes the rounding, it's the implicit conversion of your strings to datetime.
2. ConvertToDateTime = DATEDIFF(DAY, CONVERT(datetime, 0), v.dt)
Now we have DATEDIFF supplied with a datetime value and a varchar string.
SQL Server converts both to datetimeoffset, as you can see in the execution plan:
Expr1006 = datediff(day,'1900-01-01 00:00:00.000 +00:00', CONVERT_IMPLICIT(datetimeoffset(7),[Union1004],0))
The literal '1900-01-01 00:00:00.000 +00:00' is the constant-folded result of converting zero to datetimeoffset(3). The format of the displayed string has a timezone offset and three decimal places. Your string is implicitly converted to datetimeoffset(7).
No rounding occurs in this case because the zero maps to an exact date value and the strings are all exactly representable.
3. MishMash = DATEDIFF(DAY, CONVERT(datetime, 0), CONVERT(datetime2(3), v.dt))
Finally, we have DATEDIFF supplied with a datetime and the result of converting the varchar strings to datetime2(3) (with a default style).
The execution plan shows the explicit cast to datetime constant-folded to datetimeoffset(3) and displayed with the expected three decimals and a timezone offset. Your varchar strings are first explicitly cast to datetime2(3) as requested by the code, then implicitly converted to datettimeoffset(3):
Expr1007 = datediff(day,'1900-01-01 00:00:00.000 +00:00', CONVERT_IMPLICIT(datetimeoffset(3),CONVERT(datetime2(3),[Union1004],0),0))
That's probably not what was intended.
Being explicit
You didn't say what the data type of the source values was, but bearing in mind DATEDIFF uses datetimeoffset internally for most calculations we might as well choose that:
SELECT V.dt, Typed = DATEDIFF ( DAY, -- Explicit type CONVERT(datetimeoffset(3), '1900-01-01T00:00:00.000Z', 127), V.dt ) FROM ( VALUES -- Explicit types (CONVERT(datetimeoffset(3), '2024-08-17T23:59:59.997Z', 127)), (CONVERT(datetimeoffset(3), '2024-08-17T23:59:59.998Z', 127)), (CONVERT(datetimeoffset(3), '2024-08-17T23:59:59.999Z', 127)), (CONVERT(datetimeoffset(3), '2024-08-18T00:00:00.000Z', 127)) ) AS V (dt);
Results:
| dt | Typed |
| 2024-08-17 23:59:59.997 +00:00 | 45519 |
| 2024-08-17 23:59:59.998 +00:00 | 45519 |
| 2024-08-17 23:59:59.999 +00:00 | 45519 |
| 2024-08-18 00:00:00.000 +00:00 | 45520 |
The execution plan shows:
[Expr1005] = datediff(day,'1900-01-01 00:00:00.000 +00:00',[Union1004])
It's not as easy to see, but the [Union1004] values also have the correct type:
('2024-08-17 23:59:59.997 +00:00'), ('2024-08-17 23:59:59.998 +00:00'), ('2024-08-17 23:59:59.999 +00:00'), ('2024-08-18 00:00:00.000 +00:00')
You could choose datetime2(3) or whatever to match the source data type exactly, but there would then be an implicit conversion to datetimeoffset in the plan because that's what DATEDIFF uses internally in that case.
Advice
Choose an appropriate deterministic style for string date/time conversions, prefer CONVERT over CAST (which doesn't accept a style), and be explicit about your data types if you don't like confusing results and hard-to-debug situations in general.