Skip to main content
Tweeted twitter.com/StackDBAs/status/1034365055352102912
edited tags
Link
Paul White
  • 95.9k
  • 30
  • 442
  • 691
Source Link

Should I used varchar(max) or varchar(4000) SPARSE?

I have a "comment"-type column that is rarely used -- around 6% non-null in a population of 3 million records. The average length (when used) is 6 characters, and the max so far is around 3KB. A maximum of 4000 characters is reasonable for this field. I have two options:

comments varchar(max) NULL -- this is the current column definition comments varchar(4000) SPARSE NULL 

My current understanding is that in both cases, a NULL value would require no storage -- just the column's NULL bit set and a length of 0 in the row metadata.

But for the non-null cases, does one have a clear advantage over the other?

The extra 4-byte pointer for sparse columns with values suggests they are always stored off-row like text or very large varchar(max) fields. Is that the case?

If so, I'd lean toward using varchar(max), since it only stores values off-row if the total row length exceeds 8KB, and the majority of my values are short and unlikely to put a row over the limit.

I haven't seen this particular situation addressed in the BOL, so I'm hoping someone here knows enough about the innards of MSSQL to give some insight.

(If it matters, I'm currently using 2008R2, but hoping to upgrade soon to 2014.)