1

Suppose, I have a composite primary key that consists of 3 columns: [ShardKey], [SiteId] and [ServiceId]. Column [ServiceId] is an identity column and should start from 1 for a new combination of [ShardKey] and [SiteId], but SQL server fails - the column [ServiceId] never starts from 1, it just increments its value no matter what the primary key is. I did add constraint for primary key that holds all three columns. What should I do to make it start from 1 when a new combination of [ShardKey] and [SiteId] comes in?

[ShardKey] [SiteId] [ServiceId] [Name] 1009 1 1 Coffee 1009 1 2 Tea 1009 1 3 Cocaine 1009 2 1 Coffee 1009 2 2 Tea 1009 2 3 Cocaine 1010 1 1 Coffee 1010 1 2 Tea 1010 1 3 Cocaine 

That's what I want SQL server to do for me. Values for [ShardKey] and [SiteId] are known to me, so I can always insert them into SQL script, but I need the column [ServiceId] to start from 1 for a new combination of [ShardKey] and [SiteId]

9
  • 2
    That isn't what an IDENTITY does. an IDENTITY increments every time an INSERT is attempted. It doesn't care about the values of the other columns. Commented Jan 21, 2020 at 11:07
  • 1
    Sounds like what you want is multiple SEQUENCE objects (one for each combination of ShardKey and SiteId?) and then use the appropriate SEQUENCE. Of course, if you have a lot of different values for ShardKey and SiteId, that isn't a maintainable solution. Commented Jan 21, 2020 at 11:08
  • When a new combination of [ShardKey] and [SiteId] is encountered, it should increment [ServiceId] column. Columns [ShardKey] and [SiteId] are not identity columns, they have a known value, but the [ServiceId]. I need it for upsert operations and it should automatically increment in context of combination: [ShardKey] and [SiteId] Commented Jan 21, 2020 at 11:14
  • 1
    OK, I will add example in the major post Commented Jan 21, 2020 at 11:17
  • as a side note: you may also want to think about physical page shuffling; a composite primary key like this might be a good candidate for leaving the primary key nonclustered (so inserts don't cause shuffle), with it either remaining a heap (no clustered index), or using some other clustered index where that makes sense (for example: CreationDate might be a good candidate for a clustered index here); conversely; it might be that you're happy to pay the insert/shuffle cost of clustering on the PK, in order to get the benefits of reduced page fetches when fetching all for shard/shard+site Commented Jan 21, 2020 at 11:23

2 Answers 2

3

The short answer is "no, IDENTITY can't help you here" - IDENTITY is per-table, not per-unique-other-primary-key-components. So: you'd have to handle this yourself (presumably checking the maximum ServiceId for that permutation), but then you get into complex race condition scenarios. It also makes insert very complex and less efficient, as now you're doing a select immediately before the insert - a prime candidate for deadlocking in some isolation levels (so: that initial select would need to be UPDLOCK). The fact that the primary key will be unique will help avoid some of the more obvious problems, but you could still get into problems if you make assumptions about the value you're about to insert before you've fully inserted it, or if you don't explicitly expect the insert to fail.

Sign up to request clarification or add additional context in comments.

Comments

1

IDENTITY columns are not depending on other columns. Also, You are trying to derive ServiceID column from the incoming values of SharedKey, SiteId and not from the existing values of the table.

You have few options:

  • you need to define an INSTEAD OF INSERT trigger to handle this scenario. You need to see whether incoming values of SharedKey, SiteId are already present. If so, use the existing ServiceId. Otherwise, generate new one.

  • You need to have this logic as part of insert stored procedure. You need to see the whether incoming parameters: Sharedkey, SiteId are already present. If so use the existing ServiceId. otherwise, generate new one and insert into the table.

CAVEAT:

  • Going for triggers will make the transaction longer and it will affect concurrency and similarly, checking for existing values in the table and coming up with new ServiceId will make the transaction longer.
  • Going for stored procedures also, you need to take care of concurrency aspects(ISOLATION LEVELS) so that your generated serviceID is not creating Primary Key conflict. this kind of ISOLATION LEVEL setting will also affect concurrency.

IMHO, suggesting you below design changes:

  • Not to go for this kind of deriving of columns. The composite key constituents have to be independent of each other.
  • You can consider creating surrogate key, and have these columns as foreign keys from respective tables. Narrow clustered keys will give good performance, over composite keys.
  • Revisit normalization and create separate independent tables and create bridge table to handle this kind of many-many relationship

1 Comment

even with a stored procedure, you'd need to be aware of the concurrency aspects - i.e. you'd need to anticipate the insert to fail; personally, I'd usually go with surrogate keys here

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.