4

SQL Server 2008.

I have few pieces of code working together and the effect is rather strange so I decided to share.

I have a table with emails - for simplicity let as assume there are 2 columns: ID identity(1,1) int primary key email varchar(900)

Also there is a unique key on email column with IGNORE_DUP_KEY = ON.

On the other side of a problem is a merge statement:

merge into dbo.email using ( select distinct email t from #t ) p ON t = email when not matched by target then insert (email) values (t); 

It is worth to note that #t is basically (for purpose of this query) a one column table with email varchar(500).

What is surprising is that the query failed:

"Violation of UNIQUE KEY constraint 'uq_email'. Cannot insert duplicate key in object dbo.email"

However this works flawlessly:

insert into dbo.email (email) select email from #t 

While I can obviously work around the problem (the insert statement is faster anyway so it is a way to go) I have no idea why the MERGE statement failed. Any ideas, anyone?

EDIT: Full use case: Step 1:

create table #temp ( col1 varchar(500), col2 varchar(500), col3 varchar(500), col4 varchar(500), col5 varchar(500), email varchar(500), id int) 

Step 2:

#temp is populated from CSV file for many purposes

Step 3:

merge into dbo.email using ( select distinct email t from #temp ) p ON t = email when not matched by target then insert (email) values (t);

Step 0 - CREATE script for dbo.email:

CREATE TABLE dbo.email ( id int identity(1,1) not null, email varchar(900) null, loaddate date default null, constraint [PK__email__1111] PRIMARY KEY CLUSTERED ( id asc ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY, CONSTRAINT [uq_email] UNIQUE NONCLUSTERED ( EMAIL asc ) ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = ON, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY ) ON PRIMARY 

I can't make much more without giving you hunderds of gigabytes of data which is out of a question obviously.

6
  • Based on the information you have given the statement should work correctly and not attempt to insert where matching rows already exist. Is there a trigger on the target email table? Commented Feb 10, 2012 at 10:37
  • No. There is really nothing else there. I'm also surprised that it failed as I see no reason (obviously). Maybe it is a bug in SQL Server. After all - as my prof used to say - it is just a software :-. Commented Feb 10, 2012 at 10:44
  • Can you post a complete example which demonstrates the problem? I've created a test based on the information you've provided and it works as expected. Commented Feb 10, 2012 at 11:00
  • A side point - if you need to ignore duplicates, what's the value in having the unique constraint at all? Commented Feb 10, 2012 at 11:17
  • The table needs to have unique emails however there are many data sources inserting data there and they may have duplicates - in themselves and between each other. Commented Feb 10, 2012 at 11:28

1 Answer 1

9

Your email column allows for null values. If you have a null value in target and a null value in source they will not be matched in the on expression. You will have a unique constraint exception when you are inserting the second null value.

Try this instead:

merge into dbo.email using ( select distinct email t from #t where email is not null ) p ON t = email when not matched by target then insert (email) values (t); 

Update:
Regarding ignore_dup_key you should read the remarks section in documentation for the merge statement:

If IGNORE_DUP_KEY is set to ON for any unique indexes on the target table, MERGE ignores this setting.

To have a unique constraint in SQL Server that allows for multiple null values you should add a unique filtered index instead.

create unique index UX_xx on TableName(ColName) where ColName is not null 
Sign up to request clarification or add additional context in comments.

1 Comment

Brilliant catch. Also I will probably never get used to the fact that in SQL Server - unlike i.e. ORacle - unique column allows only one null value. -> However - while it does explain why the merge didn't catch null = null situation, it does not explain why ignore_dup_key won't work in this case. It works for insert statement, after all.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.