3

I'm trying to copy 1.5M rows from one database to another database. I've done many searches on SO but can't get this to work.

The source table has duplicates (using Col1 + Col2 from Src table) and I need to ensure no duplicates are inserted into the new destination tabe. This is the SQL I have:

INSERT INTO DestDb.dbo.DestTable ([Col1], [Col2]) SELECT [Col1], DATEADD(dd, DATEDIFF(dd, 0, [Col2]), 0) FROM dbo.SrcTable as Table1 WHERE NOT EXISTS ( SELECT 1 FROM DestDb.dbo.DestTable WHERE DestDb.dbo.DestTable.Col1 = Table1.Col1 AND DATEDIFF(DAY, DestDb.dbo.DestTable.Col2, Table1.Col2) = 0 ) 

DestDb.dbo.DestTable has a composite key of Col1 + Col2

DestDb.dbo.DestTable.Col1 is (PK, nvarchar(128), not null)

DestDb.dbo.DestTable.Col2 is (PK, datetimeoffset(7), not null)

dbo.SrcTable.Col1 is (nvarchar(max), null)

dbo.SrcTable.Col2 is (datetime2(7), not null)

I get this error:

Msg 2627, Level 14, State 1, Line 1
Violation of PRIMARY KEY constraint 'PK_DestTable'. Cannot insert duplicate key in object 'dbo.DestTable'. The duplicate key value is (AAAA, 2011-10-13 00:00:00.0000000 +00:00).

I'm using datediff because the destination table only needs to record the date portion from the source column (no time values needed).

I'm really stumped because a direct insert into the destination table using the following will work fine:

INSERT INTO [dbo].[DestTable] ([Col1], [Col2]) VALUES ('AAAA', GETDATE()) GO 

2 Answers 2

2

You need to eliminate duplicates first, such as here with a GROUP BY:

with source as ( SELECT [Col1], DATEADD(dd, DATEDIFF(dd, 0, [Col2]), 0) as Col2 FROM dbo.SrcTable as Table1 ), data as ( select Col1,Col2 from source group by Col1,Col2 ) INSERT INTO DestDb.dbo.DestTable ([Col1], [Col2]) SELECT Col1,Col2 FROM data ; 

You could use SELECT DISTINCT in the second CTE instead of the GROUP BY, but the use of SELECT DISTINCT is considered an anti-pattern by many.

Sign up to request clarification or add additional context in comments.

2 Comments

Thanks, almost there. I have other columns (I should have mentioned in my post) that I need to transfer - they are just data columns (not keys). Such as from SrcTable.Col3 to DestTable.Col3 (they are both just strings). I get 'Column 'source.Col3' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.'
Then apply the max() aggregate to the additional columns. The advantage of max() over min() is that all real strings are strictly greater than the empty string ' '.
0

I hope this should work, as you dont need time part then convert will remove it and group by will pick unique combinations of col1 and col2. Also, you have written that col1 and col2 of DestTable is PK, means there is a composite primary key of (col1, col2) on DestTable:

insert into DestTable select col1, cast(Convert(varchar,Col2,101) as datetimeoffset(7)) Col2 from SrcTable group by col1, Convert(varchar,Col2,101) 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.