Please see the DDL below:
CREATE Table #Grouping1 (GroupID int, SystemID int, RecordID int, Name varchar(100), DateOfBirth datetime) INSERT INTO #Grouping1 values (1,1,1,'Mark Williams', '1980-01-01') INSERT INTO #Grouping1 values (1,2,128,'Mark Welliams', '1980-01-01') INSERT INTO #Grouping1 values (1,3,36,'Marko Williams', '1980-01-01') INSERT INTO #Grouping1 values (2,1,18,'Anne Smith', '1960-01-23') INSERT INTO #Grouping1 values (2,2,64,'Anna Smyth', '1960-01-23') INSERT INTO #Grouping1 values (2,8,23,'Annie Smith', '1960-01-23') The table shows that there are three records in three systems for one person and three records in three systems for another person (the group id says the people are linked).
I am in the process of upgrading this system and the group IDs have changed. For example, please see the DDL below:
CREATE Table #Grouping2 (GroupID int, SystemID int, RecordID int, Name varchar(100), DateOfBirth datetime) INSERT INTO #Grouping2 values (187,1,1,'Mark Williams', '1980-01-01') INSERT INTO #Grouping2 values (187,2,128,'Mark Welliams', '1980-01-01') INSERT INTO #Grouping2 values (208,1,18,'Anne Smith', '1960-01-23') INSERT INTO #Grouping2 values (208,2,64,'Anna Smyth', '1960-01-23') INSERT INTO #Grouping2 values (208,8,23,'Annie Smith', '1960-01-23') I want to check that all the people that were linked together are still linked together. I was thinking about doing something like this.
select * into #OldTable from ( select grouping1a.groupid, grouping1a.systemid as systemid1,grouping1a.recordid as recordid1, grouping1b.systemid as systemid2,grouping1b.recordid as recordid2 from #grouping1 as grouping1a inner join #grouping1 as grouping1b on grouping1a.groupid=grouping1b.groupid where not (grouping1a.SYSTEMID=grouping1B.SYSTEMID AND grouping1A.recordID=grouping1B.recordID) ) as OldTable select * into #NewTable from ( select grouping1a.groupid, grouping1a.systemid as systemid1,grouping1a.recordid as recordid1, grouping1b.systemid as systemid2,grouping1b.recordid as recordid2 from #grouping2 as grouping1a inner join #grouping2 as grouping1b on grouping1a.groupid=grouping1b.groupid where not (grouping1a.SYSTEMID=grouping1B.SYSTEMID AND grouping1A.recordID=grouping1B.recordID) ) As NewTable select distinct #OldTable.groupid from #oldtable full outer join #newtable on #oldtable.systemid1=#newtable.systemid1 and #oldtable.recordid1=#newtable.recordid1 and #oldtable.systemid2=#newtable.systemid2 and #oldtable.recordid2=#newtable.recordid2 where #oldtable.systemid1 is null or #newtable.systemid1 is null This would show me all the groups that have a problem. I cannot do this because there are over 100,000,000 rows in #Grouping1 and #Grouping2. Is there another way to approach this e.g. using a Checksum or HashByte?