This might be too subjective, but it's been puzzling me for some time.
If you have a fact table that allows duplicates, joined to 10 dimension tables that do not, does the fact table really need a primary key?
Why Are There Duplicates?
It's a bit tricky, but in principle each duplicate is a valid record. The source system that records them just doesn't provide a unique identifier to tell them apart. We don't own that system, so we have no way to change it.
Data
The data arrives in batches, and each batch includes only the previous day's worth of records. Therefore, in the event of a republish, we just drop the entire day's worth of records and load the new set of records for that day, without using a primary key.
This is also how I would fix bad data.
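For concreteness, here is a minimal sketch of that drop-and-republish step. It uses Python with an in-memory SQLite database as a stand-in for the real warehouse. The table `fact_sales`, its columns, and both function names are hypothetical, made up purely for illustration.

```python
import sqlite3


def create_fact_table(conn):
    """Create a hypothetical fact table with no declared primary key."""
    # Assumption: load_date identifies the daily batch a row belongs to.
    # SQLite keeps a hidden rowid internally, but nothing here relies on it.
    conn.execute(
        """
        CREATE TABLE fact_sales (
            load_date    TEXT    NOT NULL,
            customer_key INTEGER NOT NULL,
            product_key  INTEGER NOT NULL,
            amount       REAL    NOT NULL
        )
        """
    )


def republish_day(conn, load_date, rows):
    """Replace every fact row for load_date with the given rows.

    rows is an iterable of (customer_key, product_key, amount) tuples and
    may contain exact duplicates, which are all kept.
    """
    with conn:  # one transaction: commit on success, roll back on error
        # Drop the whole day; no per-row identifier is needed for this.
        conn.execute("DELETE FROM fact_sales WHERE load_date = ?", (load_date,))
        # Reload the day from the republished batch.
        conn.executemany(
            "INSERT INTO fact_sales (load_date, customer_key, product_key, amount) "
            "VALUES (?, ?, ?, ?)",
            [(load_date, c, p, a) for (c, p, a) in rows],
        )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_fact_table(conn)
    # Initial load, including two identical (but valid) records.
    republish_day(conn, "2024-01-01", [(1, 10, 5.0), (1, 10, 5.0), (2, 11, 7.5)])
    # Republish of the same day replaces all of its rows.
    republish_day(conn, "2024-01-01", [(1, 10, 5.0), (1, 10, 5.0)])
    print(conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0])
```

Because the delete is scoped by the load date alone, identical rows within a day never have to be told apart, which is why the key goes unused in this workflow.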
Generate A Primary Key Already
I can, but if it's never used, and it gives me no way to validate whether a duplicate is legitimate, why do it?
DELETE TOP(3)?