
Below is a sample extract of my data. I want to drop the duplicate row (the last one in this example), as marked below. How can I easily fetch the data without that extra record in a SELECT query?

ID   YEAR  CNT   VOLUME  INT_VOLUME  RATE    INT_RATE  GM    GM_RCNT
545  2016  12    5508    5508        1604    1604      0.71  NULL
545  2017  5     1138    2731        824     1977      0.28  -50.42
545  2018  NULL  NULL    -45         2351    NULL      NULL  NULL
626  2016  12    679862  679862      252693  252693    0.63  NULL
626  2017  12    705365  705365      282498  282498    0.6   3.75
626  2018  12    707472  707472      291762  291762    0.59  0.3
626  2018  NULL  NULL    711372      NULL    295186    NULL  NULL     --Filter such rows in select
  • How do you define a duplicate? It is not obvious. Commented May 22, 2019 at 1:55
  • In the case above, year 2018 appears twice for ID 626, and I want to retain the earlier record. Perhaps min(rowid)? Commented May 22, 2019 at 1:58

1 Answer


You can choose one year for each id using row_number():

    select t.*
    from (select t.*,
                 row_number() over (partition by id, year order by id) as seqnum
          from t
         ) t
    where seqnum = 1;

This chooses an arbitrary row to keep. You can adjust the order by to refine which row you want to keep. You can order by rowid, but there is no guarantee that it is the "earliest" row. You need a date or sequence column for that purpose.
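To make the effect of the order by concrete, here is a minimal sketch that runs the same pattern against an in-memory SQLite database from Python. It assumes a hypothetical load_seq column that records insertion order (the question's table has no such column), uses only a few of the original columns, and relies on an SQLite build with window-function support (3.25 or later). Treat it as an illustration of the idea rather than the exact query for the real table.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption: not the asker's engine).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (id integer, year integer, cnt integer, "
    "int_volume integer, load_seq integer)"
)

# Rows for ID 626 from the question, reduced to a few columns.
# load_seq is a hypothetical column recording insertion order.
rows = [
    (626, 2016, 12, 679862, 1),
    (626, 2017, 12, 705365, 2),
    (626, 2018, 12, 707472, 3),
    (626, 2018, None, 711372, 4),  # the duplicate year to filter out
]
conn.executemany("insert into t values (?, ?, ?, ?, ?)", rows)

# Same row_number() pattern, but ordered by load_seq so the choice is deterministic:
# the row with the smallest load_seq in each (id, year) group gets seqnum = 1.
query = """
select id, year, cnt, int_volume
from (select t.*,
             row_number() over (partition by id, year
                                order by load_seq) as seqnum
      from t
     ) t
where seqnum = 1
order by id, year
"""
deduped = conn.execute(query).fetchall()
for row in deduped:
    print(row)
```

With a real ordering column in place of load_seq, the row kept for each id and year no longer depends on how the database happens to return the rows.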


1 Comment

Man, no doubt you are a GENIUS with 812k reputation. THANK YOU.
