41

I have a table where each ID is repeated 3 times. there is a date in front of each id in each row.
I want to select entire row for each ID where date is latest. There are total 370 columns in this table i want all columns to get selected when i select that row.

Sample -

ID Name Date Marks .. .. .. 1 XY 4/3/2017 27 1 fv 4/3/2014 98 1 jk 4/3/2016 09 2 RF 4/12/2015 87 2 kk 4/3/2009 56 2 PP 4/3/2011 76 3 ee 4/3/2001 12 3 ppp 4/3/2003 09 3 lll 4/3/2011 23 

The Answer should be

ID Name Date Marks .. .. .. 1 XY 4/3/2017 27 2 RF 4/12/2015 87 3 lll 4/3/2011 23 

I am attempting as below -

select distinct ID,*,max(date) as maxdate from table 

Also i am trying this in Hive . so not sure if some sql functions dont work in Hive

Thanks

1

6 Answers 6

94

This question has been asked before. Please see this question.

Using the accepted answer and adapting it to your problem you get:

SELECT tt.* FROM myTable tt INNER JOIN (SELECT ID, MAX(Date) AS MaxDateTime FROM myTable GROUP BY ID) groupedtt ON tt.ID = groupedtt.ID AND tt.Date = groupedtt.MaxDateTime 
Sign up to request clarification or add additional context in comments.

3 Comments

looks like in the last line it should be AND tt.Date = groupedtt.MaxDateTime .... Thanks
Sure thing. If you liked the answer please upvote and select as the accepted answer.
what if date comes from user and same or prevoius dates come???
9

One way is:

select table.* from table join ( select ID, max(Date) as max_dt from table group by ID ) t on table.ID= t.ID and table.Date = t.max_dt 

Note that if you have multiple equally higher dates for same ID, then you will get all those rows in result

1 Comment

yes ..exactly thats' how it is happening ... then i think i can just remove the duplicate rows from entire database ... then those multiple rows will go away and for each id there will be only one row.
7

You can do this with a Correlated Subquery (That is a subquery wherein you reference a field in the main query). In this case:

SELECT * FROM yourtable t1 WHERE date = (SELECT max(date) from yourtable WHERE id = t1.id) 

Here we give the yourtable table an alias of t1 and then use that alias in the subquery grabbing the max(date) from the same table yourtable for that id.

5 Comments

this is a very slow way to do that.
@MaksimKniazev I disagree. MySQL's execution plan will be similar to the explicit INNER JOIN on an aggregated subquery in this case.
Takes forever when the data is huge.
@YaroslavDukal as opposed to a JOIN?
yes, join is more optimal
5

You can use a join to do this

SELECT t1.* from myTable t1 LEFT OUTER JOIN myTable t2 on t2.ID=t1.ID AND t2.`Date` > t1.`Date` WHERE t2.`Date` IS NULL; 

Only rows which have the latest date for each ID with have a NULL join to t2.

5 Comments

Old school. Nice.
This is giving me error as Both left and right aliases encountered in JOIN 'date'
Is that the error mysql is reporting? I don't know if its the problem but Date is a reserver word so I've updated my answer with backticks.
I think that's a HIVE error - My code is intended for MySql
LEFT OUTER JOIN is synonym to LEFT JOIN, its clearer that way I think to use the simpler terms...
1

Here's one way. The inner query gets the max date for each id. Then you can join that back to your main table to get the rows that match.

select * from <your table> inner join (select id, max(<date col> as max_date) m where yourtable.id = m.id and yourtable.datecolumn = m.max_date) 

Comments

-2

Have you tried the following:

SELECT ID, COUNT(*), max(date) FROM table GROUP BY ID; 

1 Comment

This will not return all of the fields in the table for the record with that id that has the max date.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.