22

I've the following table (my_data):

year | X | Y -----+-----+----- 2010 | A | 10 2011 | A | 20 2011 | B | 99 2009 | C | 30 2010 | C | 40 


what is the best / smallest SQL statement to retrieve only the data related to the highest year and grouped by 'X' , like this:

year | X | Y -----+-----+----- 2011 | A | 20 2011 | B | 99 2010 | C | 40 


Note that this result table will be used in a join.

2
  • 2
    Why 2009 is not their in the result????? Commented Jun 1, 2011 at 8:34
  • 1
    I forgot to include de information that the data should be grouped by the 'X' column. Commented Jun 1, 2011 at 8:37

12 Answers 12

29
select year, x,y from ( select year, x, y, max(year) over(partition by x) max_year from my data ) where year = max_year 
Sign up to request clarification or add additional context in comments.

2 Comments

Yes, that's quite an elegant way to do this
16
select * from ( select year, x, y, row_number() over (partition by x order by year desc ) rn from my_data ) where rn = 1 

1 Comment

Thanks for this answer, it works correct. However I did accept the 'shurik' for the answer because it's a little bit more easy to understand.
5

It's a lot simpler than the other solutions:

SELECT x, max(year), MAX(y) KEEP (DENSE_RANK FIRST ORDER BY year DESC) FROM table GROUP BY x 

Comments

4

You could also be portable and use an OUTER JOIN :

select t1.year, t1.x, t1.y from my_data t1 left join my_data t2 on t2.x = t1.x and t2.year > t1.year where t2.x is null 

Comments

3

Starting in Oracle AI Database 26ai (23.26.0), you can use the qualify clause to filter window functions:

with t (year, x, y) as ( values (2010, 'A', 10), (2011, 'A', 20), (2011, 'B', 99), (2009, 'C', 30), (2010, 'C', 40) ) select * from t qualify year = max ( year ) over ( partition by x ); YEAR X Y ---------- - ---------- 2011 A 20 2011 B 99 2010 C 40 

Comments

2

Gary Myers, your solution does not work, if, for example, for value A, year is smaller than 2010 and that year has maximum value. (FOR example, if row 2005,A,50 existed) In order to get correct solution, use the following. (which just swaps values)

SELECT x, max(y), MAX(year) KEEP (DENSE_RANK FIRST ORDER BY y DESC) FROM test GROUP BY x 

Comments

2

You can use common table expression (CTE), works also with duplicated rows (if required) Execution plan is the same, more or less

;With my_data_cte as ( SELECT [year], x,y,ROW_NUMBER() OVER ( PARTITION BY x ORDER BY [year] desc) as rn FROM [dbo].[my_data]) select [year], x,y from my_data_cte where rn = 1 

Comments

2
select year, x, y from my_data stable where stable.year = (select max(year) from my_data tables where tables.x = stable.x); 

1 Comment

@Nikolay It seems that you add highlight to the original code. right? Thanks
0
-- I had a slightly different case and just wandering why this one should't work SELECT my_data.x , my_data.y , my_data1.max_year FROM my_data INNER JOIN ( SELECT x , max (year ) as max_year FROM my_data -- WHERE 1=1 -- AND FILTER1=VALUE1 GROUP BY my_data.x ) my_data1 ON ( my_data.x = my_data1.x ) 

2 Comments

Why LEFT JOIN rather than INNER JOIN? Why WHERE 1=1? Also using a join is less efficient than using MAX(...) KEEP( DENSE_RANK ... ).
txn for the correction => fixed to inner join + added the where for dev user friendliness ...
0

You can select the most recent year for each X by using a subquery in your criteria:

select a.year, a.x, a.y from my_data a where a.year = ( select max(a_yr.year) from my_data a_yr where a_yr.x = a.x); 

Data:

year | X | Y -----+-----+----- 2010 | A | 10 2011 | A | 20 2011 | B | 99 2009 | C | 30 2010 | C | 40 

Results:

year | X | Y -----+-----+----- 2011 | A | 20 2011 | B | 99 2010 | C | 40 

Based on my limited testing, this method seems to be faster than using partition by.

Comments

-3

this can also be the solution

select greatest( (e),(g),(c),(a),(b) ) as latestdate from abc

1 Comment

Thank you for this code snippet, which may provide some immediate help. A proper explanation would greatly improve its educational value by showing why this is a good solution to the problem, and would make it more useful to future readers with similar, but not identical, questions. Please edit your answer to add explanation, and give an indication of what limitations and assumptions apply.
-4

The simplest is

Select * from table where year = (select max(year) from table) 

It will possibly cause a table scan unless there is an index on year. But with an index should be performant

2 Comments

But this will not return the value for 'C' and 2010, as requested by the OP.
That won't produce the desired result

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.