0

I am new to Postgres, I used to work with MySQL.

My problem is simple, it is also never happens in MySQL. t1 is my main table and t2 holds the history of updates that occurred to a field in t1 called status. So whenever an entry in t1 changes it's status field, a new record is added to t2 with the current timestamp and the new status value.

Now I want to to get the latest status and timestamp of a specific entry in t1, for example that with id 1271. I can get the latest timestamp with the following query.

SELECT t1.id, t1.message, MAX(t2.creation_timestamp) FROM table_1 t1 LEFT JOIN table_2 t2 ON t1.id = t2.table_1_id WHERE t1.id = 1271 GROUP BY t1.id,t1.message 

But when I try to also get the status of that specific row with the MAX timestamp.

SELECT t1.id, t1.message, t2.status, MAX(t2.creation_timestamp) FROM table_1 t1 LEFT JOIN table_2 t2 ON t1.id = t2.table_1_id WHERE t1.id = 1271 GROUP BY t1.id,t1.message 

I get the following error:

ERROR: column "t2.status" must appear in the GROUP BY clause or be used in an aggregate function 

How can I get the status of the record with the MAX timestamp ?

2
  • Or maybe mysql is smart enough to understand that since the query will return one row from the MAX aggregate then it will return that other fields from that specific if they exist in the select. Anyway how to do it in Postgresql ? Commented Dec 30, 2016 at 11:14
  • Postgres would be smart enough if you define proper primary and foreign keys. See my answer Commented Dec 30, 2016 at 11:26

3 Answers 3

5

Your query is rejected by Postgres because it's invalid SQL. The problem exists in MySQL as well, you were just lucky so far because MySQL chooses to return random values rather then rejecting the invalid group by (you might want to read this or this for details on the MySQL implementation)

The most efficient solution in Postgres is to use distinct on ()

SELECT distinct on (t1.id) t1.id, t1.message, t2.status, t2.creation_timestamp FROM table_1 t1 LEFT JOIN table_2 t2 ON t1.id = t2.table_1_id WHERE t1.id = 1271 ORDER BY t1.id, t2.creation_timestamp DESC; 

However, if id is defined as the primary key of table_1 and there is a proper foreign key relationship between the two tables, Postgres will accept the partial group by as it knows that the id is unique.

psql (9.6.1) Type "help" for help. postgres=> create table table_1 (id integer primary key, message text); CREATE TABLE postgres=> create table table_2 (table_1_id integer references table_1, status text, creation_timestamp timestamp); CREATE TABLE postgres=> insert into table_1 postgres-> values postgres-> (1271, 'one'), postgres-> (1272, 'two'), postgres-> (1273, 'three'); INSERT 0 3 postgres=> insert into table_2 postgres-> values postgres-> (1271, 'active', timestamp '2016-12-30 10:00:00'), postgres-> (1271, 'active', timestamp '2016-12-30 11:00:00'), postgres-> (1271, 'active', timestamp '2016-12-30 12:00:00'), postgres-> (1272, 'active', timestamp '2016-12-30 11:00:00'), postgres-> (1272, 'active', timestamp '2016-12-30 12:00:00'), postgres-> (1273, 'active', timestamp '2016-12-30 13:00:00'), postgres-> (1273, 'active', timestamp '2016-12-30 13:00:00'); INSERT 0 7 postgres=> SELECT postgres-> t1.id, postgres-> t1.message, postgres-> MAX(t2.creation_timestamp) postgres-> FROM table_1 t1 postgres-> LEFT JOIN table_2 t2 ON t1.id = t2.table_1_id postgres-> WHERE t1.id = 1271 postgres-> GROUP BY t1.id postgres-> ; id | message | max ------+---------+--------------------- 1271 | one | 2016-12-30 12:00:00 (1 row) 

SQLFiddle example: http://sqlfiddle.com/#!15/7cfc8/1

Sign up to request clarification or add additional context in comments.

Comments

1

you should have latest status on table_1, you should not need it from table_2, you need only latest timestamp

SELECT t1.id, t1.message, t1.status, tmax.creation_timestamp from table_1 t1 left join ( select table_1_id, MAX(creation_timestamp) creation_timestamp from table_2 group by table_1_id ) tmax on tmax.table_1_id = t1.id WHERE t1.id = 1271 

Comments

0

Join using subquery selecting max(creation_timestamp)

SELECT t1.id, t1.message, t2.status, t2.creation_timestamp FROM table_1 t1 LEFT JOIN table_2 t2 ON t1.id = t2.table_1_id and t2.creation_timestamp = (SELECT MAX(creation_timestamp) from table_2 t3 where t1.id = t3.table_1_id ) WHERE t1.id = 1271 

3 Comments

you missed that table_2 holds info of different ID's (table_1_id)
I still don't get what I missed. I just added additional condition to the join clause. What is wrong about it ? Join works as it worked before but instead of joining all timestamps it joins only the one which has tme max(creation_timestamp)
the select max() is wrong, must be grouped by table_1_id

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.