Distinct values from an array?

Question

Following tables:

CREATE TEMPORARY TABLE guys ( guy_id integer primary key, guy text ); CREATE TEMPORARY TABLE sales ( log_date date, sales_guys integer[], sales smallint ); INSERT INTO guys VALUES(1,'john'),(2,'joe'); INSERT INTO sales VALUES('2016-01-01', '{1,2}', 2),('2016-01-02','{1,2}',4);

Following query works great to show names on a given date:

SELECT log_date, sales_guys, ARRAY_AGG(guy), sales FROM sales JOIN guys ON guys.guy_id = ANY(sales.sales_guys) GROUP BY log_date, sales_guys, sales ORDER BY log_date ASC; log_date | sales_guys | array_agg | sales ------------+------------+------------+------- 2016-01-01 | {1,2} | {john,joe} | 2 2016-01-02 | {1,2} | {john,joe} | 4

Following query problematically gives me a name per date per guy, so here each name twice, and so on):

SELECT sales_guys, ARRAY_AGG(guy), SUM(sales) AS sales FROM sales JOIN guys ON guys.guy_id = ANY(sales.sales_guys) GROUP BY sales_guys;

Yields:

 sales_guys | array_agg | sales ------------+---------------------+------- {1,2} | {john,joe,john,joe} | 12

Is there a way to somehow reduce the ARRAY_AGG call to give only the unique names?

Please clarify what you want to aggregate exactly. Sales for the same team? Would you group {1,2} with {2,1}? And always declare your version of Postgres. — Erwin Brandstetter
– Erwin Brandstetter, Commented Nov 25, 2016 at 20:03

Community · Accepted Answer · 2020-06-20 09:12:55Z

There is no kind of order you can trust without ORDER BY. Except that elements of arrays, when unnested, come in array order. If your query does more with the result, it may be re-ordered, though.

You an simply add ORDER BY to any aggregate function in Postgres:

SELECT s.sales_guys, ARRAY_AGG(DISTINCT g.guy ORDER BY g.guy) AS names, SUM(s.sales) AS sum_sales FROM sales s JOIN guys g ON g.guy_id = ANY(s.sales_guys) GROUP BY s.sales_guys;

But that's obviously not the original order of array elements. And the query has other issues ... Neither IN nor = ANY() care about order of elements in the set, list or array on the right side:

How to use ANY instead of IN in a WHERE clause with Rails?

Proper solution

For this task (attention to the details!):

Get the total sales per array sales_guys, where the order of elements makes a difference (arrays '{1,2}' and '{2,1}' are not the same) and sales_guys has neither duplicate nor NULL elements. Add an array of resolved names in matching order.

Use unnest() with WITH ORDINALITY. and aggregate arrays before you resolve names, that's cheaper and less error prone.

SELECT s.*, g. FROM ( SELECT sales_guys, sum (sales) AS total_sales -- aggregate first in subquery FROM sales GROUP BY 1 ) s , LATERAL ( SELECT array_agg(guy ORDER BY ord) AS names -- order by original order FROM unnest(s.sales_guys) WITH ORDINALITY sg(guy_id, ord) -- with order of elements LEFT JOIN guys g USING (guy_id) -- LEFT JOIN to add NULL for missing guy_id ) g;

The LATERAL subquery can be joined with unconditional CROSS JOIN - comma (,) is shorthand notation - because the aggregate in the subquery guarantees a result for every row. Else you'd use LEFT JOIN LATERAL .. ON true.

Detailed explanation:

clemens · Accepted Answer · 2016-11-25 18:05:35Z

0

You can use DISTINCT inside of the aggregate:

SELECT sales_guys, ARRAY_AGG(DISTINCT guy), SUM(sales) AS sales FROM sales JOIN guys ON guys.guy_id = ANY(sales.sales_guys) GROUP BY sales_guys;

answered Nov 25, 2016 at 18:05

clemens

17.9k12 gold badges52 silver badges71 bronze badges

3 Comments

Wells Over a year ago

dang, that's obvious. Woof. Is there some kind of order you can trust in the DISTINCT call? Would they be ordered here by guy_id in guys or guy in guys or is it effectively random?

clemens Over a year ago

I think there is no defined order. You can fetch the guys in the join by an ordered subquery or view first to get an ordered array.

Wells Over a year ago

hmm, I kind of hoped to get a guy_id array, and a guy array such that I could trust guy_id[1] == guy[1] but it doesn't seem possible w/o a subquery.

Collectives™ on Stack Overflow

Distinct values from an array?

2 Answers 2

Proper solution

Comments

3 Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Proper solution

Comments

3 Comments

Linked

Related