
I have a table with duplicate skus.

    skua
    skua
    skub
    skub
    skub
    skuc
    skuc
    skud

    SELECT sku, COUNT(1) AS `Count` FROM products GROUP BY sku;

shows me all the skus that have duplicates and the number of duplicates

    skua  2
    skub  3
    skuc  2
    skud  1

I am trying to find how many skus have 2 duplicates, how many have 3 duplicates, etc.

i.e.

    duplicated  count
    1           1  (skud)
    2           2  (skua and skuc)
    3           1  (skub)

and I don't know how to write the SQL. I imagine it needs a subselect...

thanks

    wrap your query in another one: select group_concat(sku), Count from (...your other query...) group by count Commented Jun 24, 2014 at 18:54

2 Answers


Just use your current query as an inline view, and use the rows it returns just as if they came from a table.

e.g.

    SELECT t.Count AS `duplicated`
         , COUNT(1) AS `count`
      FROM ( SELECT sku, COUNT(1) AS `Count` FROM products GROUP BY sku
           ) t
     GROUP BY t.Count

MySQL refers to an inline view as a "derived table", and that name makes sense when we understand how MySQL actually processes it. MySQL runs the inner query and stores its result in an internal temporary table (which storage engine backs that table depends on the MySQL version and the size of the result); once that is done, MySQL runs the outer query against that temporary table. (You'll see that if you run an EXPLAIN on the query.)

Above, I left your query just as you formatted it; I'd tend to reformat it, so that the entire query looks like this:

    SELECT t.Count  AS `duplicated`
         , COUNT(1) AS `count`
      FROM ( SELECT p.sku
                  , COUNT(1) AS `Count`
               FROM products p
              GROUP BY p.sku
           ) t
     GROUP BY t.Count

(Just makes it easier for me to see the inner query, and easier to extract it and run it separately. And qualifying all column references (with a table alias or table name) is a best practice.)
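If you want to try the pattern without a MySQL server at hand, here is a minimal sketch that runs the same nested aggregation on the sample skus from the question. It assumes Python's built-in sqlite3 module as a stand-in for MySQL; the function name `duplicate_histogram` and the aliases `cnt` and `sku_count` are made up for illustration, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3


def duplicate_histogram(skus):
    """Return (duplicated, sku_count) rows: how many skus occur 1x, 2x, 3x, ..."""
    # Assumption: an in-memory SQLite database stands in for the MySQL table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (sku TEXT)")
    conn.executemany(
        "INSERT INTO products (sku) VALUES (?)", [(s,) for s in skus]
    )
    rows = conn.execute(
        """
        SELECT t.cnt    AS duplicated
             , COUNT(1) AS sku_count
          FROM ( SELECT p.sku
                      , COUNT(1) AS cnt      -- occurrences of each sku
                   FROM products p
                  GROUP BY p.sku
               ) t                           -- inline view / derived table
         GROUP BY t.cnt                      -- one row per occurrence count
         ORDER BY t.cnt
        """
    ).fetchall()
    conn.close()
    return rows


sample = ["skua", "skua", "skub", "skub", "skub", "skuc", "skuc", "skud"]
print(duplicate_histogram(sample))  # [(1, 1), (2, 2), (3, 1)]
```

The inner query is the one from the question; the outer query simply groups its rows again, this time by the count.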


3 Comments

I missed that OP wanted to return a list of the sku; I viewed that part as comments next to the result set, illustrating how each "count" value was derived. Other answers sufficiently demonstrate the use of the GROUP_CONCAT aggregate function.
This works perfectly, thank you. Actually, I showed the breakdown only for illustration, to help explain my question. In reality GROUP_CONCAT would not be practical, since the duplicates run into the thousands.
@sdfor: given that you didn't know how to return this resultset, I thought it would be helpful to give some commentary about how this works, rather than just posting a SQL statement, so that you (and future readers) will be able to apply the same pattern to a similar problem in the future. (And with thousands of duplicates you would run into a limitation of the GROUP_CONCAT function: its result is truncated at group_concat_max_len bytes, 1024 by default.)
    select dup_count as duplicated
         , count(*) as `count`
         , group_concat(sku) as skus
      from ( SELECT sku, COUNT(1) AS dup_count FROM products GROUP BY sku
           ) tmp_tbl
     group by dup_count
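As a rough way to see what the extra column returns, here is a sketch that runs this query on the sample skus from the question. It assumes Python's built-in sqlite3 module in place of MySQL (SQLite also provides group_concat, with a comma as the default separator); the function name `duplicate_groups` and the alias `sku_count` are hypothetical. The order of values inside a concatenated list is not guaranteed, so the sketch splits and sorts them.

```python
import sqlite3


def duplicate_groups(skus):
    """Return (duplicated, sku_count, [skus]) rows, one per occurrence count."""
    # Assumption: an in-memory SQLite database stands in for the MySQL table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (sku TEXT)")
    conn.executemany(
        "INSERT INTO products (sku) VALUES (?)", [(s,) for s in skus]
    )
    rows = conn.execute(
        """
        SELECT dup_count         AS duplicated
             , COUNT(*)          AS sku_count
             , GROUP_CONCAT(sku) AS skus
          FROM ( SELECT sku, COUNT(1) AS dup_count
                   FROM products
                  GROUP BY sku
               ) tmp_tbl
         GROUP BY dup_count
         ORDER BY dup_count
        """
    ).fetchall()
    conn.close()
    # Split the comma-separated list and sort it for a stable result.
    return [(dup, n, sorted(names.split(","))) for dup, n, names in rows]


sample = ["skua", "skua", "skub", "skub", "skub", "skuc", "skuc", "skud"]
for row in duplicate_groups(sample):
    print(row)
# (1, 1, ['skud'])
# (2, 2, ['skua', 'skuc'])
# (3, 1, ['skub'])
```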
