Selecting distinct values for multiple columns

Question

I have a table in which many values in one column map to a single value in another column, similar to a tree, with additional data at each 'leaf' describing that specific leaf.

For example:

Food Group   Name       Caloric Value
Vegetables   Broccoli   100
Vegetables   Carrots    80
Fruits       Apples     120
Fruits       Bananas    120
Fruits       Oranges    90

I would like to design a query that returns only the distinct values of each column, with NULLs filling the rows where a column has run out of values.

For example:

Food Group   Name       Caloric Value
Vegetables   Broccoli   100
Fruits       Carrots    80
NULL         Apples     120
NULL         Bananas    90
NULL         Oranges    NULL

I'm not sure whether this is possible. Right now I've been trying to do it with CASE expressions, but I was hoping there would be a simpler way.

Did you mean "Carrots 80, Fruits Apples 120"? Also, if you are trying to build a tree structure, why not do it in your server-side code?
– Matt, Commented Mar 4, 2011 at 16:13
I meant to put Fruit where I put it, but if you know a way to do it where Fruit would be on the next line, that works too. I will be reading this into Visual Studio, where I will be creating a tree in a Windows form. It isn't really a tree in the database; it is just similar in that it has categories and subcategories. I just want to avoid having to isolate the distinct values in C# because this is a very large database.
– jas, Commented Mar 4, 2011 at 16:23

Andriy M · Accepted Answer · 2011-07-06 13:03:02Z

It seems you are simply trying to have all the distinct values at hand. Why? For display purposes? That's the application's job, not the server's. You could simply run three queries like this:

SELECT DISTINCT [Food Group] FROM atable;
SELECT DISTINCT Name FROM atable;
SELECT DISTINCT [Caloric Value] FROM atable;

and display their results accordingly.
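As a rough sketch of the "separate queries, one connection" idea, the snippet below runs one SELECT DISTINCT per column over a single open connection and collects the results per column. It assumes Python's built-in sqlite3 module as a stand-in for the real server, and the helper name distinct_values is made up for illustration; only the table name atable and the sample rows come from the thread.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("Vegetables", "Broccoli", 100),
    ("Vegetables", "Carrots", 80),
    ("Fruits", "Apples", 120),
    ("Fruits", "Bananas", 120),
    ("Fruits", "Oranges", 90),
]
COLUMNS = ["Food Group", "Name", "Caloric Value"]


def distinct_values(conn, columns):
    """Hypothetical helper: one SELECT DISTINCT per column, same connection."""
    result = {}
    for col in columns:
        # Column names come from a fixed list here, never from user input.
        sql = f'SELECT DISTINCT "{col}" FROM atable ORDER BY 1'
        result[col] = [row[0] for row in conn.execute(sql)]
    return result


# Assumption: an in-memory SQLite database stands in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE atable ("Food Group" TEXT, "Name" TEXT, "Caloric Value" INTEGER)'
)
conn.executemany("INSERT INTO atable VALUES (?, ?, ?)", ROWS)

filters = distinct_values(conn, COLUMNS)
for column, values in filters.items():
    print(f"{column}: {values}")
```

Each list can then feed its own filter control, and adding a column only means adding a name to COLUMNS.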

But if you insist on having them all in one table, you might try this:

WITH atable ([Food Group], Name, [Caloric Value]) AS (
  SELECT 'Vegetables', 'Broccoli', 100 UNION ALL
  SELECT 'Vegetables', 'Carrots', 80 UNION ALL
  SELECT 'Fruits', 'Apples', 120 UNION ALL
  SELECT 'Fruits', 'Bananas', 120 UNION ALL
  SELECT 'Fruits', 'Oranges', 90
),
atable_numbered AS (
  SELECT
    [Food Group], Name, [Caloric Value],
    fg_rank = DENSE_RANK() OVER (ORDER BY [Food Group]),
    n_rank  = DENSE_RANK() OVER (ORDER BY Name),
    cv_rank = DENSE_RANK() OVER (ORDER BY [Caloric Value])
  FROM atable
)
SELECT
  fg.[Food Group],
  n.Name,
  cv.[Caloric Value]
FROM (
  SELECT fg_rank FROM atable_numbered
  UNION
  SELECT n_rank FROM atable_numbered
  UNION
  SELECT cv_rank FROM atable_numbered
) r (rank)
  LEFT JOIN (
    SELECT DISTINCT [Food Group], fg_rank
    FROM atable_numbered) fg ON r.rank = fg.fg_rank
  LEFT JOIN (
    SELECT DISTINCT Name, n_rank
    FROM atable_numbered) n ON r.rank = n.n_rank
  LEFT JOIN (
    SELECT DISTINCT [Caloric Value], cv_rank
    FROM atable_numbered) cv ON r.rank = cv.cv_rank
ORDER BY r.rank
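To see what this kind of query returns, here is a hedged port of the same DENSE_RANK idea to SQLite, run from Python. It is not the T-SQL above: the alias syntax is changed to AS, the rank list is moved into a CTE named ranks (a name introduced here), and window functions need SQLite 3.25 or later. Treat it as a sketch of the technique under those assumptions.

```python
import sqlite3

ROWS = [
    ("Vegetables", "Broccoli", 100),
    ("Vegetables", "Carrots", 80),
    ("Fruits", "Apples", 120),
    ("Fruits", "Bananas", 120),
    ("Fruits", "Oranges", 90),
]

# SQLite adaptation of the answer's approach: rank each column's distinct
# values, then line the columns up by rank number.
QUERY = """
WITH atable_numbered AS (
    SELECT "Food Group", "Name", "Caloric Value",
           DENSE_RANK() OVER (ORDER BY "Food Group")    AS fg_rank,
           DENSE_RANK() OVER (ORDER BY "Name")          AS n_rank,
           DENSE_RANK() OVER (ORDER BY "Caloric Value") AS cv_rank
    FROM atable
),
ranks (rnk) AS (
    SELECT fg_rank FROM atable_numbered
    UNION SELECT n_rank FROM atable_numbered
    UNION SELECT cv_rank FROM atable_numbered
)
SELECT fg."Food Group", n."Name", cv."Caloric Value"
FROM ranks r
LEFT JOIN (SELECT DISTINCT "Food Group", fg_rank FROM atable_numbered) fg
       ON r.rnk = fg.fg_rank
LEFT JOIN (SELECT DISTINCT "Name", n_rank FROM atable_numbered) n
       ON r.rnk = n.n_rank
LEFT JOIN (SELECT DISTINCT "Caloric Value", cv_rank FROM atable_numbered) cv
       ON r.rnk = cv.cv_rank
ORDER BY r.rnk
"""

conn = sqlite3.connect(":memory:")  # stand-in for the real server
conn.execute(
    'CREATE TABLE atable ("Food Group" TEXT, "Name" TEXT, "Caloric Value" INTEGER)'
)
conn.executemany("INSERT INTO atable VALUES (?, ?, ?)", ROWS)

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
# First row printed: ('Fruits', 'Apples', 80)
# Last row printed:  (None, 'Oranges', None)
```

Note that the values sharing a row are unrelated: 'Apples' sits next to 80 only because both happen to be first in their own sort order.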

I was doing them all separately before, but it was taking a long time, and I am trying to make it as quick as possible.
The query takes more than 10 minutes to run using only the first three columns, so I'm not sure whether it does what I need, but it's definitely not going to be fast enough. It may be that what I am trying to do isn't possible.
@jas: Maybe your table lacks indexes on those three columns? As for my solution, there was a mistake in it, which I've now fixed, only I doubt it has made the query faster. Anyway, I added a testing table (as a CTE) to demonstrate that the method works.
The lack of an index could be the issue, but I am working with a view that cannot be indexed because I need to use left joins to create it.
@jas: Doesn't that mean that the data you are trying to get from the view are already stored somewhere as distinct values? Then why select from the view if you can select from the original tables?

Michael Blackburn · Accepted Answer · 2011-03-04 16:54:30Z

I guess what I would want to know is why you need this in one result set? What does the code look like that would consume this result? The attributes on each row have nothing to do with each other. If you want to, say, build the contents of a set of drop-down boxes, you're better off doing these one at a time. In your requested result set, you'd need to iterate through the dataset three times to do anything useful, and you would need to either check for NULL each time or needlessly iterate all the way to the end of the dataset.

If this is in a stored procedure, couldn't you run three separate SELECT DISTINCT queries and return the values as three result sets? Then you can consume them one at a time, which I would guess is what you would be doing anyway.

If there REALLY IS a connection between the values, you could add each of the results to an array or list, then access all three lists in parallel using the index.
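A small sketch of the "parallel lists" idea on the client side, in Python rather than the asker's C#: itertools.zip_longest walks several lists by index and pads the shorter ones with None. The three lists below are assumed to be the already-fetched distinct values from the question's sample data.

```python
from itertools import zip_longest

# Assumed inputs: distinct values already fetched, one list per column.
food_groups = ["Fruits", "Vegetables"]
names = ["Apples", "Bananas", "Broccoli", "Carrots", "Oranges"]
calories = [80, 90, 100, 120]

# Pair the lists up by index; exhausted lists are padded with None.
rows = list(zip_longest(food_groups, names, calories))

for food_group, name, caloric_value in rows:
    print(food_group, name, caloric_value)
```

This produces the NULL-padded shape the question asks for without any extra work in the database.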

I am creating a custom set of filters for a report, so the user can see all of the active values and isolate the data they are looking for. I would read each of the values, along with its column name, into a list of objects and then use this list to create the trees. Previously I was doing it with separate queries, but there are 13 columns, with the potential for more to be added, so to avoid multiple connections to the database I would like to do it all in one go.
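One possible way to get "value plus column name" in a single round trip is a UNION of one SELECT per column, each tagged with its column name. This is only a sketch: it uses Python's sqlite3 as a stand-in server, the helper build_union_query is hypothetical, and the CAST to text is there because most servers want every UNION branch to have a compatible type. Whether it is fast enough on a large unindexed view is untested.

```python
import sqlite3
from collections import defaultdict

ROWS = [
    ("Vegetables", "Broccoli", 100),
    ("Vegetables", "Carrots", 80),
    ("Fruits", "Apples", 120),
    ("Fruits", "Bananas", 120),
    ("Fruits", "Oranges", 90),
]
COLUMNS = ["Food Group", "Name", "Caloric Value"]

TEMPLATE = """SELECT '{col}' AS column_name, CAST("{col}" AS TEXT) AS value FROM {table}"""


def build_union_query(table, columns):
    """Hypothetical helper: one UNION query yielding (column_name, value)."""
    # UNION (not UNION ALL) removes duplicate pairs, so values come back distinct.
    parts = [TEMPLATE.format(col=col, table=table) for col in columns]
    return "\nUNION\n".join(parts) + "\nORDER BY 1, 2"


conn = sqlite3.connect(":memory:")  # stand-in for the real server
conn.execute(
    'CREATE TABLE atable ("Food Group" TEXT, "Name" TEXT, "Caloric Value" INTEGER)'
)
conn.executemany("INSERT INTO atable VALUES (?, ?, ?)", ROWS)

# Group the (column_name, value) pairs by column on the client.
values = defaultdict(list)
for column_name, value in conn.execute(build_union_query("atable", COLUMNS)):
    values[column_name].append(value)

for column_name, column_values in values.items():
    print(column_name, column_values)
```

Because the values are cast to text, numeric columns come back as strings and sort as strings; convert them on the client if that matters.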

user330315 · Accepted Answer · 2011-03-04 17:14:09Z

Something like this maybe?

select *
from (
  select case when row_number() over (partition by fruit_group) = 1 then fruit_group else null end as fruit_group,
         case when row_number() over (partition by name) = 1 then name else null end as name,
         case when row_number() over (partition by caloric) = 1 then caloric else null end as caloric
  from your_table
) t
where fruit_group is not null
   or name is not null
   or caloric is not null

But I fail to see any sense in this.

I can't seem to get this query working because there is no order by in the over clause?
@jas: The order by is not really needed, as the partition by will only select the same values anyway. But if you need one, I'd suggest ordering by the respective column: over (partition by fruit_group order by fruit_group)
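For reference, here is a sketch of the query with the suggested order by added to each window, run against the question's sample rows. It assumes SQLite 3.25+ through Python's sqlite3 as a stand-in server, and reuses the answer's placeholder names (your_table, fruit_group, name, caloric).

```python
import sqlite3

ROWS = [
    ("Vegetables", "Broccoli", 100),
    ("Vegetables", "Carrots", 80),
    ("Fruits", "Apples", 120),
    ("Fruits", "Bananas", 120),
    ("Fruits", "Oranges", 90),
]

# Same shape as the answer's query, with an ORDER BY in every OVER clause.
QUERY = """
select *
from (
    select
        case when row_number() over (partition by fruit_group order by fruit_group) = 1
             then fruit_group else null end as fruit_group,
        case when row_number() over (partition by name order by name) = 1
             then name else null end as name,
        case when row_number() over (partition by caloric order by caloric) = 1
             then caloric else null end as caloric
    from your_table
) t
where fruit_group is not null
   or name is not null
   or caloric is not null
"""

conn = sqlite3.connect(":memory:")  # stand-in for the real server
conn.execute("create table your_table (fruit_group text, name text, caloric integer)")
conn.executemany("insert into your_table values (?, ?, ?)", ROWS)

rows = conn.execute(QUERY).fetchall()

# Count how many rows carry a non-null value in each column.
shown = [sum(value is not None for value in column) for column in zip(*rows)]
print(len(rows), shown)
```

On this sample every name is unique, so all five rows survive the filter. Each distinct value does appear exactly once, but the values are scattered across rows rather than packed at the top, and which row carries a repeated value is not guaranteed.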
