5

How can I search for a single row in a table with decreasing precision, for example like this:

SELECT * FROM image WHERE name LIKE 'text' AND group_id = 10 LIMIT 1 

When this gives me no result, try this one:

SELECT * FROM image WHERE name LIKE 'text' LIMIT 1 

And when this gives me no result, try this one:

SELECT * FROM image WHERE group_id = 10 LIMIT 1 

Is it possible to do that with just one expression?

A further problem arises when I have not two but, say, three or more search parameters. Is there a generic solution for that? Of course it would come in handy when the search result is sorted by its relevance.

2
  • Do you mean name LIKE '%text%' or name = 'text'? name LIKE 'text' is rather pointless. This is important for performance and has bearing on the solution. Also, the second part of your question is unclear. Why sort when you only return one row? Please define more clearly what you are after. Commented Jan 16, 2013 at 18:30
  • It's just a detail but it could be anything :) Commented Jan 30, 2013 at 10:46

4 Answers

5

Test setup

CREATE TABLE image (
  image_id serial PRIMARY KEY
, group_id int NOT NULL
, name text NOT NULL
);

Indexes are the key ingredient for performance. Ideally, you create these two in addition to the primary key:

CREATE INDEX image_name_grp_idx ON image (name, group_id);
CREATE INDEX image_grp_idx ON image (group_id);

The second may not be necessary, depending on data distribution and other details.

Query

Update: this becomes unreliable in Postgres 11 or later when Parallel Append is used for big sets! Consider this question and answers (incl. a reliable alternative in my answer):


This should be the fastest possible query for your case:

SELECT * FROM image WHERE name = 'name105' AND group_id = 10
UNION ALL
SELECT * FROM image WHERE name = 'name105'
UNION ALL
SELECT * FROM image WHERE group_id = 10
LIMIT 1;


LIKE without a wildcard character is equivalent to =.

The LIMIT clause applies to the whole query. Postgres is smart enough not to execute later legs of the UNION ALL as soon as it has found enough rows to satisfy the LIMIT. Consequently, for a match in the first SELECT of the query, the output of EXPLAIN ANALYZE looks like this (scroll to the right!):

Limit  (cost=0.00..0.86 rows=1 width=40) (actual time=0.045..0.046 rows=1 loops=1)
  Buffers: local hit=4
  ->  Result  (cost=0.00..866.59 rows=1002 width=40) (actual time=0.042..0.042 rows=1 loops=1)
        Buffers: local hit=4
        ->  Append  (cost=0.00..866.59 rows=1002 width=40) (actual time=0.039..0.039 rows=1 loops=1)
              Buffers: local hit=4
              ->  Index Scan using image_name_grp_idx on image  (cost=0.00..3.76 rows=2 width=40) (actual time=0.035..0.035 rows=1 loops=1)
                    Index Cond: ((name = 'name105'::text) AND (group_id = 10))
                    Buffers: local hit=4
              ->  Index Scan using image_name_grp_idx on image  (cost=0.00..406.36 rows=500 width=40) (never executed)
                    Index Cond: (name = 'name105'::text)
              ->  Index Scan using image_grp_idx on image  (cost=0.00..406.36 rows=500 width=40) (never executed)
                    Index Cond: (group_id = 10)
Total runtime: 0.087 ms

The emphasis on "never executed" is mine.

Do not add an outer ORDER BY clause; that would void the effect, because Postgres would then have to consider all rows before returning the top row.

Final questions

Is there a generic solution for that?

This is the generic solution. Add as many SELECT statements as you want.
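
For three or more search parameters, writing every SELECT by hand gets tedious. The following is a minimal sketch of generating the UNION ALL legs from most to least precise, not the answer author's code. The helper names `build_fallback_query` and `find_one` are hypothetical, the ordering rule (more matching conditions first, earlier parameters win ties) is an assumption, and Python's built-in sqlite3 module stands in for Postgres only so the example is self-contained. A Postgres driver would use its own placeholder style, and the Parallel Append caveat mentioned above still applies there.

```python
import sqlite3
from itertools import combinations


def build_fallback_query(table, conditions):
    """Return (sql, params) for a UNION ALL query with decreasing precision.

    `conditions` is an ordered list of (column, value) pairs. Legs that use
    more conditions come first; among legs of equal size, earlier pairs win.
    Table and column names are interpolated as-is, so they must come from
    trusted code, never from user input. Values are passed as parameters.
    """
    if not conditions:
        raise ValueError("at least one condition is required")
    legs, params = [], []
    for size in range(len(conditions), 0, -1):
        # combinations() keeps the input order, so priorities are preserved.
        for subset in combinations(conditions, size):
            where = " AND ".join(f"{col} = ?" for col, _ in subset)
            legs.append(f"SELECT * FROM {table} WHERE {where}")
            params.extend(val for _, val in subset)
    # A single LIMIT at the end applies to the whole compound query.
    return " UNION ALL ".join(legs) + " LIMIT 1", params


def find_one(conn, table, conditions):
    """Run the generated query and return the first row, or None."""
    sql, params = build_fallback_query(table, conditions)
    return conn.execute(sql, params).fetchone()
```

With the two parameters from the question, `build_fallback_query("image", [("name", "text"), ("group_id", 10)])` yields the same three legs as the hand-written query. Note that n parameters produce 2^n - 1 legs, so this only stays practical for a handful of parameters.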

Of course it would come in handy when the search result is sorted by its relevance.

There is only one row in the result with LIMIT 1. Kind of voids sorting.


8 Comments

Very convincing indeed. Today/yesterday I've learnt something new.
@Erin: Your query is awesome, and postgres is just awesome! The question has LIKE though, not = in the second comparison. This may make a difference in some scenarios, don't you think?
@ypercube: Thanks. :) Did you see my first paragraph and the comment under the question concerning LIKE vs. =? The principle of UNION ALL stopping evaluation as soon as it has found enough rows should apply to any SELECT statement. If we were talking about fuzzy string matching, things would get more complicated ...
Oh, ok, just noticed the first line. I was wondering about the case when the query has LIKE '%text%' and there is a row with group_id = 10 but none that matches the LIKE. In that case, after the first part of the UNION is run (and returns no rows), the 2nd part will be run (which will also return no rows). And then the 3rd part, which will give 1 result. But if we had run the three parts with the 1st-3rd order, it would be faster (and there would be really no need to run the 2nd part in that specific case).
@ypercube: Yeah, if one particular condition is much more expensive than others, combining SELECTs may help to further optimize. For non-left-anchored LIKE I would start with a GIN index using gin_trgm_ops to support that.
3

It's late and I don't feel like writing out a full solution, but if I needed this I would probably create a custom function that returned a custom type, a record, or a table (depending on what your needs are). The advantage to this would be that once you found your record, you could stop.

Making the number of params dynamic will make it a bit more challenging. Depending on your version of PostgreSQL (and the extensions available to you), you might be able to pass in an hstore or json and dynamically build the query.

Maybe not the greatest SO answer, but it's more than a comment and hopefully some food for thought.

6 Comments

If I understood you correctly, your assumption that a UDF will return rows one by one is wrong. Can't find a link to augment this yet.
The OP doesn't exactly say what he is trying to do here. Maybe he just needs the record's primary key value back? (obviously a guess here). I was just trying to point out that you could wrap the logic he wanted above in a UDF and stop when the record was found and return something (that something is not clear). I was just trying to highlight an approach to solving the problem that had not been mentioned yet. Also, it seemed like if he wanted to make it more flexible, that a UDF was likely the way to go but would be messy.
I did not want to argue with your overall point, just nitpicked a bit :)
A UNION ALL query with a LIMIT clause happens to stop evaluation automatically as soon as it has found enough rows. I demonstrate in another answer. Barring that I would have gone for a function, too.
@dezso - nitpicking is always welcome! Thanks for your comments!
2
SELECT *
FROM  (
   SELECT *
        , CASE WHEN name LIKE 'text' AND group_id = 10 THEN 1
               WHEN name LIKE 'text' THEN 2
               WHEN group_id = 10 THEN 3
               ELSE 4 END AS ImageRank
   FROM   image
   ) ranked
WHERE  ImageRank <> 4
ORDER  BY ImageRank
LIMIT  1;

This is a pseudo-solution approach; I'm not entirely sure the syntax would work in your scenario. The CASE expression sits in a subquery because a column alias such as ImageRank cannot be referenced in the WHERE clause of the same SELECT.


2

I don't think there is anything wrong with running these queries separately until you find the result you want. While there are ways to combine these into one query, those end up being more complicated and slower, which isn't what you wanted.

You should consider running all of the queries in one transaction, probably best at the repeatable-read isolation level, so you get consistent results and also avoid the overhead of setting up repeated transactions. If, in addition, you make judicious use of prepared statements, you will have almost the same overhead as running all three queries in one combined statement.
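
As a rough illustration of this approach, here is a minimal client-side sketch that tries the three queries in order inside one transaction and stops at the first hit. The function name `find_image` and the `FALLBACK_QUERIES` list are hypothetical, and Python's built-in sqlite3 module is used only so the example is self-contained. SQLite has no per-transaction isolation setting, so with Postgres you would instead open the transaction at the repeatable-read level through your driver. The sketch assumes the connection has no transaction open when the function is called.

```python
import sqlite3

# Queries ordered from most to least precise, each paired with the
# argument names it needs. Reusing identical SQL strings lets sqlite3's
# per-connection statement cache avoid re-parsing, loosely standing in
# for prepared statements.
FALLBACK_QUERIES = [
    ("SELECT * FROM image WHERE name = ? AND group_id = ? LIMIT 1", ("name", "group_id")),
    ("SELECT * FROM image WHERE name = ? LIMIT 1", ("name",)),
    ("SELECT * FROM image WHERE group_id = ? LIMIT 1", ("group_id",)),
]


def find_image(conn, name, group_id):
    """Return the first row found by the most precise matching query, or None."""
    args = {"name": name, "group_id": group_id}
    conn.execute("BEGIN")  # one transaction around all attempts
    try:
        for sql, keys in FALLBACK_QUERIES:
            # fetchall() runs the statement to completion; LIMIT 1 caps it at one row.
            rows = conn.execute(sql, [args[k] for k in keys]).fetchall()
            if rows:
                return rows[0]
        return None
    finally:
        conn.execute("COMMIT")  # read-only work, so committing just ends the transaction
```

Calling `find_image(conn, "text", 10)` returns a row as soon as one of the queries matches, so the later, less precise queries are never sent.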

