I used to have a query like this in Rails:

MyModel.where(id: ids) 

Which generates a SQL query like:

SELECT "my_models".* FROM "my_models" WHERE "my_models"."id" IN (1, 28, 7, 8, 12) 

Now I want to change this to use ANY instead of IN. I created this:

MyModel.where("id = ANY(VALUES(#{ids.join '),('}))")

Now when I use an empty array, ids = [], I get the following error:

MyModel Load (53.0ms) SELECT "my_models".* FROM "my_models" WHERE (id = ANY(VALUES()))
ActiveRecord::JDBCError: org.postgresql.util.PSQLException: ERROR: syntax error at or near ")"
ActiveRecord::StatementInvalid: ActiveRecord::JDBCError: org.postgresql.util.PSQLException: ERROR: syntax error at or near ")"
  Position: 75: SELECT "social_messages".* FROM "social_messages" WHERE (id = ANY(VALUES()))
  from arjdbc/jdbc/RubyJdbcConnection.java:838:in `execute_query'
  • If you're going to write custom queries, please be very careful to use placeholders: VALUES(?) expanded as necessary with an array to bind is way better than what you have here. You need to be careful to properly escape any raw values being injected into your SQL. What's the purpose of this query? Commented Jul 2, 2015 at 18:24
  • in my real query i have so many IDs at the IN part so i want to optimise it using ANY in this case. Commented Jul 2, 2015 at 18:31
  • In PostgreSQL, IN is an alias for = ANY Commented Jul 2, 2015 at 18:34
  • Where are all these ids coming from? Maybe you should be JOINing or using a subquery instead of sending a big list of ids to the database. Commented Jul 2, 2015 at 18:53
  • @PinnyM: IN is not an alias for = ANY. I provided details. Commented Jul 2, 2015 at 19:21

1 Answer

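There are two variants of IN expressions:

  • expression IN (subquery)
  • expression IN (value [, ...])

Similarly, two variants with the ANY construct:

  • expression operator ANY (subquery)
  • expression operator ANY (array expression)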

A subquery works for either technique, but for the second form of each, IN expects a list of values (as defined in standard SQL) while = ANY expects an array.

Which to use?

ANY is a later, more versatile addition; it can be combined with any binary operator returning a boolean value. IN boils down to a special case of ANY. In fact, its second form is rewritten internally:

IN is rewritten with = ANY
NOT IN is rewritten with <> ALL

Check the EXPLAIN output for any query to see for yourself. This proves two things:

  • IN can never be faster than = ANY.
  • = ANY is not going to be substantially faster.

The choice should be decided by what's easier to provide: a list of values or an array (possibly as array literal - a single value).

If the IDs you are going to pass come from within the DB anyway, it is much more efficient to select them directly (subquery) or integrate the source table into the query with a JOIN (like @mu commented).

To pass a long list of values from your client and get the best performance, use an array, unnest() and join, or provide it as a table expression using VALUES (like @PinnyM commented). But note that a JOIN preserves possible duplicates in the provided array / set while IN or = ANY do not.
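
As a rough sketch of the VALUES variant with bound placeholders (as suggested in the comments on the question), something like the following could work. The helper name any_values_condition is hypothetical and not part of Rails; the column name is assumed to be a trusted identifier, and "1 = 0" is just one way to sidestep the empty VALUES() syntax error.

```ruby
# Hypothetical helper (not part of Rails): builds a condition array with one
# "(?)" placeholder per id, so the values are bound instead of interpolated.
# `column` is assumed to be a trusted identifier, never user input.
def any_values_condition(column, ids)
  # An empty VALUES list is a syntax error in Postgres, so fall back to a
  # condition that matches no rows.
  return ["1 = 0"] if ids.empty?

  placeholders = Array.new(ids.size, "(?)").join(", ")
  ["#{column} = ANY (VALUES #{placeholders})", *ids]
end

# Possible usage with the model from the question:
#   MyModel.where(any_values_condition("id", ids))
```

The returned array has the SQL fragment first and the bind values after it, which is the array form of conditions that where accepts.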

In the presence of NULL values, NOT IN is often the wrong choice and NOT EXISTS would be right (and faster, too).
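
To see why NULL values trip up NOT IN, here is a small in-memory model of SQL's three-valued logic, with Ruby's nil standing in for NULL. The function sql_not_in is purely illustrative and is not a database call; it only mirrors how the comparison evaluates.

```ruby
# Illustrative model of `value NOT IN (list)` under SQL three-valued logic.
# nil stands in for NULL; the result is true, false, or nil (unknown).
def sql_not_in(value, list)
  return true if list.empty?            # e.g. a subquery that returned no rows
  return nil if value.nil?              # NULL compared to anything is unknown
  return false if list.include?(value)  # a definite match decides the result
  list.include?(nil) ? nil : true       # a NULL in the list leaves it unknown
end
```

Since a WHERE clause only keeps rows for which the condition is true, a single NULL in the list means no row can pass the NOT IN test: every row evaluates to either false or unknown.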

Syntax for = ANY

For the array expression Postgres accepts:

  • an array constructor (array is constructed from a list of values on the Postgres side) of the form: ARRAY[1,2,3]
  • or an array literal of the form '{1,2,3}'.

To avoid invalid type casts, you can cast explicitly:

ARRAY[1,2,3]::numeric[]
'{1,2,3}'::bigint[]

Or you could create a Postgres function taking a VARIADIC parameter, which takes individual arguments and forms an array from them.

How to pass the array from Ruby?

Assuming id to be integer:

MyModel.where('id = ANY(ARRAY[?]::int[])', ids)
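
An alternative that may also cope with an empty array is to bind a single array literal instead of a list of values. This is only a sketch: int_array_literal is a hypothetical helper, not a Rails or Postgres API, and it assumes the ids are integers or integer-like strings.

```ruby
# Hypothetical helper: renders ids as the body of a Postgres array literal,
# e.g. [1, 28, 7] becomes "{1,28,7}" and [] becomes "{}".
def int_array_literal(ids)
  # Integer() raises on non-numeric strings and on nil, which keeps
  # arbitrary text out of the literal.
  "{#{ids.map { |id| Integer(id) }.join(',')}}"
end

# Possible usage with the model from the question; the string is bound to
# the placeholder and cast to int[] on the Postgres side:
#   MyModel.where('id = ANY(?::int[])', int_array_literal(ids))
```

With an empty array the bound value is '{}', which casts to an empty int[] and should match no rows rather than raise a syntax error.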

But I am just dabbling in Ruby. @mu provides detailed instructions in this related answer:

10 Comments

thanks man, this is WAY better :) aw how would I implement this in rails and activemodel? Thanks
Your solution working perfect except when I use empty array I get: ActiveRecord::StatementInvalid (ActiveRecord::JDBCError: org.postgresql.util.PSQLException: ERROR: operator does not exist: integer = text Hint: No operator matches the given name and argument type(s). You might need to add explicit type casts.
@EkiEqbal as Erwin stated above, add ::numeric[] after the Array to avoid this problem.
@ErwinBrandstetter considering that OP was trying to improve performance by using value lists instead of Arrays (on the assumption that the linked article in above comments still holds correct), the answer should use the format VALUES (...), (...), ... instead of ARRAY[...]. Or did I miss something? And if the article is no longer correct (or never was), then there doesn't really seem to be a point to this exercise...
For those that come by this answer like I did and find the discussion very useful - the datadog article linked by @PinnyM has changed to the following: datadoghq.com/blog/…