5

I try to set up a GIN index but I do not think my index is used when I run the request, whether I use an operator or a function.

Environment

In our table we have a JSONB field (json_aip) containing a Json that looks like that:

{ "properties": { "pdi": { "contextInformation": { "tags": ["SOME_TAG"] }, }, } 

Table creation :

create table t_aip ( json_aip jsonb, [...] ); CREATE INDEX idx_aip_tags ON t_aip USING gin ((json_aip -> 'properties' -> 'pdi' -> 'contextInformation' -> 'tags')); 

Operator query

We can't use the operator ?| as we use JDBC. But rumors indicate I should see my index when I run that type of query.

EXPLAIN ANALYZE SELECT count(*) FROM storage.t_aip WHERE json_aip#>'{properties,pdi,contextInformation,tags}' ?| array['SOME_TAG'] 

Result:

 Aggregate (cost=27052.16..27052.17 rows=1 width=8) (actual time=488.085..488.087 rows=1 loops=1) -> Seq Scan on t_aip (cost=0.00..27052.06 rows=42 width=0) (actual time=0.134..456.978 rows=16502 loops=1) Filter: ((json_aip #> '{properties,pdi,contextInformation,tags}'::text[]) ?| '{SOME_TAG}'::text[]) Rows Removed by Filter: 17511 Planning time: 23.202 ms Execution time: 488.449 ms 

Functional query

EXPLAIN ANALYZE SELECT count(*) FROM storage.t_aip WHERE jsonb_exists_any( json_aip#>'{properties,pdi,contextInformation,tags}', array['SOME_TAG'] ) 

Result:

QUERY PLAN Aggregate (cost=27087.00..27087.01 rows=1 width=8) (actual time=369.931..369.933 rows=1 loops=1) -> Seq Scan on t_aip (cost=0.00..27052.06 rows=13979 width=0) (actual time=0.173..350.437 rows=16502 loops=1) Filter: jsonb_exists_any((json_aip #> '{properties,pdi,contextInformation,tags}'::text[]), '{SOME_TAG}'::text[]) Rows Removed by Filter: 17511 Planning time: 56.021 ms Execution time: 370.252 ms 

There is nothing about the index at all. Any help would be much appreciated !

I think my index is wrong because it considers that at the end of the path json_aip -> 'properties' -> 'pdi' -> 'contextInformation' -> 'tags' it index a String whether that's an array. That's my opinion.

0

2 Answers 2

5
+50

There is a general rule that you have to use exactly the same expression both in an index and a query to use the index. With this index:

CREATE INDEX idx_aip_tags ON t_aip USING gin ((json_aip#>'{properties,pdi,contextInformation,tags}')); 

the query will use the index

EXPLAIN ANALYZE SELECT count(*) FROM t_aip WHERE json_aip#>'{properties,pdi,contextInformation,tags}' ?| array['SOME_TAG'] QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=149.97..149.98 rows=1 width=0) (actual time=27.783..27.783 rows=1 loops=1) -> Bitmap Heap Scan on t_aip (cost=20.31..149.87 rows=40 width=0) (actual time=1.504..25.726 rows=20000 loops=1) Recheck Cond: ((json_aip #> '{properties,pdi,contextInformation,tags}'::text[]) ?| '{SOME_TAG}'::text[]) Heap Blocks: exact=345 -> Bitmap Index Scan on idx_aip_tags (cost=0.00..20.30 rows=40 width=0) (actual time=1.455..1.455 rows=20000 loops=1) Index Cond: ((json_aip #> '{properties,pdi,contextInformation,tags}'::text[]) ?| '{SOME_TAG}'::text[]) 

Note that the GIN index supports also @> operator:

SELECT count(*) FROM t_aip WHERE json_aip#>'{properties,pdi,contextInformation,tags}' @> '["SOME_TAG"]' 

but be careful when searching for more than one tag:

SELECT count(*) FROM t_aip -- this gives objects containing both tags: -- WHERE json_aip#>'{properties,pdi,contextInformation,tags}' @> '["SOME_TAG", "ANOTHER_TAG"]' -- and this gives objects with any of two tags: WHERE json_aip#>'{properties,pdi,contextInformation,tags}' @> ANY(ARRAY['["SOME_TAG"]', '["ANOTHER_TAG"]']::jsonb[]) 
Sign up to request clarification or add additional context in comments.

1 Comment

This is so helpfull ! And this is quite strange but interesting to use the @> with the ANY keyword to get the same behavior than ?|
1

EDITED:

I thought the opposite, but in fact THERE IS a difference between operators (?|) and functions (jsonb_exists_any) about index usage, as the index is never used when the query uses (jsonb) functions.

You can get more about it here: https://dba.stackexchange.com/a/91007

That was the other question of this topic.

EDIT2:

You can create function alias that can use index and be used as functions in your code, like this :

-- Define functions that calls the postgres native operator, to overpass the JDBC issue related to question mark CREATE OR REPLACE FUNCTION rs_jsonb_exists_all(jsonb, text[]) RETURNS bool AS 'SELECT $1 ?& $2' LANGUAGE sql IMMUTABLE; CREATE OR REPLACE FUNCTION rs_jsonb_exists(jsonb, text) RETURNS bool AS 'SELECT $1 ? $2' LANGUAGE sql IMMUTABLE; CREATE OR REPLACE FUNCTION rs_jsonb_exists_any(jsonb, text[]) RETURNS bool AS 'SELECT $1 ?| $2' LANGUAGE sql IMMUTABLE; 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.