PostgreSQL create index on JSONB[]

Question

Consider a table defined as follows:

CREATE TABLE test (
    id int4 NOT NULL,
    tag_counts _jsonb NOT NULL DEFAULT ARRAY[]::jsonb[]
);

INSERT INTO test(id, tag_counts) values(1,array['{"type":1, "count":4}','{"type":2, "count":10}' ]::jsonb[])

How can I create an index on json key type and how can I query on it?

Edit: Previously, there were no indexes on json keys and select queries used an unnest operation as shown below:

select * from (SELECT unnest(tag_counts) as tc
FROM public.test) as t
where tc->'type' = '2';

The problem is, if the table has a large number of rows, the above query will not only include a full table scan, but also filtering through each jsonb array.

jsonb[] almost never makes sense. Why not store a proper JSON array in a jsonb column? e.g. '[{"type":1, "count":4}','{"type":2, "count":10}]' — user330315
– user330315, Commented Feb 10, 2021 at 17:17
Legacy issues. Can't really change it without major refactoring. — MojoJojo
– MojoJojo, Commented Feb 10, 2021 at 17:50
So how was it indexed in the legacy system? And what query are you trying to speed up? — jjanes
– jjanes, Commented Feb 10, 2021 at 18:22

user330315 · Accepted Answer · 2021-02-10 21:36:10Z

There is a way to index this, not sure how fast it will be.

If that was a "regular" jsonb column, you could use a condition like where tag_counts @> '[{"type": 2}]' which can use a GIN index on the column.

You can use that operator if you convert the array to "plain" json value:

select *
from test
where to_jsonb(tag_counts) @> '[{"type": 2}]'

Unfortunately, to_jsonb() is not marked as immutable (I guess because of potential timestamp conversion in there) which is a requirement if you want to use an expression in an index.

But for your data, this is indeed immutable, so we can create a little wrapper function:

create function as_jsonb(p_input jsonb[])
returns  jsonb
as
$$
  select to_jsonb(p_input);
$$
language sql
immutable;

And with that function we can create an index:

create index on test using gin ( as_jsonb(tag_counts) jsonb_path_ops);

You will need to use that function in your query:

select *
from test
where as_jsonb(tag_counts) @> '[{"type": 2}]'

On a table with a million rows, I get the following execution plan:

Bitmap Heap Scan on stuff.test  (cost=1102.62..67028.01 rows=118531 width=252) (actual time=15.145..684.062 rows=147293 loops=1)
  Output: id, tag_counts
  Recheck Cond: (as_jsonb(test.tag_counts) @> '[{"type": 2}]'::jsonb)
  Heap Blocks: exact=25455
  Buffers: shared hit=25486
  ->  Bitmap Index Scan on ix_test  (cost=0.00..1072.99 rows=118531 width=0) (actual time=12.347..12.356 rows=147293 loops=1)
        Index Cond: (as_jsonb(test.tag_counts) @> '[{"type": 2}]'::jsonb)
        Buffers: shared hit=31
Planning:
  Buffers: shared hit=23
Planning Time: 0.444 ms
Execution Time: 690.160 ms

Sheer genius. It improved the query time for me significantly. Thanks for the help.

Collectives™ on Stack Overflow

PostgreSQL create index on JSONB[]

1 Answer 1

1 Comment

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Your Answer

Sign up or log in

Post as a guest

Linked

Related