Postgres JSONB - index on all dynamic (integer) attributes of subobject

Question

I have a JSONB table in my database that looks like this:

             data
-------------------------------
{
  "nestedObject": {
    "dynamic-key-1": 123,
    "dynamic-key-2": 456,
    "dynamic-key-3": 789,
    "and so on": 123
  },
  "rest of object": "goes here"
}
-- a few million more objects come here

I'm specifically wondering if it's possible to index all (existing) keys of data->'nestedObject' as integers. I know that if I knew the keys ahead of time, I could just do something like

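-- "table" is a reserved word, so the placeholder is replaced with tableOne from the query below
CREATE INDEX IF NOT EXISTS idx_btree_my_jsonb_integer_index ON tableOne
    USING BTREE (((data->'nestedObject'->>'integerKey')::integer));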

but unfortunately that's not possible because I don't know the keys ahead of time (attributes of the nested object are generated at runtime based on timestamp etc.). Many nestedObjects may share the same key (e.g., many objects may have data->'nestedObject'->'dynamic-key-1'), but a single nestedObject cannot contain the same key more than once.

The reason I want to do this is (hopefully obviously) to speed up the queries being run. Specifically, the problematic query is:

SELECT tableOne.data AS dataOne, tableTwo.data AS dataTwo FROM tableOne
    JOIN tableTwo ON tableTwo.data->>'someField' = tableOne.id
    WHERE tableOne.data->'nestedObject'->'dynamic-key-goes-here' IS NOT NULL
        AND (tableOne.data->'nestedObject'->>'dynamic-key-goes-here')::integer > 0
    ORDER BY (tableOne.data->'nestedObject'->>'dynamic-key-goes-here')::integer DESC 
LIMIT 100;

Taking this second query as an example, I can do EXPLAIN ANALYZE on it. I see that it ends up doing a sequential scan (not a parallel seq scan) on ((((data -> 'nestedObject'::text) ->> 'dynamic-key-goes-here'::text))::integer > 0) from tableOne, which takes ~75% of the expected query time.

I know that this would be trivial if the data were stored "normally," i.e., as typical relational data (and this data is relational), but unfortunately 1. I inherited this code from someone else, and 2. I'm not able to do a database migration at this time, so I can't do this.

So given this, is it possible to effectively create an index on this data as integers?

Unrelated, but: the IS NOT NULL test is not necessary, because the > 0 comparison will not be true if the result is null (removing the check won't change anything regarding performance; it's essentially just cosmetics)
– user330315, Commented Jul 29, 2018 at 7:18
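
To illustrate the point in the comment above, here is a minimal sketch using a literal value instead of the real table (the JSON literal and the key name 'missing' are made up for illustration). When the key is absent, the whole comparison evaluates to NULL, and a WHERE clause only keeps rows for which the condition is true:

```sql
-- 'nestedObject' does not exist in this literal, so -> returns NULL,
-- ->> on NULL returns NULL, the cast stays NULL, and NULL > 0 is NULL.
SELECT ('{"a": 1}'::jsonb -> 'nestedObject' ->> 'missing')::integer > 0 AS comparison_result;
```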

Papipo · Accepted Answer · 2021-08-18 09:48:36Z

2

If the key you are looking for is only present in a (relatively) small number of values, then it might be possible to narrow the search down to those using the ? ("exists") operator. That operator can use a GIN index on a JSONB value.

e.g.:

-- an index on an expression needs its own pair of parentheses
create index on the_table using gin ((data -> 'nestedObject'));

And use a condition like:

where data->'nestedObject' ? 'dynamic-key-1' -- this could use the index if feasible
  and (data->'nestedObject'->> 'dynamic-key-1')::integer > 100

However this won't really help if that key is present in the majority of "nestedObjects".
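
Applied to the query from the question, this might look roughly like the sketch below. It assumes the GIN expression index has been created on tableOne (rather than on the_table), and it keeps the asker's placeholder key 'dynamic-key-goes-here'. Wrapping the query in EXPLAIN is only there to check whether the planner actually picks the index; whether it does depends on how selective the key is.

```sql
-- Assumed to exist already:
--   CREATE INDEX ON tableOne USING gin ((data -> 'nestedObject'));
EXPLAIN (ANALYZE, BUFFERS)
SELECT tableOne.data AS dataOne, tableTwo.data AS dataTwo
FROM tableOne
JOIN tableTwo ON tableTwo.data->>'someField' = tableOne.id
-- the ? test replaces the IS NOT NULL check and can use the GIN index
WHERE tableOne.data->'nestedObject' ? 'dynamic-key-goes-here'
  AND (tableOne.data->'nestedObject'->>'dynamic-key-goes-here')::integer > 0
ORDER BY (tableOne.data->'nestedObject'->>'dynamic-key-goes-here')::integer DESC
LIMIT 100;
```

The range comparison and the sort are still evaluated per row; the index only reduces the number of rows that have to be looked at.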

If you were looking for one specific value (e.g. dynamic-key = 123) this could be supported using a GIN index and the @> operator, e.g. where data @> '{"nestedObject" : {"dynamic-key-1": 123}}' but as you are comparing the value using > this is really hard to index.
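
For the equality case, a sketch of two possible setups follows. Both are assumptions layered on the example above, not something taken from the question: the first reuses the expression index on data -> 'nestedObject', so the containment test has to be written against that same expression; the second indexes the whole column with the jsonb_path_ops operator class, which supports @> but not the ? operator.

```sql
-- Option 1: reuse the expression index; the left-hand side must match the indexed expression
SELECT data
FROM the_table
WHERE data -> 'nestedObject' @> '{"dynamic-key-1": 123}';

-- Option 2: index the whole column and test containment from the top level
CREATE INDEX ON the_table USING gin (data jsonb_path_ops);

SELECT data
FROM the_table
WHERE data @> '{"nestedObject": {"dynamic-key-1": 123}}';
```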

answered Jul 29, 2018 at 7:14 by user330315
edited Aug 18, 2021 at 9:48 by Papipo


user7876637 Over a year ago

"If the key you are looking for is only present in a (relatively) small number of values" Yes! The keys being looked for will only be present in <=1% of all nestedObjects, and using a GIN index + the ? operator cut query time down significantly (~3500ms to <900ms, according to EXPLAIN ANALYZE). Thank you!
