I have a table with a JSONB column in my database that looks like this:
data
-------------------------------
{
  "nestedObject": {
    "dynamic-key-1": 123,
    "dynamic-key-2": 456,
    "dynamic-key-3": 789,
    "and so on": 123
  },
  "rest of object": "goes here"
}
-- a few million more objects come here
I'm specifically wondering if it's possible to index on all (existing) keys of data->'nestedObject' as integers. Currently (as I understand it) . I know that if I knew the keys ahead of time, I could just do something like
CREATE INDEX IF NOT EXISTS idx_gin_my_jsonb_integer_index ON table
USING BTREE (((data->'nestedObject'->>'integerKey')::integer));
but unfortunately that's not possible, because I don't know the keys ahead of time (the attributes of the nested object are generated at runtime based on timestamps etc.). Many nestedObjects can have the same key (e.g. many objects may have data->'nestedObject'->'dynamic-key-1'), but a single nestedObject cannot contain the same key more than once.
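To make the shape of the data concrete, here is a minimal sketch that expands each nestedObject into (key, integer value) rows using jsonb_each_text. The table name sample_docs and its two rows are hypothetical, made up purely for illustration; only the nestedObject layout is taken from the data above.

```sql
-- Hypothetical sample table and rows (assumptions, not the real schema).
CREATE TEMP TABLE sample_docs (id text PRIMARY KEY, data jsonb);

INSERT INTO sample_docs (id, data) VALUES
  ('a', '{"nestedObject": {"dynamic-key-1": 123, "dynamic-key-2": 456}}'),
  ('b', '{"nestedObject": {"dynamic-key-1": 789}}');

-- jsonb_each_text returns one (key, value) row per top-level key of the
-- object it is given; the value comes back as text, so it is cast to integer.
SELECT d.id,
       kv.key,
       kv.value::integer AS int_value
FROM sample_docs AS d
CROSS JOIN LATERAL jsonb_each_text(d.data->'nestedObject') AS kv(key, value)
ORDER BY kv.key, int_value DESC;
```

Each document contributes at most one row per key, while the same key can appear for many documents, which is exactly the situation described above.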
The reason I want to do this is (hopefully obviously) to speed up the queries being run. Specifically, the problematic query is:
SELECT tableOne.data AS dataOne, tableTwo.data AS dataTwo FROM tableOne
JOIN tableTwo ON tableTwo.data->>'someField' = tableOne.id
WHERE tableOne.data->'nestedObject'->'dynamic-key-goes-here' IS NOT NULL
AND (tableOne.data->'nestedObject'->>'dynamic-key-goes-here')::integer > 0
ORDER BY (tableOne.data->'nestedObject'->>'dynamic-key-goes-here')::integer DESC
LIMIT 100;
Taking this second query as an example, I can do EXPLAIN ANALYZE on it. I see that it ends up doing a sequential scan (not a parallel seq scan) on ((((data -> 'nestedObject'::text) ->> 'dynamic-key-goes-here'::text))::integer > 0) from tableOne, which takes ~75% of the expected query time.
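For reference, this is roughly how the plan for the filter and sort on tableOne can be inspected in isolation. It is only a sketch: the join to tableTwo is left out on purpose to isolate the scan, and the BUFFERS option is an addition that was not part of the original run.

```sql
-- EXPLAIN ANALYZE actually executes the query and reports real timings;
-- BUFFERS additionally shows how many pages each plan node touched.
EXPLAIN (ANALYZE, BUFFERS)
SELECT tableOne.data
FROM tableOne
WHERE (tableOne.data->'nestedObject'->>'dynamic-key-goes-here')::integer > 0
ORDER BY (tableOne.data->'nestedObject'->>'dynamic-key-goes-here')::integer DESC
LIMIT 100;
```

Without an index matching the cast expression, the output would be expected to show a Seq Scan node on tableOne with the integer comparison listed as its Filter.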
I know that this would be trivial if it were stored "normally," i.e. as typical relational data (and this data is relational), but unfortunately 1. I inherited this code from someone else, and 2. I'm not able to do a database migration at this time, so I can't do this.
So given this, is it possible to effectively create an index on this data as integers?
The IS NOT NULL test is not necessary, because the > 0 will also be false if the result was null (but removing the check won't change anything regarding the performance - it's essentially just cosmetics)