I have a table with ~1.4 million rows. There are about 5 columns with general info on each row and a 6th column holding a JSON document with ~1700 key-value pairs.

I am building summaries grouped by a column called ownership, selecting only rows where a specific key exists in the JSON column. The query below runs in 14.5s:

SELECT ownership,
       SUM(TO_NUMBER(jsonfield->>'firstvalue', '9G999g999')) AS total
FROM mytable
WHERE jsonfield->>'firstvalue' IS NOT NULL
GROUP BY ownership;

My queries will be much larger, and I know I'll need to select on many keys from the jsonfield. For example, if I add a second key, the query time increases to 22.9s:

SELECT ownership,
       SUM(TO_NUMBER(jsonfield->>'firstvalue', '9G999g999')) AS total,
       SUM(TO_NUMBER(jsonfield->>'secondvalue', '9G999g999')) AS totaltwo
FROM mytable
WHERE jsonfield->>'firstvalue' IS NOT NULL
   OR jsonfield->>'secondvalue' IS NOT NULL
GROUP BY ownership;

There may be cases where I'll need to query on several hundred potential keys in the jsonfield. Any suggestions on how to optimize these queries?

Great answer below. As an FYI, I had to convert my json column to jsonb before I could create the index. I first created a copy of the json column, called jsonbsummary, and then converted that copy to jsonb:

ALTER TABLE mytable
  ALTER COLUMN jsonbsummary
  SET DATA TYPE jsonb
  USING jsonbsummary::jsonb;
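
For readers following along, the whole sequence can be sketched roughly as below. This is a variant of the steps described above, not the exact commands that were run: it assumes the original column is named jsonfield and has type json, it creates the copy directly as jsonb instead of converting it afterwards, and the index name idx_jsonbsummary is made up for illustration.

```sql
-- Assumption: mytable.jsonfield is the original column of type json.
-- Add the new column as jsonb from the start, so no later type change is needed.
ALTER TABLE mytable ADD COLUMN jsonbsummary jsonb;

-- Copy the data across, casting json to jsonb.
UPDATE mytable SET jsonbsummary = jsonfield::jsonb;

-- Hypothetical index name; the default GIN operator class for jsonb
-- supports the key-existence operators ?, ?| and ?&.
CREATE INDEX idx_jsonbsummary ON mytable USING GIN (jsonbsummary);
```

On a table of this size the UPDATE rewrites every row, so it may be worth running it in a quiet period and following it with VACUUM ANALYZE.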

As an additional FYI: the grouped queries that originally took 22+ seconds now run in 200ms with the GIN index! The rewritten query:

SELECT ownership,
       SUM(TO_NUMBER(jsonbsummary->>'firstvalue', '9G999g999')) AS total,
       SUM(TO_NUMBER(jsonbsummary->>'secondvalue', '9G999g999')) AS totaltwo
FROM mytable
WHERE jsonbsummary ?| array['firstvalue', 'secondvalue']
GROUP BY ownership;

1 Answer

You need a GIN index on the JSONB column.

CREATE INDEX idx_json ON mytable USING GIN (jsoncolumn);

To check for the existence of keys, use the ?| operator, which can make use of that index:

select ...
from mytable
where jsoncolumn ?| array['firstvalue', 'secondvalue'];

That is equivalent to your OR condition. If you want to find rows that contain all of those keys, use the ?& operator instead.
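
To make the difference between the two operators concrete, here is a small self-contained sketch. The table demo_owners, the index name idx_demo_json, and the sample rows are invented for illustration; the operators and the GIN index syntax are standard PostgreSQL.

```sql
-- Hypothetical demo table; names and data are illustrative only.
CREATE TEMP TABLE demo_owners (
    ownership  text,
    jsoncolumn jsonb
);

INSERT INTO demo_owners (ownership, jsoncolumn) VALUES
    ('public',  '{"firstvalue": "1,000", "secondvalue": "2,500"}'),
    ('public',  '{"firstvalue": "3,000"}'),
    ('private', '{"secondvalue": "4,000"}'),
    ('private', '{"othervalue": "9"}');

-- Default operator class (jsonb_ops), which supports ?, ?| and ?&.
CREATE INDEX idx_demo_json ON demo_owners USING GIN (jsoncolumn);

-- Rows that contain ANY of the listed keys (the OR case).
SELECT ownership, count(*) AS matching_rows
FROM demo_owners
WHERE jsoncolumn ?| array['firstvalue', 'secondvalue']
GROUP BY ownership;

-- Rows that contain ALL of the listed keys (the AND case).
SELECT ownership, count(*) AS matching_rows
FROM demo_owners
WHERE jsoncolumn ?& array['firstvalue', 'secondvalue']
GROUP BY ownership;

-- Inspect the plan. On a four-row table the planner will likely prefer a
-- sequential scan; run the same EXPLAIN against the real table to see
-- whether the GIN index is chosen there.
EXPLAIN
SELECT ownership
FROM demo_owners
WHERE jsoncolumn ?| array['firstvalue', 'secondvalue'];
```

Two caveats worth keeping in mind: a GIN index created with the jsonb_path_ops operator class does not support the key-existence operators, so the default class is the one to use here. Also, the existence operators treat a key whose value is JSON null as present, whereas ->> ... IS NOT NULL does not, so the two forms of the filter can differ on such rows.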

1 Comment

This is great! I had to convert my json to jsonb before this would run. I added that detail to my question.