0

This there a performance hit on creating a index on a jsonb column? Reason is I have a very large data set which I want to query. the set up is that my java models (very large) are stored as blobs in a jsonb column in my database. Now I want to pull reports based on conditions on the data inside the payload data. So I want to know whether there will be a performance hit on my prod db when trying to pull these stats? Would adding an index help with this query?

CREATE VIEW Reporting_View AS
Select dms.key AS "Id",
    dms.data_model_type AS Data_Model_Type,
    dms.short_type_name AS Short_Type_Name,
    dms.version AS "Version",
    dms.payload_data -> 'requestData' ->> 'modelType' AS Model_Type,
    dms.payload_data -> 'requestData' ->> 'modelName' AS Model_Name,
    dms.payload_data -> 'requestData' ->> 'modelVersion' AS Model_Version,
    dms.payload_data -> 'requestData' -> 'arguments' ->> 'age' AS Age,
    dms.payload_data -> 'requestData' -> 'arguments' ->> 'department' AS Department,
    dms.payload_data -> 'requestData' -> 'arguments' ->> 'income' AS Income,
    dms.payload_data -> 'output' ->> 'output' AS FinalResult
From myschema.data_model_storage dms;
3
  • You'd have to share the query and give us an idea about the structure of the JSON and the table that holds it. Commented Oct 21, 2020 at 8:01
  • In general queries using e.g. the @> operator can use GIN indexes on the column. Indexes are created to support queries, so without the query, the question can't really be answered. Is there a performance hit creating such an index? Yes, INSERTs, DELETEs and UPDATEs will be slower - how much slower depends on a lot of factors. Commented Oct 21, 2020 at 8:06
  • I have updated my question with the query. I have tried to keep it as simple as possible but it reality I have many more fields in 500 plus maybe, but will only be reporting on the ones that add value Commented Oct 21, 2020 at 10:34

1 Answer 1

1

Your view doesn't have a WHERE clause, so no index can improve it directly. But being a view, you can add a WHERE clause when you access it, rather then just when you define it. But your view doesn't expose payload_data itself, so there is no way for a user of the view to get their hands on that to use an index on it. You could define expression indexes on the same expressions used in the definition of the view, for example ((payload_data -> 'requestData' ->> 'modelType')), and those would be usable from the view by testing the column alias assigned by the view.

And yes, maintaining indexes does have a cost.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.