
I have a table with a jsonb field on Postgres 12.1:

create table market (
    user_id int primary key,  -- "user" is a reserved word in Postgres, so it can't be used unquoted as a column name
    base jsonb
);

The jsonb field has the following structure:

{
    "a": [1, 2, 3],
    "regions": [
        {
            "id": 1,
            "name": "name",
            "description": "description",
            "shops": [
                {
                    "id": 11,
                    "brands": [
                        {
                            "name": 22,
                            "id": 21
                        }
                    ]
                }
            ]
        }
    ]
}

Our clients choose a shop and then brands. These choices are stored in the market table. A user makes a choice like {shopId: 1, brands: [1, 2, 3]}. I want to find how often users chose this shop and these brands. As a result I expect region_id, region_name, shop_id, count_of_using_shop_id, brand_id, count_of_using_brand_id.

I have a legacy market table with several million rows. I haven't worked with jsonb before and I am confused by that deeply nested structure.

I thought about doing this with GROUP BY, but after several experiments I rejected this solution: the result sets are very large and the GROUP BY operator performs a costly sort.

Can you help me and point out the basic direction to solve my problem? Could it be faster to do this directly in Python rather than SQL, i.e. select all the data with plain SQL and then filter shops and brands in Python?

1 Answer


That's a terrible, terrible data model.

To query that efficiently, you need a GIN index on the column:

CREATE INDEX ON market USING gin (base);
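If you only ever query with the containment operator @>, a jsonb_path_ops index is worth considering: it is typically smaller and faster for @> than the default operator class, at the cost of supporting fewer operators.

```sql
-- Alternative: smaller index that supports only the @> operator
CREATE INDEX ON market USING gin (base jsonb_path_ops);
```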

Then, to find all rows that match {shopId: 1, brands: [1, 2, 3]}, you'd have to use the containment operator @> and split the OR that is implied by the array into UNION ALL queries:

SELECT * FROM market
WHERE base @> '{ "regions": [ { "shops": [ { "id": 1, "brands": [ { "id": 1 } ] } ] } ] }'
UNION ALL
SELECT * FROM market
WHERE base @> '{ "regions": [ { "shops": [ { "id": 1, "brands": [ { "id": 2 } ] } ] } ] }'
UNION ALL
SELECT * FROM market
WHERE base @> '{ "regions": [ { "shops": [ { "id": 1, "brands": [ { "id": 3 } ] } ] } ] }';

3 Comments

Thank you for your fast feedback. Can you suggest how I could reorganize my jsonb field structure to boost performance?
You wouldn't use JSON at all. You'd have several tables like region, shop and brand, and each array element would become one table row. The relationships between the tables are foreign key constraints.
Thank you very much. Your answer was useful for me
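The normalized model suggested in the comments could look like the following sketch (table and column names are illustrative, not taken from the question):

```sql
-- Each JSON array level becomes its own table, linked by foreign keys
CREATE TABLE region (
    id          int PRIMARY KEY,
    name        text,
    description text
);

CREATE TABLE shop (
    id        int PRIMARY KEY,
    region_id int REFERENCES region
);

CREATE TABLE brand (
    id      int PRIMARY KEY,
    name    text,
    shop_id int REFERENCES shop
);

-- A user's choice becomes one row per chosen brand
CREATE TABLE choice (
    user_id  int,
    brand_id int REFERENCES brand
);
```

With this layout the counts from the question become ordinary joins with GROUP BY over plain integer columns, which regular B-tree indexes can support.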
