optimizing UNNEST intarray with GROUP BY for postgres

Question

I am trying to GROUP BY the array values of a column. Here is the table definition:

CREATE TABLE "public"."modifier_arrays" ( 
   "id" INTEGER DEFAULT nextval('modifier_arrays_id_seq'::regclass) NOT NULL UNIQUE, 
   "product_id" INTEGER NOT NULL, 
   "modifier_ids" INTEGER[] NOT NULL,
   PRIMARY KEY ( "id" )
);
CREATE INDEX "modifier_ids_btree" ON "public"."modifier_arrays" USING btree( "modifier_ids" ASC NULLS LAST );
CREATE INDEX "modifier_ids_gin" ON "public"."modifier_arrays" USING gin( "modifier_ids" );

I filled it up with 500K rows and here is the query that I am running:

SELECT UNNEST(modifier_ids) AS modifier_id FROM modifier_arrays WHERE '{}' <@ modifier_ids GROUP BY UNNEST(modifier_ids);

and here is the EXPLAIN ANALYZE output:

HashAggregate (cost=51563.39..52068.64 rows=10000 width=43) (actual time=8705.943..8705.962 rows=101 loops=1)
  -> Bitmap Heap Scan on modifier_arrays (cost=34387.54..51061.89 rows=200600 width=43) (actual time=1683.227..5771.153 rows=10998944 loops=1)
        Recheck Cond: ('{}'::integer[] <@ modifier_ids)
        -> Bitmap Index Scan on modifier_ids_gin (cost=0.00..34387.04 rows=2006 width=0) (actual time=1676.215..1676.215 rows=2000000 loops=1)
              Index Cond: ('{}'::integer[] <@ modifier_ids)
Total runtime: 8706.327 ms

Here is what I have tried:

SET work_mem = '550MB';
SET cpu_tuple_cost = 0.1;
SET enable_seqscan = OFF;

Oh and this is my Postgres version:

PostgreSQL 9.1.14

I am still not able to get it down to an acceptable performance. How can I optimize this query? I am out of ideas and Google keywords :(

Could you explain what you're trying to achieve with these actions?
– vyegorov, Commented Oct 19, 2014 at 8:30
I don't get the query. Why are you using WHERE '{}' <@ modifier_ids? What's it supposed to do? Doesn't '{}' evaluate to the numeric value of the brace characters in this comparison? And does not '<@' mean less than the absolute value? Are there implicit or explicit connotations of these strings and operators that I am missing? And I see that you are UNNESTing, but where is the array creation happening?
– bf2020, Commented Oct 19, 2014 at 13:50
@vyegorov: I am trying to GROUP BY the elements of the array. For example, given the rows 1 | {3, 4} and 2 | {4, 5}, I want a result like (4, 2), (3, 1), (5, 1): each element with its count.
– Wahyu, Commented Oct 19, 2014 at 21:41
@bf2020: As the table definition shows, modifier_ids is an integer array (INTEGER[]); that is the array being unnested. WHERE '{}' <@ modifier_ids is just an empty filter that matches every array. To match specific arrays I would change it to '{1,3}' <@ modifier_ids.
– Wahyu, Commented Oct 19, 2014 at 21:44
@bf2020: The PostgreSQL array operator "<@" means "is contained by".
– Mike Sherrill 'Cat Recall', Commented Oct 20, 2014 at 12:02
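
The result described in the comments above, each array element paired with the number of times it occurs, could be sketched as follows. This is only an illustration against the modifier_arrays table from the question; the subquery alias s and the column name occurrences are made up here. Unnesting in a subquery avoids repeating UNNEST in the GROUP BY, and it works on 9.1, which predates LATERAL.

```sql
-- Sketch: count how often each modifier id occurs across all arrays.
-- "s" and "occurrences" are illustrative names, not from the original post.
SELECT s.modifier_id,
       count(*) AS occurrences
FROM (
    SELECT unnest(modifier_ids) AS modifier_id
    FROM modifier_arrays
) AS s
GROUP BY s.modifier_id
ORDER BY occurrences DESC;
```

For the two example rows {3, 4} and {4, 5}, this would return 4 with a count of 2, and 3 and 5 with a count of 1 each.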

Wahyu · Accepted Answer · 2014-10-20 21:43:34Z

1

I found the issue. After a lot of inserts and updates to the table, the query was really slow; what I needed to do was run VACUUM ANALYZE on the table. There is an autovacuum setting somewhere that I missed ...
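
A minimal sketch of that fix, plus two checks that may help confirm whether autovacuum has been keeping up. The table name comes from the question; which autovacuum setting was actually missed is not stated in the answer, so the checks below are only a starting point.

```sql
-- Reclaim dead rows and refresh planner statistics for the table.
VACUUM ANALYZE modifier_arrays;

-- Check whether autovacuum is enabled at all.
SHOW autovacuum;

-- See when the table was last vacuumed/analyzed (manually or by autovacuum)
-- and roughly how many dead rows it currently holds.
SELECT relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'modifier_arrays';
```

If last_autovacuum and last_autoanalyze are empty after heavy write activity, the autovacuum configuration is worth reviewing.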


Clodoaldo Neto · Accepted Answer · 2014-10-20 18:11:02Z

0

Since you are not doing anything with the aggregate, you can just use SELECT DISTINCT:

select distinct unnest(modifier_ids) as modifier_id
from modifier_arrays
where '{}' <@ modifier_ids;
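
As noted in the comments, '{}' <@ modifier_ids matches every array, so when no specific ids are being filtered the WHERE clause could be dropped altogether. The sketch below shows both variants. Whether the unfiltered form is actually faster is an assumption to verify with EXPLAIN ANALYZE, and it also depends on settings such as the enable_seqscan = OFF tried in the question.

```sql
-- '{}' <@ modifier_ids is true for every row (modifier_ids is NOT NULL),
-- so the unfiltered query can omit the WHERE clause entirely.
SELECT DISTINCT unnest(modifier_ids) AS modifier_id
FROM modifier_arrays;

-- Filtered variant, using the '{1,3}' example from the comments:
-- only rows whose array contains both 1 and 3 are unnested.
SELECT DISTINCT unnest(modifier_ids) AS modifier_id
FROM modifier_arrays
WHERE '{1,3}' <@ modifier_ids;
```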

