I am trying to GROUP BY the individual values of an array column. Here is the table definition:
CREATE TABLE "public"."modifier_arrays" (
    "id" INTEGER DEFAULT nextval('modifier_arrays_id_seq'::regclass) NOT NULL UNIQUE,
    "product_id" INTEGER NOT NULL,
    "modifier_ids" INTEGER[] NOT NULL,
    PRIMARY KEY ( "id" )
);
CREATE INDEX "modifier_ids_btree" ON "public"."modifier_arrays" USING btree( "modifier_ids" ASC NULLS LAST );
CREATE INDEX "modifier_ids_gin" ON "public"."modifier_arrays" USING gin( "modifier_ids" );
I filled it up with 500K rows and here is the query that I am running:
SELECT UNNEST(modifier_ids) AS modifier_id FROM modifier_arrays WHERE '{}' <@ modifier_ids GROUP BY UNNEST(modifier_ids);
and here is the EXPLAIN ANALYZE output:
HashAggregate (cost=51563.39..52068.64 rows=10000 width=43) (actual time=8705.943..8705.962 rows=101 loops=1)
  -> Bitmap Heap Scan on modifier_arrays (cost=34387.54..51061.89 rows=200600 width=43) (actual time=1683.227..5771.153 rows=10998944 loops=1)
        Recheck Cond: ('{}'::integer[] <@ modifier_ids)
        -> Bitmap Index Scan on modifier_ids_gin (cost=0.00..34387.04 rows=2006 width=0) (actual time=1676.215..1676.215 rows=2000000 loops=1)
              Index Cond: ('{}'::integer[] <@ modifier_ids)
Total runtime: 8706.327 ms
Here is what I have tried:
SET work_mem = '550MB';
SET cpu_tuple_cost = 0.1;
SET enable_seqscan = OFF;
Oh and this is my Postgres version:
PostgreSQL 9.1.14
I am still not able to get it down to acceptable performance. How can I optimize this query? I am out of ideas/Google keywords :(
WHERE '{}' <@ modifier_ids? What is that supposed to do? Doesn't '{}' evaluate to the numeric value of the brace characters in this comparison? And doesn't '<@' mean "less than the absolute value"? Are there implicit or explicit connotations of these strings and operators that I am missing? Also, I see that you are UNNESTing, but where is the array creation happening?
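For reference, here is a minimal sketch of what those pieces do in PostgreSQL. The literal arrays and the column alias empty_is_contained are made up for illustration; only modifier_ids and unnest come from the question. In PostgreSQL, '{}' is an empty array literal (here cast to integer[]), <@ is the array "is contained by" operator, and the arrays being unnested are the ones already stored in the modifier_ids INTEGER[] column, so no array is created in the query itself.

```sql
-- '{}' cast to integer[] is an empty integer array, not a number.
-- <@ tests whether the left array is contained by the right array.
-- An empty array is contained by every array, so this yields true.
SELECT '{}'::integer[] <@ ARRAY[1, 2, 3] AS empty_is_contained;

-- unnest() expands an array into one row per element.
-- The array literal here is illustrative; in the question the input
-- is the stored modifier_ids column.
SELECT unnest(ARRAY[10, 20, 30]) AS modifier_id;
```

If that reading is right, the condition '{}' <@ modifier_ids would be expected to match every row of modifier_arrays, since modifier_ids is declared NOT NULL, so the WHERE clause as written does not narrow down the rows being unnested.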