I'm running PostgreSQL 9.4 on CentOS 6.7. One of the tables contains many millions of records; this is the DDL:
CREATE TABLE domain.examples (
id SERIAL,
sentence VARCHAR,
product_id BIGINT,
site_id INTEGER,
time_stamp BIGINT,
category_id INTEGER,
CONSTRAINT examples_pkey PRIMARY KEY(id)
)
WITH (oids = false);
CREATE INDEX examples_categories ON domain.examples
USING btree (category_id);
CREATE INDEX examples_site_idx ON domain.examples
USING btree (site_id);
The application that consumes the data does so using pagination, so we fetch batches of 1,000 records. However, even when filtering on an indexed column, the fetch time is very slow:
explain analyze
select *
from domain.examples e
where e.category_id = 105154
order by id asc
limit 1000;
Limit (cost=0.57..331453.23 rows=1000 width=280) (actual time=2248261.276..2248296.600 rows=1000 loops=1)
-> Index Scan using examples_pkey on examples e (cost=0.57..486638470.34 rows=1468199 width=280) (actual time=2248261.269..2248293.705 rows=1000 loops=1)
Filter: (category_id = 105154)
Rows Removed by Filter: 173306740
Planning time: 70.821 ms
Execution time: 2248328.457 ms
What is causing the slow query, and how can it be improved?
Thanks!
Comments:

Are the _id columns supposed to be foreign keys? They don't seem to be declared as such. How big is the stuff in sentence?

It's possible your caches were cold, or the server's disk was overloaded. Try it again: VACUUM ANALYZE domain.examples;

BTW, is e.category_id a low-cardinality column?

cascade on delete.

CREATE UNIQUE INDEX examples_categories ON domain.examples USING btree (category_id, id);
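The last comment suggests replacing the single-column category index with a composite index on (category_id, id), so the planner can both filter by category and return rows already ordered by id. A minimal sketch of that change, assuming the existing examples_categories index is dropped first so its name can be reused (the UNIQUE qualifier is safe only because id is already unique on its own):

-- Replace the single-column index with a composite (category_id, id) index.
DROP INDEX domain.examples_categories;

CREATE UNIQUE INDEX examples_categories ON domain.examples
USING btree (category_id, id);

-- With this index in place, the paginated query should become an index scan
-- on (category_id, id) instead of walking the whole primary key and filtering
-- out ~173 million rows:
explain analyze
select *
from domain.examples e
where e.category_id = 105154
order by id asc
limit 1000;

This is only a sketch based on the comment; on a table this size the index build will take a while, and CREATE INDEX CONCURRENTLY may be preferable to avoid blocking writes during the build.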