I have a simple Django site, using a PostgreSQL 9.3 database, with a single table storing user accounts (e.g. name, email, address, phone, active, etc). However, my user model is fairly large, and has around 2.6 million records. I noticed Django's admin was a little slow, so using django-debug-toolbar, I noticed that almost all queries ran in under 1 ms, except for:
SELECT COUNT(*) FROM "myapp_myuser" WHERE "myapp_myuser"."active" = true;
which took about 7000 ms. However, the active column is indexed using Django's standard db_index=True, which generates the index:
CREATE INDEX myapp_myuser_active
ON myapp_myuser
USING btree
(active);
Checking out the query with EXPLAIN via:
EXPLAIN ANALYZE VERBOSE
SELECT COUNT(*) FROM "myapp_myuser" WHERE "myapp_myuser"."active" = true;
returns:
Aggregate (cost=109305.45..109305.46 rows=1 width=0) (actual time=7342.973..7342.974 rows=1 loops=1)
Output: count(*)
-> Seq Scan on public.myapp_myuser (cost=0.00..102638.16 rows=2666916 width=0) (actual time=0.035..4765.059 rows=2666337 loops=1)
Output: id, created, category_id, name, email, address_1, address_2, city, active, (...)
Filter: myapp_myuser.active
Total runtime: 7343.031 ms
It appears to not be using the index at all. Am I reading this right?
Running just SELECT COUNT(*) FROM "myapp_myuser" completed in about 500 ms. Why such a disparity in run times, even though the only column being used is indexed?
How can I better optimize this query?