Slow PostgreSQL query not using index

Question

I have a simple Django site, using a PostgreSQL 9.3 database, with a single table storing user accounts (e.g. name, email, address, phone, active, etc). The user table is fairly large, with around 2.6 million records. Django's admin felt a little slow, so I profiled it with django-debug-toolbar and found that almost all queries ran in under 1 ms, except for:

SELECT COUNT(*) FROM "myapp_myuser" WHERE "myapp_myuser"."active" = true;

which took about 7000 ms. However, the active column is indexed using Django's standard db_index=True, which generates the index:

CREATE INDEX myapp_myuser_active
  ON myapp_myuser
  USING btree
  (active);

Checking out the query with EXPLAIN via:

EXPLAIN ANALYZE VERBOSE
SELECT COUNT(*) FROM "myapp_myuser" WHERE "myapp_myuser"."active" = true;

returns:

Aggregate  (cost=109305.45..109305.46 rows=1 width=0) (actual time=7342.973..7342.974 rows=1 loops=1)
  Output: count(*)
  ->  Seq Scan on public.myapp_myuser  (cost=0.00..102638.16 rows=2666916 width=0) (actual time=0.035..4765.059 rows=2666337 loops=1)
        Output: id, created, category_id, name, email, address_1, address_2, city, active,  (...)
        Filter: myapp_myuser.active
Total runtime: 7343.031 ms

It appears to not be using the index at all. Am I reading this right?

Running just SELECT COUNT(*) FROM "myapp_myuser" completed in about 500 ms. Why such a disparity in run times, even though the only column being used is indexed?

How can I better optimize this query?

It's not using the index. Are there any rows where "active" is false?
– Mike Sherrill 'Cat Recall', Commented Apr 28, 2014 at 1:35
How reproducible is this? I would guess the query without the where clause is faster only because you ran it immediately after the query with the where clause already pulled all the data into memory.
– jjanes, Commented Apr 28, 2014 at 17:07
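
Both comments can be checked directly. The following is a minimal sketch, assuming the table and column names from the question; it is not something the original poster ran. The first two statements look at how the values of active are distributed (the plan in the question reports 2666337 matching rows out of roughly 2.6 million, which suggests nearly every row is active, and in that case a plain index on the column saves little work). The last statement uses the BUFFERS option of EXPLAIN to show how much of the data came from cache, which bears on the second comment.

```sql
-- Exact distribution of the flag. This reads the whole table,
-- so expect it to be about as slow as the original query.
SELECT active, COUNT(*) AS n
FROM myapp_myuser
GROUP BY active;

-- The planner's own estimate of the distribution, taken from the
-- statistics gathered by ANALYZE. This is cheap to run.
SELECT most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'myapp_myuser'
  AND attname = 'active';

-- "shared hit" counts pages found in PostgreSQL's buffer cache,
-- "read" counts pages requested from the operating system.
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) FROM "myapp_myuser" WHERE "myapp_myuser"."active" = true;
```

Running the last statement twice in a row, and comparing it with the same statement without the WHERE clause, should show whether the 500 ms versus 7000 ms gap is a caching effect.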

Mike Sherrill 'Cat Recall' · Accepted Answer · 2014-04-28 02:00:40Z


Your table is wide, so reading its rows from the heap is expensive however they are located. So this might not help much, even though it does result in a bitmap index scan.

Try a partial index.

create index on myapp_myuser (active) where active = true;

I made a test table with a couple million rows.

explain analyze verbose 
select count(*) from test where active = true;

Aggregate  (cost=41800.79..41800.81 rows=1 width=0) (actual time=500.756..500.756 rows=1 loops=1)
  Output: count(*)
  ->  Bitmap Heap Scan on public.test  (cost=8085.76..39307.79 rows=997200 width=0) (actual time=126.233..386.834 rows=1000000 loops=1)
        Output: id, active
        Filter: test.active
        ->  Bitmap Index Scan on test_active_idx1  (cost=0.00..7836.45 rows=497204 width=0) (actual time=123.398..123.398 rows=1000000 loops=1)
              Index Cond: (test.active = true)
Total runtime: 500.794 ms
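
The definition of the test table is not shown, so the following is only a hypothetical reconstruction for anyone who wants to reproduce the plan. The column names id and active come from the Output line above; the row count, the 50/50 split between true and false, and the presence of a plain index alongside the partial one are assumptions.

```sql
-- Hypothetical test table: only the column names are taken from the plan.
create table test (
  id serial primary key,
  active boolean not null
);

-- Two million rows, alternating false and true, so one million are active.
insert into test (active)
select (n % 2 = 0)
from generate_series(1, 2000000) as n;

-- A plain index like the one Django generates, then the partial index.
create index on test (active);
create index on test (active) where active = true;

-- Refresh planner statistics before looking at plans.
analyze test;

explain analyze verbose
select count(*) from test where active = true;
```

Timings and the chosen plan will differ with hardware, configuration and PostgreSQL version, so treat the numbers above as indicative only.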

When you write queries that you hope will use a partial index, you need to match the expression and WHERE clause. Using WHERE active is true is valid in PostgreSQL, but it doesn't match the WHERE clause in the partial index. That means you'll get a sequential scan again.
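
As a sketch of that point, the two statements below can be compared once the partial index exists. Table and column names are the ones from the question. Planner behavior can differ between PostgreSQL versions, so check the actual plans rather than relying on the comments.

```sql
-- Same expression as the index predicate (active = true),
-- so the planner is able to consider the partial index.
explain
select count(*) from myapp_myuser where active = true;

-- Valid SQL, but written differently from the index predicate.
-- Per the note above, expect a sequential scan here.
explain
select count(*) from myapp_myuser where active is true;
```

The query Django generates in the question already uses the = true form, so it should line up with the partial index as written.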

edited Apr 28, 2014 at 2:00

answered Apr 28, 2014 at 1:55

Mike Sherrill 'Cat Recall'


1 Comment

Cerin Over a year ago

Thanks. I didn't get 500ms, but I did get 2000ms, which is a bit better than 7000ms.
