My application uses PostgreSQL, and the summary table now holds one million records.

When I run the following query, it takes 80,927 ms:

SELECT COUNT(*) AS count
FROM summary_views
GROUP BY question_id,category_type_id

Is there any efficient way to do this?

  • And the version of Postgres is...? Commented Apr 4, 2014 at 18:59
  • Anyway, I'd suggest checking this wiki page for a start; it has a couple of very useful hints. Commented Apr 4, 2014 at 19:00
  • ayende.com/blog/164772/the-cost-of-select-count-from-tbl COUNT is expensive for B-trees, the data structure that relational databases commonly use. I use the approximation pointed out in the wiki article posted by @raina77ow quite often, and sometimes an additional table with a single counter. Commented Apr 4, 2014 at 19:05
  • One million records is not very many, and 80 seconds seems pretty slow to count them. But without knowing what summary_views is, or seeing an EXPLAIN (ANALYZE, BUFFERS) for the query, there is not much one can say. See wiki.postgresql.org/wiki/Slow_Query_Questions Commented Apr 4, 2014 at 19:20
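
For reference, the two suggestions from these comments could look roughly like the sketch below. It assumes summary_views is an ordinary table in the current search path; if it is a view, the estimate query is not meaningful, because a view has no stored row estimate of its own.

```sql
-- Show the actual plan, timings and buffer usage for the slow query.
-- Note that EXPLAIN ANALYZE really executes the statement.
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) AS count
FROM summary_views
GROUP BY question_id, category_type_id;

-- Approximate total row count taken from the planner statistics.
-- No scan is involved; the value is only as fresh as the last
-- VACUUM or ANALYZE of the table.
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE oid = 'summary_views'::regclass;
```

The estimate covers the whole table only, so it does not replace the per-group counts the question asks for.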

1 Answer

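COUNT(*) in PostgreSQL tends to be slow. This is a consequence of MVCC: PostgreSQL has to determine which rows are visible to the current transaction, so it cannot simply return a stored total. One workaround is a row-counting trigger with a helper table: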

create table table_count(
        table_count_id text primary key,  -- name of the counted table
        rows int default 0                -- current number of rows in it
);

CREATE OR REPLACE FUNCTION table_count_update()
RETURNS trigger AS
$BODY$
begin
    -- adjust the counter of the table that fired the trigger
    if tg_op = 'INSERT' then
        update table_count set rows = rows + 1
            where table_count_id = TG_TABLE_NAME;
    elsif tg_op = 'DELETE' then
        update table_count set rows = rows - 1
            where table_count_id = TG_TABLE_NAME;
    end if;
    return null;  -- the return value of an AFTER row trigger is ignored
end;
$BODY$
LANGUAGE plpgsql VOLATILE;

The next step is to add a trigger declaration for each table you'd like to use it with. For example, for a table named tab_name:

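begin;
-- block concurrent writes so none can slip in between the count and the trigger
lock table tab_name in share row exclusive mode;

insert into table_count values
    ('tab_name',(select count(*) from tab_name));

create trigger tab_name_table_count after insert or delete
on tab_name for each row execute procedure table_count_update();
commit;

It is important to run this in a transaction block so that the actual count and the helper table stay in sync if rows are inserted or deleted between the initial count and the trigger creation. The explicit lock makes this hold even when another session has an uncommitted insert or delete in flight: it waits for such transactions to finish and blocks new writes until the commit. From now on, to get the current count instantly, just run: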

select rows from table_count where table_count_id = 'tab_name';

Edit: For your GROUP BY clause, you'll need a more sophisticated trigger function and count table, with one counter row per group.
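
A minimal sketch of such a per-group variant follows. It is not tested against your schema and rests on several assumptions: summary_views is an ordinary table (not a view), question_id and category_type_id are non-null integers, and the server is PostgreSQL 9.5 or later, because the function uses INSERT ... ON CONFLICT. The names summary_views_count, row_count, summary_views_count_update and summary_views_count_trg are made up for this example.

```sql
-- Hypothetical helper table: one counter row per group.
create table summary_views_count(
    question_id      int not null,
    category_type_id int not null,
    row_count        bigint not null default 0,
    primary key (question_id, category_type_id)
);

create or replace function summary_views_count_update()
returns trigger as
$BODY$
begin
    -- a deleted row, or the old version of an updated row, leaves its group
    if tg_op in ('DELETE', 'UPDATE') then
        update summary_views_count
            set row_count = row_count - 1
            where question_id = old.question_id
              and category_type_id = old.category_type_id;
    end if;
    -- an inserted row, or the new version of an updated row, joins its group;
    -- the counter row is created the first time a group appears
    if tg_op in ('INSERT', 'UPDATE') then
        insert into summary_views_count as c
            (question_id, category_type_id, row_count)
        values (new.question_id, new.category_type_id, 1)
        on conflict (question_id, category_type_id)
        do update set row_count = c.row_count + 1;
    end if;
    return null;  -- ignored for AFTER row triggers
end;
$BODY$
language plpgsql;

begin;
-- block concurrent writes while the initial counts are taken
lock table summary_views in share row exclusive mode;

insert into summary_views_count (question_id, category_type_id, row_count)
    select question_id, category_type_id, count(*)
    from summary_views
    group by question_id, category_type_id;

create trigger summary_views_count_trg
after insert or delete or update of question_id, category_type_id
on summary_views for each row execute procedure summary_views_count_update();
commit;

-- the grouped counts are now a plain read of the helper table
select question_id, category_type_id, row_count
from summary_views_count;
```

The trade-off is that every write to summary_views now also updates a counter row, so sessions writing to the same group at the same time will wait on each other. Groups whose counter has dropped to zero stay in the helper table unless you delete them separately.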
