
I have the following Query/View:

CREATE OR REPLACE VIEW "SumAndSalesPerCountryYear" AS 
 SELECT date_part('year'::text, "Invoice"."InvoiceDate") AS year,
    "Invoice"."BillingCountry" AS country,
    sum("Invoice"."Total") AS total
   FROM "Invoice"
  GROUP BY date_part('year'::text, "Invoice"."InvoiceDate"), "Invoice"."BillingCountry"
  ORDER BY date_part('year'::text, "Invoice"."InvoiceDate") DESC, sum("Invoice"."Total") DESC;

My table structure is as follows:

CREATE TABLE "Invoice"
(
  "InvoiceId" integer NOT NULL,
  "CustomerId" integer NOT NULL,
  "InvoiceDate" timestamp without time zone NOT NULL,
  "BillingAddress" character varying(70),
  "BillingCity" character varying(40),
  "BillingState" character varying(40),
  "BillingCountry" character varying(40),
  "BillingPostalCode" character varying(10),
  "Total" numeric(10,2) NOT NULL,
  CONSTRAINT "PK_Invoice" PRIMARY KEY ("InvoiceId"),
  CONSTRAINT "FK_InvoiceCustomerId" FOREIGN KEY ("CustomerId")
      REFERENCES "Customer" ("CustomerId") MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
  OIDS=FALSE
);

The current execution plan is

Sort  (cost=33.65..34.54 rows=354 width=21) (actual time=0.691..0.698 rows=101 loops=1)
  Sort Key: (date_part('year'::text, "Invoice"."InvoiceDate")), (sum("Invoice"."Total"))
  Sort Method: quicksort  Memory: 32kB
  ->  HashAggregate  (cost=14.24..18.67 rows=354 width=21) (actual time=0.540..0.567 rows=101 loops=1)
        ->  Seq Scan on "Invoice"  (cost=0.00..11.15 rows=412 width=21) (actual time=0.015..0.216 rows=412 loops=1)
Total runtime: 0.753 ms

My task is to optimize the query by using indices; however, I cannot think of a way to use indices to optimize aggregate results.

  • Indexes will rarely be helpful without a where clause. You should mention the typical use case (with another explain analyze) if you want a useful answer. Otherwise, getting the full list of all invoices in the amount of time you suggested seems reasonable. Naively, I would think an index on date_part('year'::text, "Invoice"."InvoiceDate"), BillingCountry, or some combination of the two could be useful, depending on which you want to limit search results by (a sketch follows these comments). Commented Dec 6, 2014 at 21:21
  • Your query runs for less than one millisecond. How fast do you need it to be? One nanosecond? Commented Dec 6, 2014 at 21:46
  • Consider using a materialized view if you don't need fresh data: postgresql.org/docs/9.3/static/rules-materializedviews.html. The view can be refreshed every night; its results will then include data only up to the previous day, but it will be very fast (a sketch of this also follows these comments). Commented Dec 6, 2014 at 23:24
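
For reference, here is a minimal sketch of the expression index suggested in the first comment. The index name is made up, and whether the planner ever uses it depends on your queries filtering on these columns:

-- Hypothetical composite expression index on invoice year and billing country.
-- date_part() on timestamp WITHOUT time zone is IMMUTABLE, so it is allowed
-- in an index; on timestamptz it would be rejected as merely STABLE.
CREATE INDEX "IX_Invoice_YearCountry"
    ON "Invoice" (date_part('year'::text, "InvoiceDate"), "BillingCountry");

A query can only use this index if it repeats the indexed expression verbatim, e.g. WHERE date_part('year'::text, "InvoiceDate") = 2013.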
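
And a sketch of the materialized view suggested in the last comment (available since PostgreSQL 9.3); the view name is invented:

-- Hypothetical materialized view; stores the aggregate physically on disk.
CREATE MATERIALIZED VIEW "SumAndSalesPerCountryYearMat" AS
SELECT date_part('year'::text, "InvoiceDate") AS year,
       "BillingCountry"                       AS country,
       sum("Total")                           AS total
FROM "Invoice"
GROUP BY 1, 2;

-- Run nightly (e.g. from cron); readers then see data as of the last refresh.
REFRESH MATERIALIZED VIEW "SumAndSalesPerCountryYearMat";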

2 Answers


You can try to penalize HashAggregate with "SET enable_hashagg TO off", but with data this small there will probably be no benefit from an index. In this use case HashAggregate is usually the fastest aggregation method, and sorting 32kB is very quick.
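
If you want to see the alternative plan for yourself, the setting can be toggled for the current session only; this is a diagnostic knob, not something to leave on in production:

-- Session-local experiment: discourage HashAggregate so the planner
-- falls back to sorting the input and using GroupAggregate.
SET enable_hashagg TO off;
EXPLAIN ANALYZE SELECT * FROM "SumAndSalesPerCountryYear";
RESET enable_hashagg;  -- restore the default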

But you are trying to run a performance benchmark on a table with 412 rows, which is meaningless. Thinking about performance only makes sense on data of a size corresponding to 2..3 years of production usage.
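
One way to benchmark at such a scale (a sketch only: the row count, date range, and country list are invented, and it assumes "CustomerId" values 1-59 already exist in "Customer" so the foreign key is satisfied) is to bulk-load synthetic invoices:

-- Hypothetical bulk load: one million synthetic invoices over ~3 years.
INSERT INTO "Invoice" ("InvoiceId", "CustomerId", "InvoiceDate",
                       "BillingCountry", "Total")
SELECT g,
       (g % 59) + 1,                                  -- recycle customer ids
       timestamp '2012-01-01' + (g % 1096) * interval '1 day',
       (ARRAY['USA','Canada','Germany','Brazil'])[(g % 4) + 1],
       round((random() * 25)::numeric, 2)
FROM generate_series(1000, 1000999) AS g;             -- ids avoid existing rows

ANALYZE "Invoice";  -- refresh planner statistics before re-running EXPLAIN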




As noted by Pavel, Ramfjord, and horse, an index is of little use with such a tiny amount of data. The table is so small that it is faster for Postgres to read a disk page or two and process everything in memory.

Further, you already have the best possible plan for your query. You're asking Postgres to compute an aggregate over an entire table and return it in a certain order. Postgres computes the aggregate in memory without sorting the data first, accumulating intermediate results in a hash table; it then sorts the small number of grouped results according to your criteria.
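
Where an index could start to pay off on a much larger table is when only part of the table has to be aggregated. A hypothetical example, assuming the expression index sketched under the question exists and the table holds millions of rows:

-- With a selective filter that matches the indexed expression verbatim,
-- the planner can scan the index instead of the whole table.
EXPLAIN ANALYZE
SELECT "BillingCountry", sum("Total")
FROM "Invoice"
WHERE date_part('year'::text, "InvoiceDate") = 2013
GROUP BY "BillingCountry";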

