I have a partitioned table and created a unique index on it.

I am running some queries against it. Some of them use the primary key index and some use the index I created. I want my queries to use the unique index instead of the primary key index.

I tried reindexing, but that did not help.

Here are two queries:

1) Here the index I created is used.

The query plan is:

Finalize Aggregate  (cost=296958.94..296958.95 rows=1 width=8) (actual time=927.948..927.948 rows=1 loops=1)
->  Gather  (cost=296958.72..296958.93 rows=2 width=8) (actual time=927.887..933.730 rows=3 loops=1)
     Workers Planned: 2
     Workers Launched: 2
     ->  Partial Aggregate  (cost=295958.72..295958.73 rows=1 width=8) (actual time=924.885..924.885 rows=1 loops=3)
           ->  Parallel Append  (cost=0.68..293370.57 rows=1035261 width=8) (actual time=0.076..852.758 rows=825334 loops=3)
                 ->  Parallel Index Only Scan using testdate2019jan_april_cost_mo_user_id_account_id_resource__idx5 on testdate2019jan_april_cost_mod3rem2  (cost=0.68..146591.56 rows=525490 width=8) (actual time=0.082..388.130 rows=421251 loops=3)
                       Index Cond: (user_id = 1)
                       Heap Fetches: 3922
                 ->  Parallel Index Only Scan using testdate2018sept_dec_cost_mod_user_id_account_id_resource__idx5 on testdate2018sept_dec_cost_mod3rem2  (cost=0.68..141570.15 rows=509767 width=8) (actual time=0.057..551.572 rows=606125 loops=2)
                       Index Cond: (user_id = 1)
                       Heap Fetches: 0
                 ->  Parallel Index Scan using testdate2018jan_april_cost_mo_account_id_user_id_resource__idx2 on testdate2018jan_april_cost_mod3rem2  (cost=0.12..8.14 rows=1 width=8) (actual time=0.001..0.001 rows=0 loops=1)
                       Index Cond: (user_id = 1)
                 ->  Parallel Index Scan using testdate2018may_august_cost_m_account_id_user_id_resource__idx1 on testdate2018may_august_cost_mod3rem2  (cost=0.12..8.14 rows=1 width=8) (actual time=0.001..0.001 rows=0 loops=1)
                       Index Cond: (user_id = 1)
                 ->  Parallel Index Scan using testdate2019may_august_cost_m_account_id_user_id_resource__idx2 on testdate2019may_august_cost_mod3rem2  (cost=0.12..8.14 rows=1 width=8)    (actual time=0.002..0.002 rows=0 loops=1)
                       Index Cond: (user_id = 1)
                 ->  Parallel Index Scan using testdate2019sept_dec_cost_mod_account_id_user_id_resource__idx2 on testdate2019sept_dec_cost_mod3rem2  (cost=0.12..8.14 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=1)
                       Index Cond: (user_id = 1)
 Planning Time: 0.754 ms
 Execution Time: 933.797 ms

In the above query my index testdate2018may_august_cost_m_account_id_user_id_resource__idx1 is used like I want.

2) Here the index I created is not used; the primary key index is used instead.

 Sort Method: quicksort  Memory: 25kB
 Buffers: shared hit=2 read=66080
 ->  Finalize GroupAggregate  (cost=388046.40..388187.55 rows=6 width=61) (actual time=510.710..513.262 rows=10 loops=1)
     Group Key: c_1.instance_type, c_1.currency
     Buffers: shared hit=2 read=66080
     ->  Gather Merge  (cost=388046.40..388187.24 rows=12 width=85) (actual time=510.689..513.303 rows=28 loops=1)
           Workers Planned: 2
           Workers Launched: 2
           Buffers: shared hit=26 read=206407
           ->  Partial GroupAggregate  (cost=387046.38..387185.83 rows=6 width=85) (actual time=504.731..507.277 rows=9 loops=3)
                 Group Key: c_1.instance_type, c_1.currency
                 Buffers: shared hit=26 read=206407
                 ->  Sort  (cost=387046.38..387056.71 rows=4130 width=36) (actual time=504.694..504.933 rows=3895 loops=3)
                       Sort Key: c_1.instance_type, c_1.currency
                       Sort Method: quicksort  Memory: 404kB
                       Worker 0:  Sort Method: quicksort  Memory: 354kB
                       Worker 1:  Sort Method: quicksort  Memory: 541kB
                       Buffers: shared hit=20 read=206407
                       ->  Parallel Append  (cost=0.13..386798.33 rows=4130 width=36) (actual time=0.081..501.720 rows=3895 loops=3)
                             Buffers: shared hit=6 read=206405
                             Subplans Removed: 3
                             ->  Parallel Index Scan using testdate2019may_august_cost_mod3rem2_pkey on testdate2019may_august_cost_mod3rem2 c_1  (cost=0.13..8.15 rows=1 width=36) (actual time=0.008..0.008 rows=0 loops=1)
                                   Index Cond: ((usage_start_date >= (CURRENT_DATE - 30)) AND (user_id = '1'::bigint))
                                   Filter: ((instance_type IS NOT NULL) AND ((account_id)::text = '807331824280'::text) AND (usage_end_date <= CURRENT_DATE))
                                   Buffers: shared hit=1
                             ->  Parallel Index Scan using testdate2019sept_dec_cost_mod3rem2_pkey on testdate2019sept_dec_cost_mod3rem2 c_2  (cost=0.13..8.15 rows=1 width=36) (actual time=0.006..0.006 rows=0 loops=1)
                                   Index Cond: ((usage_start_date >= (CURRENT_DATE - 30)) AND (user_id = '1'::bigint))
                                   Filter: ((instance_type IS NOT NULL) AND ((account_id)::text = '807331824280'::text) AND (usage_end_date <= CURRENT_DATE))
                                   Buffers: shared hit=1
                             ->  Parallel Seq Scan on testdate2019jan_april_cost_mod3rem2 c  (cost=0.00..258266.58 rows=4125 width=36) (actual time=0.076..501.060 rows=3895 loops=3)
                                   Filter: ((instance_type IS NOT NULL) AND (user_id = '1'::bigint) AND ((account_id)::text = '807331824280'::text) AND (usage_end_date <= CURRENT_DATE) AND (usage_start_date >= (CURRENT_DATE - 30)))
                                   Rows Removed by Filter: 1504689
                                   Buffers: shared hit=4 read=206405
Planning Time: 1.290 ms
Execution Time: 513.439 ms

In the above query, testdate2019sept_dec_cost_mod3rem2_pkey, which is the primary key index, is used.

I want it to use the index I created instead of the primary key index. Is my second query plan correct, given the partitioning?

Table Creation Queries:

CREATE TABLE a2i.testawscost_line_item (
    line_item_id uuid NOT NULL,
    account_id character varying(255) COLLATE pg_catalog."default",
    availability_zone character varying(255) COLLATE pg_catalog."default",
    base_cost double precision,
    base_rate double precision,
    cost double precision,
    currency character varying(255) COLLATE pg_catalog."default",
    instance_family character varying(255) COLLATE pg_catalog."default",
    instance_type character varying(255) COLLATE pg_catalog."default",
    line_item_type character varying(255) COLLATE pg_catalog."default",
    operating_system character varying(255) COLLATE pg_catalog."default",
    operation character varying(255) COLLATE pg_catalog."default",
    payer_account_id character varying(255) COLLATE pg_catalog."default",
    product_code character varying(255) COLLATE pg_catalog."default",
    product_family character varying(255) COLLATE pg_catalog."default",
    product_group character varying(255) COLLATE pg_catalog."default",
    product_name character varying(255) COLLATE pg_catalog."default",
    rate double precision,
    rate_description character varying(255) COLLATE pg_catalog."default",
    reservation_id character varying(255) COLLATE pg_catalog."default",
    resource_type character varying(255) COLLATE pg_catalog."default",
    sku character varying(255) COLLATE pg_catalog."default",
    tax_type character varying(255) COLLATE pg_catalog."default",
    unit character varying(255) COLLATE pg_catalog."default",
    usage_end_date timestamp without time zone,
    usage_quantity double precision,
    usage_start_date timestamp without time zone,
    usage_type character varying(255) COLLATE pg_catalog."default",
    user_id bigint,
    resource_id character varying(255) COLLATE pg_catalog."default",
    CONSTRAINT testawscost_line_item_pkey PRIMARY KEY 
        (line_item_id, usage_start_date, user_id),
    CONSTRAINT fkptp4hyur3i4yj88wo3rxnaf05 FOREIGN KEY (resource_id)
        REFERENCES a2i.awscost_resource (resource_id) MATCH SIMPLE
            ON UPDATE NO ACTION
            ON DELETE NO ACTION
) PARTITION BY hash(user_id);

The partitions:

create table a2i.testuser_cost_mod3rem0
    partition of a2i.testawscost_line_item
        for values with (MODULUS 3, REMAINDER 0) 
    partition by range(usage_start_date);

create table a2i.testuser_cost_mod3rem1
    partition of a2i.testawscost_line_item
        for values with (MODULUS 3, REMAINDER 1) 
    partition by range(usage_start_date);

create table a2i.testuser_cost_mod3rem2
    partition of a2i.testawscost_line_item
        for values with (MODULUS 3, REMAINDER 2)
    partition by range(usage_start_date);

Partitions of the partitions for 2019:

create table a2i.testdate2019jan_april_cost_mod3rem0
    partition of a2i.testuser_cost_mod3rem0
        for values from ('2019-01-01 00:00:00') to ('2019-05-01 00:00:00');

create table a2i.testdate2019may_august_cost_mod3rem0
    partition of a2i.testuser_cost_mod3rem0
        for values from ('2019-05-01 00:00:00') to ('2019-09-01 00:00:00');

create table a2i.testdate2019sept_dec_cost_mod3rem0
    partition of a2i.testuser_cost_mod3rem0
        for values from ('2019-09-01 00:00:00') to ('2020-01-01 00:00:00');

create table a2i.testdate2019jan_april_cost_mod3rem1
    partition of a2i.testuser_cost_mod3rem1
        for values from ('2019-01-01 00:00:00') to ('2019-05-01 00:00:00');

create table a2i.testdate2019may_august_cost_mod3rem1
    partition of a2i.testuser_cost_mod3rem1
        for values from ('2019-05-01 00:00:00') to ('2019-09-01 00:00:00');

create table a2i.testdate2019sept_dec_cost_mod3rem1
    partition of a2i.testuser_cost_mod3rem1
        for values from ('2019-09-01 00:00:00') to ('2020-01-01 00:00:00');

create table a2i.testdate2019jan_april_cost_mod3rem2
    partition of a2i.testuser_cost_mod3rem2
        for values from ('2019-01-01 00:00:00') to ('2019-05-01 00:00:00');

create table a2i.testdate2019may_august_cost_mod3rem2
    partition of a2i.testuser_cost_mod3rem2
        for values from ('2019-05-01 00:00:00') to ('2019-09-01 00:00:00');

create table a2i.testdate2019sept_dec_cost_mod3rem2
    partition of a2i.testuser_cost_mod3rem2
        for values from ('2019-09-01 00:00:00') to ('2020-01-01 00:00:00');

The index:

CREATE UNIQUE INDEX awscost_line_item_unique_pkey ON a2i.awscost_line_item (
    account_id, user_id, resource_id, usage_start_date, usage_end_date, usage_type,
    usage_quantity, line_item_type, sku, rate, base_rate, base_cost,
    "cost", currency, product_code, operation
);

The query for the first plan is:

  explain analyze select sum(cost) from testawscost_line_item where 
  user_id='1';

The second query:

 explain (analyze/*, buffers*/) SELECT sum (c.cost),
 sum (case when c.resource_type = 'Compute' then c.cost end) as computeCost,
 sum (case when c.resource_type = 'Storage' then c.cost end) as storageCost,
 sum (case when c.resource_type = 'Network' then c.cost end) as networkCost,
 sum (case when c.resource_type not in ('Compute', 'Network', 'Storage') 
 then c.cost end) as otherCost,
 c.currency,
 c.instance_type as productFamily,
 avg (c.rate) FROM testawscost_line_item c WHERE
 (c.user_id ='1') AND (c.account_id = '807331824280') AND
 (c.usage_start_date >= current_date-30 AND c.usage_end_date <= 
 current_date) AND
 (c.instance_type is not null )
 GROUP BY c.instance_type, c.currency
 ORDER BY 1 desc
  • Can you show the queries and the CREATE statements for the table and the indexes? Commented Apr 11, 2019 at 13:19
  • @LaurenzAlbe I added create statements Commented Apr 12, 2019 at 6:19
  • And the queries? Commented Apr 12, 2019 at 7:25
  • @LaurenzAlbe added both queries Commented Apr 12, 2019 at 7:41

1 Answer

The problem is that your index works well for the first query, but not for the second.

resource_id is in your index, but not in your query, so all index columns after that cannot be used for the query. PostgreSQL decides to use the much smaller primary key index.
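
To see how the sizes compare, a catalog query along the following lines could be used. This is only a sketch: the partition name is taken from the plans in the question, and it assumes the partition lives in the a2i schema like the tables in the CREATE statements.

```sql
-- List every index on one leaf partition with its on-disk size.
-- Assumption: the partition is in schema a2i, as in the question's DDL.
SELECT i.indexrelid::regclass                         AS index_name,
       pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size,
       i.indisprimary                                 AS is_primary_key
FROM pg_index AS i
WHERE i.indrelid = 'a2i.testdate2019jan_april_cost_mod3rem2'::regclass
ORDER BY pg_relation_size(i.indexrelid) DESC;
```

A 16-column index would be expected to show up as considerably larger than the 3-column primary key index, which is one reason the planner may cost it higher.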

The perfect index for this query is:

CREATE INDEX ON a2i.testawscost_line_item (user_id, account_id, usage_start_date)
   WHERE instance_type IS NOT NULL;

I assume that the condition on usage_end_date is not more selective than the one on usage_start_date.
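
That assumption can be checked roughly with a count like the one below. It is a sketch that reuses the literal values from the question's second query; the column aliases are made up for illustration.

```sql
-- Compare how many rows pass the usage_start_date condition alone with how
-- many also pass the usage_end_date condition. If the two counts are close,
-- adding usage_end_date to the index would gain little.
SELECT count(*)                                               AS rows_matching_start,
       count(*) FILTER (WHERE usage_end_date <= current_date) AS rows_matching_both
FROM a2i.testawscost_line_item
WHERE user_id = 1
  AND account_id = '807331824280'
  AND usage_start_date >= current_date - 30
  AND instance_type IS NOT NULL;
```

If rows_matching_both turns out to be much smaller than rows_matching_start, it may be worth appending usage_end_date to the index instead of leaving it as a filter.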

2 Comments

Yes, thanks, it works with your suggested index. Does the order of the index columns matter? And can you advise how to choose an index?
Um, this is more than I can explain in a comment... essentially, the conditions have to match the first n index columns (without gaps), then this part of the index can be used. So yes, order definitely matters, and your original index is mostly useless.
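
The "first n index columns without gaps" rule can be tried out on a small throwaway table. Everything in this sketch (table, column, and index names, and the generated data) is hypothetical and only serves to compare two plans. The plans actually chosen depend on the PostgreSQL version and statistics, so no expected output is shown.

```sql
-- Hypothetical demo table; none of these names come from the question.
CREATE TEMP TABLE prefix_demo (a int, b int, c int, payload text);

INSERT INTO prefix_demo
SELECT i % 100, i % 1000, i, 'x'
FROM generate_series(1, 100000) AS i;

CREATE INDEX prefix_demo_a_b_c_idx ON prefix_demo (a, b, c);
ANALYZE prefix_demo;

-- The conditions cover the first two index columns (a, b) without a gap,
-- so both can narrow the range of the index that has to be scanned.
EXPLAIN SELECT * FROM prefix_demo WHERE a = 1 AND b = 1;

-- Column b is skipped here. Only the condition on a narrows the scanned
-- range in the way described above; compare the estimated costs of the two plans.
EXPLAIN SELECT * FROM prefix_demo WHERE a = 1 AND c = 5001;
```

Running both EXPLAIN statements and comparing the estimated costs gives a feel for why column order matters when designing a multicolumn index.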
