I want to index my tables for the following query:

select 
    t.*
from main_transaction t 
left join main_profile profile on profile.id = t.profile_id
left join main_customer customer on (customer.id = profile.user_id) 
where
(upper(t.request_no) like upper('%'||@requestNumber||'%') or upper(customer.phone) like upper('%'||@phoneNumber||'%'))
 and t.service_type = 'SERVICE_1'
 and t.status = 'SUCCESS'
 and t.mode = 'AUTO'
 and t.transaction_type = 'WITHDRAW'
 and customer.client = 'corp'
 and t.pub_date >= '2018-09-05' and t.pub_date <= '2018-11-05'
order by t.pub_date desc, t.id asc 
LIMIT 1000;

This is how I tried to index my tables:

CREATE INDEX main_transaction_pr_id ON main_transaction (profile_id);
CREATE INDEX main_profile_user_id ON main_profile (user_id);
CREATE INDEX main_customer_client ON main_customer (client);
CREATE INDEX main_transaction_gin_req_no ON main_transaction USING gin (upper(request_no) gin_trgm_ops);
CREATE INDEX main_customer_gin_phone ON main_customer USING gin (upper(phone) gin_trgm_ops);
CREATE INDEX main_transaction_general ON main_transaction (service_type, status, mode, transaction_type); -- not sure whether this one is right!

Even with the indexes above, my query spends over 4.5 seconds just to select 1000 rows!

I am selecting from the following table, which has 34 columns (including 3 FOREIGN KEYs) and over 3 million rows:

CREATE TABLE main_transaction (
   id integer NOT NULL DEFAULT nextval('main_transaction_id_seq'::regclass),
   description character varying(255) NOT NULL,
   request_no character varying(18),
   account character varying(50),
   service_type character varying(50),
   pub_date" timestamptz(6) NOT NULL,
   "service_id" varchar(50) COLLATE "pg_catalog"."default",
   ....
 );

I am also joining two tables (main_profile, main_customer) to search on customer.phone and to filter on customer.client. To get from main_transaction to main_customer, I can only go through main_profile.

My question is: how can I index my tables to increase performance for the above query?

Please do not suggest UNION for the OR condition (upper(t.request_no) like upper('%'||@requestNumber||'%') or upper(customer.phone) like upper('%'||@phoneNumber||'%')); can we use a CASE WHEN condition instead? I have to convert this PostgreSQL query into Hibernate JPA, and I don't know how to convert a UNION except with Hibernate Native SQL, which I am not allowed to use.
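
This is roughly what I have in mind instead of UNION (just a sketch, I am not sure it is equivalent or any faster):

-- same query as above, but the OR is replaced by a CASE expression
where (case
         when upper(t.request_no) like upper('%'||@requestNumber||'%') then 1
         when upper(customer.phone) like upper('%'||@phoneNumber||'%') then 1
         else 0
       end) = 1
  and t.service_type = 'SERVICE_1'
  ...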

EXPLAIN ANALYZE output:

Limit  (cost=411601.73..411601.82 rows=38 width=1906) (actual time=3885.380..3885.381 rows=1 loops=1)
  ->  Sort  (cost=411601.73..411601.82 rows=38 width=1906) (actual time=3885.380..3885.380 rows=1 loops=1)
        Sort Key: t.pub_date DESC, t.id
        Sort Method: quicksort  Memory: 27kB
        ->  Hash Join  (cost=20817.10..411600.73 rows=38 width=1906) (actual time=3214.473..3885.369 rows=1 loops=1)
              Hash Cond: (t.profile_id = profile.id)
              Join Filter: ((upper((t.request_no)::text) ~~ '%20181104-2158-2723948%'::text) OR (upper((customer.phone)::text) ~~ '%20181104-2158-2723948%'::text))
              Rows Removed by Join Filter: 593118
              ->  Seq Scan on main_transaction t  (cost=0.00..288212.28 rows=205572 width=1906) (actual time=0.068..1527.677 rows=593119 loops=1)
                    Filter: ((pub_date >= '2016-09-05 00:00:00+05'::timestamp with time zone) AND (pub_date <= '2018-11-05 00:00:00+05'::timestamp with time zone) AND ((service_type)::text = 'SERVICE_1'::text) AND ((status)::text = 'SUCCESS'::text) AND ((mode)::text = 'AUTO'::text) AND ((transaction_type)::text = 'WITHDRAW'::text))
                    Rows Removed by Filter: 2132732
              ->  Hash  (cost=17670.80..17670.80 rows=180984 width=16) (actual time=211.211..211.211 rows=181516 loops=1)
                    Buckets: 131072  Batches: 4  Memory Usage: 3166kB
                    ->  Hash Join  (cost=6936.09..17670.80 rows=180984 width=16) (actual time=46.846..183.689 rows=181516 loops=1)
                          Hash Cond: (customer.id = profile.user_id)
                          ->  Seq Scan on main_customer customer  (cost=0.00..5699.73 rows=181106 width=16) (actual time=0.013..40.866 rows=181618 loops=1)
                                Filter: ((client)::text = 'corp'::text)
                                Rows Removed by Filter: 16920
                          ->  Hash  (cost=3680.04..3680.04 rows=198404 width=8) (actual time=46.087..46.087 rows=198404 loops=1)
                                Buckets: 131072  Batches: 4  Memory Usage: 2966kB
                                ->  Seq Scan on main_profile profile  (cost=0.00..3680.04 rows=198404 width=8) (actual time=0.008..20.099 rows=198404 loops=1)
Planning time: 0.757 ms
Execution time: 3885.680 ms
  • Because you are using *, PostgreSQL chooses a seq scan; select only the columns you need, then look at the execution plan again.

1 Answer

With the restriction to not use UNION, you won't get a good plan.

You can slightly speed up processing with the following indexes:

CREATE INDEX ON main_transaction ((service_type::text), (status::text), (mode::text),
                                  (transaction_type::text), pub_date);
CREATE INDEX ON main_customer ((client::text));

These should at least get rid of the sequential scans, but the hash join that takes the lion's share of the processing time will remain.
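
For comparison, this is roughly what the UNION rewrite would look like (a sketch only, reusing the question's parameter placeholders); each branch has a single LIKE condition, so the planner can drive one branch from main_transaction_gin_req_no and the other from main_customer_gin_phone:

select t.*
from main_transaction t
left join main_profile profile on profile.id = t.profile_id
left join main_customer customer on customer.id = profile.user_id
where upper(t.request_no) like upper('%'||@requestNumber||'%')
  and t.service_type = 'SERVICE_1'
  and t.status = 'SUCCESS'
  and t.mode = 'AUTO'
  and t.transaction_type = 'WITHDRAW'
  and customer.client = 'corp'
  and t.pub_date >= '2018-09-05' and t.pub_date <= '2018-11-05'
union
select t.*
from main_transaction t
left join main_profile profile on profile.id = t.profile_id
left join main_customer customer on customer.id = profile.user_id
where upper(customer.phone) like upper('%'||@phoneNumber||'%')
  and t.service_type = 'SERVICE_1'
  and t.status = 'SUCCESS'
  and t.mode = 'AUTO'
  and t.transaction_type = 'WITHDRAW'
  and customer.client = 'corp'
  and t.pub_date >= '2018-09-05' and t.pub_date <= '2018-11-05'
order by pub_date desc, id asc
limit 1000;

Note that UNION (as opposed to UNION ALL) removes duplicates, so a row matching both conditions appears only once, just as with the original OR.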
