Is it possible to optimize the following query? webdte.docto is a very large table with millions of entries and has indexes on all queried columns. The final sort order is quite important.

SELECT 
   id_doc,
   id_tip_doc,
   id_est_doc,
   folios.nro_fol,
   seleccionable
FROM
(
   SELECT distinct(nro_fol)
   FROM webdte.docto 
   WHERE
      id_tip_doc IN
      (
         SELECT distinct(id_tip_doc)
         FROM webdte.docto
         WHERE id_doc IN
         (
            SELECT id_doc
            FROM webdte.lib_doc
            WHERE id_lib = 37
         )
      ) AND
      id_doc IN
      (
         SELECT id_doc
         FROM webdte.lib_doc
         WHERE id_lib = 37
      )
) AS folios JOIN webdte.docto AS docs ON docs.nro_fol = folios.nro_fol
ORDER BY id_tip_doc, folios.nro_fol, id_est_doc;

Sorry, here is the EXPLAIN output for my first query approach. The answer from Egalitarian is already good, but can it be made even faster? Thank you!

Sort  (cost=13745.13..13805.42 rows=24115 width=22)
  Sort Key: docs.id_tip_doc, docto.nro_fol, docs.id_est_doc
  ->  Hash Join  (cost=9240.19..11492.84 rows=24115 width=22)
        Hash Cond: (docto.nro_fol = docs.nro_fol)
        ->  HashAggregate  (cost=4424.81..4665.91 rows=24110 width=6)
              ->  Hash Semi Join  (cost=733.75..4364.54 rows=24110 width=6)
                    Hash Cond: (docto.id_doc = lib_doc.id_doc)
                    ->  Seq Scan on docto  (cost=0.00..2885.28 rows=105128 width=10)
                    ->  Hash  (cost=432.38..432.38 rows=24110 width=4)
                          ->  Seq Scan on lib_doc  (cost=0.00..432.38 rows=24110 width=4)
                                Filter: (id_lib = 37)
        ->  Hash  (cost=2885.28..2885.28 rows=105128 width=22)
              ->  Seq Scan on docto docs  (cost=0.00..2885.28 rows=105128 width=22)
    Could you show us the results from EXPLAIN and EXPLAIN ANALYZE? Without this information, it's next to impossible to optimize the query because you can't see where the actual problems are. Only guess... Commented Jul 23, 2012 at 9:51
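For reference, a minimal sketch of how the requested output can be captured in PostgreSQL. EXPLAIN alone prints the planner's estimate, while EXPLAIN ANALYZE actually executes the statement and adds real row counts and timings. The inner SELECT below is a shortened stand-in for the full query from the question, not the query itself.

```sql
-- Estimated plan only; the statement is not executed.
EXPLAIN
SELECT DISTINCT nro_fol
FROM   webdte.docto
WHERE  id_doc IN (SELECT id_doc FROM webdte.lib_doc WHERE id_lib = 37);

-- Runs the statement and reports actual row counts and timings per plan node.
EXPLAIN ANALYZE
SELECT DISTINCT nro_fol
FROM   webdte.docto
WHERE  id_doc IN (SELECT id_doc FROM webdte.lib_doc WHERE id_lib = 37);
```

Comparing estimated and actual row counts node by node usually shows where the planner's assumptions diverge from the data.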

2 Answers

I think you can simplify to:

SELECT id_doc
      ,id_tip_doc
      ,id_est_doc
      ,nro_fol
      ,seleccionable
FROM   webdte.docto d
WHERE  EXISTS (
   SELECT 1
   FROM   webdte.docto   d0
   JOIN   webdte.lib_doc l USING (id_doc)
   WHERE  l.id_lib = 37
   AND    d0.nro_fol = d.nro_fol
   )
ORDER  BY id_tip_doc, nro_fol, id_est_doc;

Because of EXISTS, DISTINCT should not be needed. This can speed up the query quite a bit if there are many duplicates on nro_fol.
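Your original query was quite redundant: every row whose id_doc belongs to id_lib = 37 necessarily has an id_tip_doc in the set collected from those same rows, so the id_tip_doc IN (...) filter removes nothing as long as id_tip_doc is never NULL.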
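Before switching queries, it may be worth confirming that the rewrite returns the same rows on your data. The following is only a sketch of such a check: it lists rows produced by the original query that are missing from the EXISTS version. It assumes that id_doc uniquely identifies a row in webdte.docto, which the question does not state.

```sql
-- Rows returned by the original query but not by the EXISTS rewrite.
-- Assumption: id_doc is unique in webdte.docto (EXCEPT compares sets, not duplicates).
SELECT docs.id_doc
FROM  (
   SELECT DISTINCT nro_fol
   FROM   webdte.docto
   WHERE  id_tip_doc IN (
             SELECT id_tip_doc
             FROM   webdte.docto
             WHERE  id_doc IN (SELECT id_doc FROM webdte.lib_doc WHERE id_lib = 37)
          )
   AND    id_doc IN (SELECT id_doc FROM webdte.lib_doc WHERE id_lib = 37)
   ) AS folios
JOIN   webdte.docto AS docs ON docs.nro_fol = folios.nro_fol

EXCEPT

SELECT d.id_doc
FROM   webdte.docto d
WHERE  EXISTS (
   SELECT 1
   FROM   webdte.docto   d0
   JOIN   webdte.lib_doc l USING (id_doc)
   WHERE  l.id_lib = 37
   AND    d0.nro_fol = d.nro_fol
   );
```

Swapping the two halves around EXCEPT checks the other direction. An empty result both ways suggests the two queries select the same set of rows.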

I think the WHERE clause that fetches the unique id_tip_doc values is of little significance, since you are selecting distinct(nro_fol) anyway. One of the best ways to optimize this query is to make sure the proper indexes exist and then rewrite the query.

You can create the following indexes (though this also depends on your other queries):

1. webdte.lib_doc: id_lib
2. webdte.docto: id_doc + nro_fol

SELECT id_doc, id_tip_doc, id_est_doc, folios.nro_fol, seleccionable
FROM (
   SELECT DISTINCT nro_fol
   FROM   webdte.docto
   WHERE  id_doc IN (SELECT id_doc FROM webdte.lib_doc WHERE id_lib = 37)
) folios
JOIN webdte.docto docs ON docs.nro_fol = folios.nro_fol
ORDER BY id_tip_doc, folios.nro_fol, id_est_doc;
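The two indexes suggested in this answer could be created roughly as follows. This is a sketch only: the index names are made up for illustration, and since the question says indexes already exist on the queried columns, some of these may be redundant.

```sql
-- Hypothetical index names; skip any index that already exists in your schema.
CREATE INDEX lib_doc_id_lib_idx       ON webdte.lib_doc (id_lib);
CREATE INDEX docto_id_doc_nro_fol_idx ON webdte.docto   (id_doc, nro_fol);

-- Refresh planner statistics so the new indexes are costed with current data.
ANALYZE webdte.lib_doc;
ANALYZE webdte.docto;
```

Whether the planner actually uses them depends on how selective id_lib = 37 is. The plan posted in the question chooses sequential scans, so it is worth re-running EXPLAIN ANALYZE after creating the indexes.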
