Query performance keeps getting worse. Using EXPLAIN, I see a sequential scan inside a nested loop, which is likely the cause. What I don't know is how to improve it.

Here is a link to the query and the explain output: https://explain.depesz.com/s/zmzp I'll include them here, too:

Query:

```sql
SELECT
    "assets".*
FROM
    "assets"
    INNER JOIN "devices" ON "devices"."asset_id" = "assets"."id"
WHERE
    "assets"."archived_at" IS NULL
    AND "assets"."archive_number" IS NULL
    AND "assets"."assettype_id" = 3
    AND ((assets.lastseendate >= NOW() - INTERVAL '30 days')
        AND ((devices.stop_time IS NULL)
            OR (devices.stop_time >= NOW() - INTERVAL '30 days')
            OR (devices.launch_time IS NOT NULL
                AND devices.launch_time > devices.stop_time)))
```

And here is the explain output:

Nested Loop  (cost=0.43..255815.01 rows=11889 width=218) (actual time=0.049..2187.719 rows=359445 loops=1)
  Buffers: shared hit=1499737 read=75
  I/O Timings: read=5.382
  ->  Seq Scan on assets  (cost=0.00..117666.24 rows=27484 width=218) (actual time=0.035..770.720 rows=359543 loops=1)
        Filter: ((archived_at IS NULL) AND (archive_number IS NULL) AND (assettype_id = 3) AND (lastseendate >= (now() - 'P30D'::interval)))
        Rows Removed by Filter: 2539219
        Buffers: shared hit=59691
  ->  Index Scan using devices_asset_id_ix on devices  (cost=0.43..5.02 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=359543)
        Index Cond: (asset_id = assets.id)
        Filter: ((stop_time IS NULL) OR (stop_time >= (now() - 'P30D'::interval)) OR ((launch_time IS NOT NULL) AND (launch_time > stop_time)))
        Rows Removed by Filter: 0
        Buffers: shared hit=1440046 read=75
        I/O Timings: read=5.382
Planning Time: 1.055 ms
Execution Time: 2264.396 ms

The only relevant index is this one:

devices_asset_id_ix

UPDATE: I've added several indexes as listed here:

add_index :devices, [:asset_id, :stop_time, :launch_time], name: "device_online_idx"
add_index :devices, [:asset_id, :stop_time]
add_index :devices, [:asset_id, :launch_time]
add_index :devices, :stop_time
add_index :devices, :launch_time

add_index :assets, [:assettype_id, :archived_at, :archive_number, :lastseendate], name: "asset_unexpired_idx"
add_index :assets, :assettype_id
add_index :assets, :archived_at
add_index :assets, :archive_number
add_index :assets, :lastseendate

This has changed the explain to look like this:

Nested Loop  (cost=0.99..179162.78 rows=11872 width=218) (actual time=0.050..1680.166 rows=359011 loops=1)
  Buffers: shared hit=1726893 read=33
  I/O Timings: read=0.226
  ->  Index Scan using asset_unexpired_idx on assets  (cost=0.56..41125.44 rows=27451 width=218) (actual time=0.037..315.869 rows=359110 loops=1)
        Index Cond: ((assettype_id = 3) AND (archived_at IS NULL) AND (archive_number IS NULL) AND (lastseendate >= (now() - 'P30D'::interval)))
        Buffers: shared hit=288537
  ->  Index Scan using devices_asset_id_ix on devices  (cost=0.43..5.02 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=359110)
        Index Cond: (asset_id = assets.id)
        Filter: ((stop_time IS NULL) OR (stop_time >= (now() - 'P30D'::interval)) OR ((launch_time IS NOT NULL) AND (launch_time > stop_time)))
        Rows Removed by Filter: 0
        Buffers: shared hit=1438356 read=33
        I/O Timings: read=0.226
Planning Time: 1.322 ms
Execution Time: 1757.047 ms

This gave roughly a 22% improvement in execution time (2264 ms down to 1757 ms). Is there any way to substantially improve this further?

  • Please edit your question to clearly show the SQL query by itself. Also include all index definitions. Commented Sep 29, 2021 at 15:33
  • We need EXPLAIN (ANALYZE, BUFFERS) output. Commented Sep 29, 2021 at 15:46
  • I've added the one relevant index definition and changed the explain to use the ANALYZE and BUFFERS options. NOTE: this is using development (test) data, not the actual production data. Commented Sep 29, 2021 at 16:44
  • Your test database should have the same size as production. Otherwise, it is useless for tuning. Commented Sep 29, 2021 at 17:36
  • It looks like I'd benefit from adding indexes to assets.assettype_id, assets.lastseendate, assets.archived_at, and assets.archive_number as well as devices.stop_time and devices.launch_time. Commented Sep 29, 2021 at 17:38

1 Answer

Try a compound B-tree index like so:

CREATE INDEX assets_type_archive_date
    ON assets
       (assettype_id, archived_at, archive_number, lastseendate);

It should help you filter your assets table efficiently. The server can random-access the index to the first eligible row, then scan the index sequentially over the range of lastseendate values.
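
To check that the planner really uses the compound index this way, one option is to run only the assets side of the query under EXPLAIN (ANALYZE, BUFFERS). This is a minimal sketch that reuses the table and column names from the question; the plan shape described afterwards is what one would hope to see, not something verified against this data.

```sql
-- Run only the assets-side filter to see how the planner reads the table.
EXPLAIN (ANALYZE, BUFFERS)
SELECT "assets".*
FROM "assets"
WHERE "assets"."assettype_id" = 3
  AND "assets"."archived_at" IS NULL
  AND "assets"."archive_number" IS NULL
  AND "assets"."lastseendate" >= NOW() - INTERVAL '30 days';
```

If the index is being used as intended, all four predicates should appear under Index Cond rather than under Filter, and there should be no "Rows Removed by Filter" line for assets.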

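For similar reasons, try this index on devices.

-- The index name must differ from the table name: PostgreSQL keeps both in one namespace.
CREATE INDEX devices_asset_id_stop_time
    ON devices
       (asset_id, stop_time);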
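
Both plans in the question estimate about 27,000 rows from assets while roughly 359,000 actually come back, which may be why the planner settles on a nested loop with some 359,000 index probes into devices. As a hedged diagnostic rather than a fix, and separate from the index advice above, one could refresh the statistics and then compare against a plan with nested loops discouraged for a single transaction. `ANALYZE` and `enable_nestloop` are standard PostgreSQL features; whether a hash or merge join is actually faster on this data is an assumption to test.

```sql
-- Refresh planner statistics so row estimates reflect the current data.
ANALYZE assets;
ANALYZE devices;

-- Diagnostic only: discourage nested loops inside this transaction and compare timings.
BEGIN;
SET LOCAL enable_nestloop = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT "assets".*
FROM "assets"
    INNER JOIN "devices" ON "devices"."asset_id" = "assets"."id"
WHERE "assets"."archived_at" IS NULL
  AND "assets"."archive_number" IS NULL
  AND "assets"."assettype_id" = 3
  AND assets.lastseendate >= NOW() - INTERVAL '30 days'
  AND (devices.stop_time IS NULL
       OR devices.stop_time >= NOW() - INTERVAL '30 days'
       OR (devices.launch_time IS NOT NULL
           AND devices.launch_time > devices.stop_time));

-- SET LOCAL ends with the transaction, so the setting does not leak into the session.
ROLLBACK;
```

If the alternative plan turns out to be clearly faster, that points at the row estimate rather than at missing indexes; the setting itself is meant for comparison, not for production use.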

4 Comments

So a compound index would be more efficient than four individual indexes on each of those columns? Is it better to do both?
Try it. It's hard to know without having your data in hand. But this sort of compound index is often the solution to a full table scan (seq scan) slowdown. Lots of single-column indexes often don't help much.
Updated the explain to use production data.
I added the indexes you recommended, which did provide a small improvement. However, the query is still extremely slow.
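
On the question of keeping both the compound and the single-column indexes: one way to find out, sketched here on the assumption that the query has been run a representative number of times since the indexes were created, is to look at how often each index has actually been scanned. `pg_stat_user_indexes` is a standard PostgreSQL statistics view; the table names are the ones from the question.

```sql
-- List indexes on the two tables with their scan counts and on-disk size.
SELECT relname      AS table_name,
       indexrelname AS index_name,
       idx_scan     AS times_scanned,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname IN ('assets', 'devices')
ORDER BY relname, idx_scan;
```

An index whose scan count stays at zero adds write overhead without helping reads, so it is a candidate for removal.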
