1

I have a large PostgreSQL database that is effectively read-only, except for very infrequent batch updates. Are there any performance optimizations I can apply to take advantage of this? For example, can or should I disable visibility checks?

The largest table:

CREATE TABLE "gene_measurements" (
  "gene" INTEGER NOT NULL REFERENCES "genes" ON DELETE CASCADE,
  "sample" INTEGER NOT NULL REFERENCES "samples" ON DELETE CASCADE,
  "value" REAL NOT NULL
);
CREATE UNIQUE INDEX "gene_measurements_unique_1" ON "gene_measurements" ("sample", "gene") INCLUDE ("value");

A typical query:

SELECT value FROM gene_measurements WHERE gene = 1 AND sample = 2

And the plan:

-------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using gene_measurements_gene_index on gene_measurements  (cost=0.57..4.59 rows=1 width=4) (actual time=63.621..63.621 rows=0 loops=1)
   Index Cond: ((sample = 2) AND (gene = 1))
   Heap Fetches: 0
 Planning Time: 0.156 ms
 Execution Time: 63.674 ms
(5 rows)
  • Is this causing performance issues? What exact optimization problem are you trying to solve? Commented May 11, 2020 at 6:41
  • Yes, I make multiple queries to a large table and I would like them to be faster. Everything is already indexed as far as possible. It's primarily index-only scans that I would like to be faster if possible. Commented May 11, 2020 at 6:51
  • Have you tried other strategies, such as Redis, for the more frequently accessed items? Are there other alternatives? Commented May 11, 2020 at 6:54
  • If you have a performance problem, please edit your question and add the query you are having a problem with, along with the execution plan generated using explain (analyze, buffers, format text) (not just a "simple" explain) as formatted text, and make sure you keep the indentation of the plan. Paste the text, then put ``` on the line before the plan and on the line after the plan. Please also include the complete create index statements for all indexes. Commented May 11, 2020 at 7:01
  • Have you read this post? dba.stackexchange.com/questions/42290/… Commented May 11, 2020 at 7:10

2 Answers

2

PostgreSQL has no dedicated features that make queries faster just because the underlying data is read-only.

You can still get potentially significant performance improvements if the application knows how to exploit this: application code can cache the results of identical queries. You can also create materialized views in the database and have the application query those views instead of running the queries used to build them. In both cases, however, you need to modify application code.
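As a minimal sketch of the materialized-view approach, assuming the application repeatedly needs a per-gene aggregate (the view name "gene_mean_values", its index, and the aggregate itself are hypothetical and not taken from the question):

```sql
-- Hypothetical example: precompute per-gene statistics once,
-- instead of aggregating the large table on every request.
CREATE MATERIALIZED VIEW "gene_mean_values" AS
SELECT "gene",
       avg("value") AS "mean_value",
       count(*)     AS "n_samples"
FROM "gene_measurements"
GROUP BY "gene";

-- Index the view so lookups by gene stay cheap.
CREATE UNIQUE INDEX "gene_mean_values_gene" ON "gene_mean_values" ("gene");

-- Re-run after each of the infrequent batch updates.
REFRESH MATERIALIZED VIEW "gene_mean_values";

-- The application then queries the view instead of the base table.
SELECT "mean_value" FROM "gene_mean_values" WHERE "gene" = 1;
```

This only pays off for derived or aggregated results. A single-row lookup like the one in the question is already served by an index-only scan, so a materialized view is unlikely to make it faster.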

0

Depending on the data type you index, index types other than the default B-tree (O(log N) search) might trade insert/delete speed for faster search (read) speed, if that would help.
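One alternative index type to try is a hash index. The following is only a sketch for benchmarking, and the index name is hypothetical. Note that PostgreSQL hash indexes cover a single column, support only equality comparisons, and cannot be used for index-only scans, so for the two-column lookup in the question they may well turn out slower than the existing covering B-tree.

```sql
-- Hypothetical index name; a hash index accepts exactly one column
-- and is only usable for equality (=) conditions.
CREATE INDEX "gene_measurements_sample_hash"
  ON "gene_measurements" USING hash ("sample");

-- Compare the resulting plan and timings against the B-tree index
-- before keeping the new index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT "value" FROM "gene_measurements" WHERE "gene" = 1 AND "sample" = 2;
```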

Also, since you use a database to feed a machine learning model with training data, consider retrieving multiple rows in a single query (~batch). Depending on the data and its indexing, this might help. Test before production.
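A batched lookup could look roughly like the sketch below. The (gene, sample) pairs are made-up placeholders and "wanted" is just an illustrative alias. The idea is to fetch many measurements in one round trip instead of issuing one query per pair.

```sql
-- Fetch several (gene, sample) pairs in a single query.
-- The pairs below are placeholder values for illustration only.
SELECT m."gene", m."sample", m."value"
FROM "gene_measurements" AS m
JOIN (VALUES (1, 2), (1, 3), (7, 2)) AS wanted ("gene", "sample")
  ON m."gene" = wanted."gene"
 AND m."sample" = wanted."sample";
```

Pairs with no matching row are simply absent from the result, so the application has to handle missing measurements itself.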
