
I have PostgreSQL 15.3 running as a Docker container. My docker run configuration is -m 512g --memory-swap 512g --shm-size=16g

Using this configuration, I loaded 36B rows, taking up about 30 TB between tables and indexes. I loaded with about 24 parallel operations, and the pg driver was using connection pooling, so there were about 150 open connections, though not all active. My postgresql.conf was set to

max_connections = 512 
max_locks_per_transaction = 1024

All of this ran without any problem.

The 36B-row table is sub-partitioned, and the layer with actual data has about 80,000 tables: the parent is partitioned by category, each category by year, each year by month, and each month by sensor id, which yields the roughly 80K leaf tables holding the data.
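
For reference, the hierarchy looks roughly like this (a minimal sketch with hypothetical table and column names, not my actual DDL):

CREATE TABLE readings (
    category  text        NOT NULL,
    ts        timestamptz NOT NULL,
    sensor_id bigint      NOT NULL,
    value     double precision
) PARTITION BY LIST (category);

CREATE TABLE readings_cat_a PARTITION OF readings
    FOR VALUES IN ('cat_a')
    PARTITION BY RANGE (ts);                          -- by year

CREATE TABLE readings_cat_a_2023 PARTITION OF readings_cat_a
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01')
    PARTITION BY RANGE (ts);                          -- by month

CREATE TABLE readings_cat_a_2023_01 PARTITION OF readings_cat_a_2023
    FOR VALUES FROM ('2023-01-01') TO ('2023-02-01')
    PARTITION BY LIST (sensor_id);                    -- by sensor id

CREATE TABLE readings_cat_a_2023_01_s42 PARTITION OF readings_cat_a_2023_01
    FOR VALUES IN (42);                               -- one of the ~80K leaf tables with data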

My problem is that, for the life of me, I cannot perform a simple COUNT(*) on the top-level table without getting out of shared memory, HINT: You might need to increase max_locks_per_transaction. In fact, I can count from a single table and from a month partition, but not from a year partition, which is only about 10K tables.

Having read that the lock table needs to hold max_connections * max_locks_per_transaction entries in shared memory, I've tried lowering max_connections and increasing both max_locks_per_transaction and shared memory, but without success. Currently I am at -m 512g --memory-swap 512g --shm-size=64g for the container and

max_connections = 8 
max_locks_per_transaction = 163840

in the config, having arrived at these values by increments, of course.

I do have a second table partitioned the same way, with a much smaller volume of data but also 80K tables with data. From what I've read, something like

max_connections = 100 
max_locks_per_transaction = 1600

should cover 160K tables, and if each lock takes 168 bytes, the lock table needs only about 26 MB of shared memory.
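
As a sanity check on that arithmetic (assuming the 168 bytes per lock entry mentioned above):

SELECT 100 * 1600                                 AS lock_slots,
       pg_size_pretty((100 * 1600 * 168)::bigint) AS approx_lock_table_size;

 lock_slots | approx_lock_table_size
------------+------------------------
     160000 | 26 MB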

Any suggestions on what I should be modifying and how I should be calculating the target values?

  • Why so many partitions? Looks like you have an average of just 450,000 records per partition, which is next to nothing. When aiming for 500 million records per table, you would only need 70+ partitions. I would start by removing the month partitions; that alone would reduce the number of partitions by a factor of 12. Commented Jan 21, 2024 at 21:21
  • @FrankHeikens If there is ultimately a limit to the number of partitions Postgres can allow, then yes, I would work around it, but I've yet to see such a limit documented. The data is not evenly distributed and many sensors will have 2-4 million rows. However, the main reason is that the data is continually updated at the sensor level, so DROP/ATTACH PARTITION at that level minimizes overall locking during updates, a strategy recommended in the docs at 5.11.2.2. Partition Maintenance. Commented Jan 21, 2024 at 21:45
  • Have you verified the max_* config settings are actually in use? How did you go about setting these on an already existing database in a docker container? Commented Jan 21, 2024 at 22:47
  • @jjanes One simply mounts a postgresql.conf file into the container, and the configuration is read whenever the container is started. The settings can be edited just like on an on-host instance. Commented Jan 21, 2024 at 23:15
  • I would first validate the entire strategy. For me, partitioning for a mere 450,000 records doesn't make sense; that's too small. Instead of months, you could stick with years, and the average partition would be 5.4 million records. Still a small number of records, but you would end up with "just" 6,700 tables instead of 80,000. Another issue would be autovacuum: vacuuming 80,000 tables is very different from vacuuming just 6,700. Commented Jan 21, 2024 at 23:31

2 Answers


Having a ridiculous number of partitions is going to require a ridiculous number of locks. The max_connections * max_locks_per_transaction values you show should be able to handle 80,000 tables in the simple case of unindexed tables with flat partitioning, but indexes also need to be locked during planning even if they don't end up being used in the execution. The deep partitioning structure will also require some more locks, but without a test script to reproduce your exact structure I haven't tested to see how many it would need.

Handling partitions gets more and more unwieldy the more partitions you have. That is going to put a practical limit on the number of partitions you can have long before you reach a hard limit. I would say you are already over the practical limit.

These problems are just internal to PostgreSQL. If you were hitting memory problems at the OS/VM level, I would expect those to fail with different error messages than the one you see.

The locking issues should be the same whether the tables are populated or just exist but are empty, so if you need to test things, it should be easy to do just by exporting the structure without the data (e.g. pg_dump -s).
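
For example (database names are placeholders), something along these lines reproduces the locking behavior without copying 30 TB of data:

$ # dump only the table and index definitions, no data
$ pg_dump -s -d mydb -f schema.sql
$ # restore into a throwaway database and run the COUNT(*) there
$ createdb locktest
$ psql -d locktest -f schema.sql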


1 Comment

Okay, "indexes also need to be locked even if not used in the execution". That explains why doing a COUNT(*) on one lowest-level partition required 11 locks, why at the month level, with 700 tables, it was taking out 20,000 locks, and why this is now unmanageable for Postgres at the top partition level. Yes, the partitions are moderately indexed, including the sub-partition table definitions, to match the shaping and analytics needs. That was never a problem, and was in fact necessary, with the non-partitioned version.

The final answer is indeed "you have too many partitions".

HOWEVER, the actual solution to my specific question lies in the behavior of the PostgreSQL Docker image: specifically, it seems to ignore max_locks_per_transaction if it is modified in a config file attached to the container. It needs to be passed on the docker run command line as a -c option instead.

Thanks @jjanes for suggesting that I verify the values were actually being read.

When you run a postgres container, you can attach your own config file:

$ # run postgres with a custom config file
$ docker run -d --name some-postgres \
             -v "$PWD/my-postgres.conf":/etc/postgresql/postgresql.conf \
             postgres -c 'config_file=/etc/postgresql/postgresql.conf'

Of course, you need to stop and restart the container for a configuration change to take effect. If you modify the config file with

max_connections = 333   (some non-default number)

start the container, and check the setting, you see the new value:

SELECT name, setting FROM pg_settings WHERE name = 'max_connections';
      name       | setting
-----------------+---------
 max_connections | 333

Now do the same with max_locks_per_transaction, and for the postgres:15.3 image, no matter what you set it to, you get

      name                 | setting
---------------------------+---------
 max_locks_per_transaction | 256

This is hinted at in "Cant change max_locks in docker" and in "Change max_locks for github actions".

The setting seems to be respected only if it is passed on the docker run command line:

$ docker run -d --name some-postgres \
             -v "$PWD/my-postgres.conf":/etc/postgresql/postgresql.conf \
             postgres -c 'config_file=/etc/postgresql/postgresql.conf' \
                      -c max_locks_per_transaction=1024
Once I did this and set max_locks_per_transaction to a ludicrous number (500,000), I was able to issue a count against the next-higher partition level.

That said, the second part of the answer lies in a detail that most discussions of this very issue seem not to mention: max_locks_per_transaction * max_connections needs to cover not just all of the partitioned tables, but also each and every index on them. Thanks @jjanes again.

In my case, each table has about 10 indexes. A query against one table needed 11 locks, against the month partition 21,000 locks, and against the year partition 218,000 locks. Clearly unsustainable.
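
If you want to see these counts for yourself, one way (a sketch; the parent table name is a placeholder) is to run the query inside an open transaction and look at pg_locks before committing, since the planner's locks are held until the transaction ends:

BEGIN;
EXPLAIN SELECT count(*) FROM readings;  -- planning alone locks every partition and index
SELECT count(*) FROM pg_locks WHERE pid = pg_backend_pid();
ROLLBACK;
-- the count includes a handful of the session's own bookkeeping locks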

The sobering fact is that PostgreSQL partitioning does not scale horizontally as comfortably as many other features, and it may not lend itself well to matching the logical organization of your business domain.

Update:

Happy outcome to the story. I collapsed my sensor partitions up to the month level; for category/year/month I ended up with 265 partitions (2/10/12). With the same indexes in place, I could query

     month   49 locks
      year  382 locks
full table 5128 locks

and queries executed with a 4x improvement in speed and parallelization over the non-partitioned table.
