Postgres autovacuum keeps transaction IDs close to the 10% limit, causing aggressive vacuuming to trigger now and then, which locks tables

Question

We run multiple Postgres clusters in our infrastructure, with 10-30 databases per cluster ranging from 10 GB to 1 TB in size. I have recently noticed that on all our clusters the transaction ID age is always close to the 10% threshold (autovacuum_freeze_max_age) for aggressive autovacuums. This has caused some instances where heavily used tables got locked while the transaction IDs were being reduced, which has some impact on user experience.

Example outputs of age query

   datname   |    age    | current_setting 
-------------+-----------+-----------------
 db1         | 199952474 | 200000000
 db2         | 199808560 | 200000000
 db3         | 199432374 | 200000000
 db4         | 199409271 | 200000000
 db5         | 198777642 | 200000000
 db6         | 198333349 | 200000000
 db7         | 198113424 | 200000000

Query used

SELECT datname
    , age(datfrozenxid)
    , current_setting('autovacuum_freeze_max_age') 
FROM pg_database 
ORDER BY 2 DESC;

We are running Postgres 12.9 clusters. Servers are running at about 25% CPU load. Our vacuum settings

#------------------------------------------------------------------------------
# AUTOVACUUM
#------------------------------------------------------------------------------

autovacuum = on   
autovacuum_work_mem = 1GB                      
#log_autovacuum_min_duration = -1       
autovacuum_max_workers = 20             
autovacuum_naptime = 1min               
autovacuum_vacuum_threshold = 50        
autovacuum_analyze_threshold = 50       
autovacuum_vacuum_scale_factor = 0.02   
autovacuum_analyze_scale_factor = 0.01  
autovacuum_freeze_max_age = 200000000  
#autovacuum_multixact_freeze_max_age = 400000000       
#autovacuum_vacuum_cost_delay = 2ms    
#autovacuum_vacuum_cost_limit = -1

Our schemas are the same everywhere, but the size of our tables may vary significantly depending on the client. I am aware that I can tune autovacuum per table, but this is not practical in our environment.

So my questions are as follows.

Is it expected behaviour for the transaction ID age to remain close to my 10% threshold?
Is there some tuning I am missing that would tell Postgres to vacuum tables after X amount of time, even if nothing has changed?

Appreciate any help you can offer.

"This has caused some instances where heavy active tables got locked to reduce the transaction id's which has some impact on user experience." You would need to delve into this more deeply. The aggressive vacuum can be a pain in the butt for maintenance tasks, but should not directly affect end-user experience.
– jjanes, Commented Feb 5, 2022 at 1:52
@BurakYurdakul our vacuum_cost_limit is 200, so it is basically the default value.
– Overklog, Commented Feb 8, 2022 at 6:51
It's low. Increase your vacuum_cost_limit so autovacuum has more capacity and doesn't have to kick off aggressively to freeze rows. If I were you, I would start with 2000 and monitor. Also check for bloated tables and indexes.
– Burak Yurdakul, Commented Feb 8, 2022 at 7:46
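
As a minimal sketch of how the cost-limit suggestion above could be tried without editing postgresql.conf, assuming superuser access: the value 2000 is the starting point from the comment, not a tested recommendation. autovacuum_vacuum_cost_limit is the autovacuum-specific setting; at -1 (as in the question's config) it falls back to vacuum_cost_limit.

```sql
-- Raise the autovacuum cost budget; 2000 is the starting value
-- suggested in the comment above, to be adjusted after monitoring.
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;

-- This setting takes effect on a configuration reload; no restart needed.
SELECT pg_reload_conf();

-- Confirm the value now in effect.
SHOW autovacuum_vacuum_cost_limit;
```

Note that this budget is shared among the autovacuum workers that are running at the same time, so with autovacuum_max_workers = 20 each worker may only get a small share of it.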

Laurenz Albe · Accepted Answer · 2022-02-04 10:09:19Z

2

I don't think you have to worry. As long as you have no tables where relfrozenxid exceeds autovacuum_freeze_max_age by a lot, everything is running as it should.
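
To check the condition described above per table rather than per database, a query along these lines could be run in each database. It is only a sketch: the LIMIT of 20 and the column aliases are arbitrary choices for illustration.

```sql
-- List the relations with the oldest relfrozenxid in the current database.
-- relkind: 'r' = ordinary table, 'm' = materialized view, 't' = TOAST table.
SELECT c.oid::regclass                                   AS relation,
       age(c.relfrozenxid)                               AS xid_age,
       current_setting('autovacuum_freeze_max_age')::int AS freeze_max_age,
       pg_size_pretty(pg_total_relation_size(c.oid))     AS total_size
FROM pg_class AS c
WHERE c.relkind IN ('r', 'm', 't')
ORDER BY age(c.relfrozenxid) DESC
LIMIT 20;
```

If xid_age sits only slightly above freeze_max_age for a few relations at a time, autovacuum is keeping up. Large tables far above the limit would be the ones to look at more closely.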

If you have insert-only tables, upgrading to v13 or better would help. From that version on, such tables will receive autovacuum runs earlier, which reduces the size of an anti-wraparound vacuum run.

Neither a normal VACUUM nor an anti-wraparound vacuum will lock the table for INSERT, UPDATE or DELETE.
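
If queries do appear to hang while such a vacuum runs, it may help to look at what is actually blocking them. The following is a sketch using pg_stat_activity and pg_blocking_pids(), both available in v12; the 80-character truncation is an arbitrary choice. One pattern worth checking for is a DDL statement (for example ALTER TABLE) queued behind an anti-wraparound autovacuum, which is not cancelled automatically on a lock conflict; ordinary queries then queue behind the DDL.

```sql
-- Sessions that are currently waiting on a lock, and who is blocking them.
SELECT a.pid,
       a.wait_event_type,
       a.wait_event,
       pg_blocking_pids(a.pid) AS blocked_by,
       left(a.query, 80)       AS query
FROM pg_stat_activity AS a
WHERE cardinality(pg_blocking_pids(a.pid)) > 0;

-- Autovacuum workers that are currently running; anti-wraparound runs
-- show "(to prevent wraparound)" at the end of the query text.
SELECT pid, xact_start, query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';
```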

The problem is that once it goes over 200M txids, the vacuums become more aggressive and that has a user impact. The machines are idle through the night; can autovacuum be made more aggressive so that it cleans all tables BEFORE we reach 200M? Or should this be dealt with manually overnight? Maybe it is not locking for inserts, but we do see queries not completing. It could be related to CPU use, but it has an impact on our clusters that looks like downtime.
– Kobus, Commented Feb 7, 2022 at 12:30
@Kobus, sure, you could set autovacuum_freeze_max_age to a lower value at night (and reload the config files) and then raise it again in the morning.
– jjanes, Commented Feb 7, 2022 at 14:30
@Kobus correction: it requires a full restart, not a reload, which does make it less attractive.
– jjanes, Commented Feb 7, 2022 at 15:04
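
Whether a given parameter needs a restart or only a reload can be checked in pg_settings. This is a small sketch; the list of names is just the settings mentioned in this thread.

```sql
-- context = 'postmaster' means the setting only changes on a server restart;
-- context = 'sighup' means a configuration reload is enough.
SELECT name, setting, context
FROM pg_settings
WHERE name IN ('autovacuum_freeze_max_age',
               'autovacuum_vacuum_cost_limit',
               'autovacuum_max_workers');
```
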
You could lower autovacuum_freeze_max_age for good; then the anti-wraparound vacuum runs will come earlier and hopefully have less to do. You cannot avoid doing the work, but doing it earlier and consequently in smaller batches might help.
– Laurenz Albe, Commented Feb 7, 2022 at 16:39
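
For the "deal with it manually overnight" option raised in the comments, one possible sketch is a psql script that freezes the oldest tables before autovacuum is forced to. The threshold of 150000000 is an assumption chosen only to sit below the configured 200000000, and scheduling the script (for example from cron) is left to the environment. \gexec is a psql meta-command, so this needs to be run through psql, once per database.

```sql
-- Build one VACUUM command per table whose relfrozenxid age exceeds the
-- (assumed) threshold, oldest first, and let psql execute each of them.
SELECT format('VACUUM (FREEZE, VERBOSE) %s', c.oid::regclass)
FROM pg_class AS c
WHERE c.relkind IN ('r', 'm')
  AND age(c.relfrozenxid) > 150000000
ORDER BY age(c.relfrozenxid) DESC
\gexec
```

As noted above, this does not avoid the freezing work; it only moves it to a time when the machines are idle.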
