
A huge pile of WAL files is generated in our master-standby replication setup. The WAL files are archived on one of the standby nodes, and every 2 hours we use tar to compress the archived WALs on that node. Even so, the archive grows very large; with 30 or 90 days of backups it becomes a real storage issue, and it also takes more time to download and replay the WALs during restoration.

I have set the following options:

wal_level = replica
wal_compression = on
archive_mode = always
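
For reference, the values the running server is actually using can be confirmed from psql on whichever node you are inspecting; a minimal sketch using the standard pg_settings view:

-- Show the WAL/archiving settings in effect on this node
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('wal_level', 'wal_compression', 'archive_mode',
               'archive_timeout', 'checkpoint_timeout', 'max_wal_size');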

The following parameters are commented out / left at their defaults:

archive_timeout
checkpoint_timeout

Is there any other way we can reduce the number of WAL files generated, or an easier way to manage them? pg_waldump shows that around 70-90% of the data is full-page images.

Also, can I put the above parameters into effect by changing them on the standby node? Does the standby archive the same WAL sent by the master, or does it regenerate WAL based on the standby's own configuration?

Update: modified to the values below:

        name        | setting | unit
--------------------+---------+------
 archive_timeout    | 0       | s
 checkpoint_timeout | 3600    | s
 checkpoint_warning | 3600    | s
 max_wal_size       | 4000    | MB
 min_wal_size       | 2000    | MB
 shared_buffers     | 458752  | 8kB
 wal_buffers        | 4096    | 8kB
 wal_compression    | on      |
 wal_level          | replica |

I am still seeing 3-4 WAL files generated every minute. I am making these changes on the hot standby node (the one the backup is taken from). Should I change them on the master instead? Do the master's settings affect the standby's WAL generation?
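
One way to put a number on the WAL rate, rather than counting 16MB segments by hand, is to diff WAL positions over a fixed interval. A minimal psql sketch, assuming it is run on the primary (on a hot standby you would use pg_last_wal_replay_lsn() instead of pg_current_wal_lsn()):

-- Record the current WAL position, wait a minute, then measure the difference
SELECT pg_current_wal_lsn() AS lsn_start \gset
SELECT pg_sleep(60);
SELECT pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), :'lsn_start'))
       AS wal_generated_in_60s;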

Example pg_waldump output showing FPI size = 87%:

pg_waldump --stats 0000000100000498000000B2
Type                                           N      (%)          Record size      (%)             FPI size      (%)        Combined size      (%)
----                                           -      ---          -----------      ---             --------      ---        -------------      ---
XLOG                                           1 (  0.00)                  114 (  0.01)                    0 (  0.00)                  114 (  0.00)
Transaction                                 3070 ( 10.35)               104380 (  4.86)                    0 (  0.00)               104380 (  0.63)
Storage                                        0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
CLOG                                           0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
Database                                       0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
Tablespace                                     0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
MultiXact                                      0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
RelMap                                         0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
Standby                                        2 (  0.01)                  100 (  0.00)                    0 (  0.00)                  100 (  0.00)
Heap2                                        590 (  1.99)                33863 (  1.58)                46192 (  0.32)                80055 (  0.48)
Heap                                        6679 ( 22.51)               578232 ( 26.92)              4482508 ( 30.92)              5060740 ( 30.41)
Btree                                      19330 ( 65.14)              1430918 ( 66.62)              9967524 ( 68.76)             11398442 ( 68.48)
Hash                                           0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
Gin                                            0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
Gist                                           0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
Sequence                                       0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
SPGist                                         0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
BRIN                                           0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
CommitTs                                       4 (  0.01)                  120 (  0.01)                    0 (  0.00)                  120 (  0.00)
ReplicationOrigin                              0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
Generic                                        0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
LogicalMessage                                 0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
                                        --------                      --------                      --------                      --------
Total                                      29676                       2147727 [12.90%]             14496224 [87.10%]             16643951 [100%]
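
To see whether that 87% figure holds across the whole archive rather than for a single segment, one rough approach is to loop pg_waldump over a batch of archived files and keep only the Total line; a shell sketch, with the archive path as a placeholder:

# Print the Total line (record size vs. FPI size) for each archived segment
for f in /path/to/wal_archive/0000000100000498*; do
    printf '%s: ' "$(basename "$f")"
    pg_waldump --stats "$f" | tail -n 1
done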

After enabling log_checkpoints=on:

2022-06-15 07:08:57 UTC [11] LOG:  checkpoint starting: time
2022-06-15 07:29:57 UTC [11] LOG:  checkpoint complete: wrote 67010 buffers (14.6%); 0 WAL file(s) added, 12 removed, 56 recycled; write=1259.767 s, sync=0.010 s, total=1259.961 s; sync files=253, longest=0.003 s, average=0.001 s; distance=1125728 kB, estimate=2176006 kB
2022-06-15 07:38:57 UTC [11] LOG:  checkpoint starting: time
2022-06-15 07:59:57 UTC [11] LOG:  checkpoint complete: wrote 61886 buffers (13.5%); 0 WAL file(s) added, 20 removed, 10 recycled; write=1259.740 s, sync=0.005 s, total=1259.878 s; sync files=185, longest=0.002 s, average=0.001 s; distance=491822 kB, estimate=2007588 kB
    Many data changes lead to much WAL, that's life. You can increase max_wal_size and checkpoint_timeout to reduce the number of checkpoints and full page images in the WAL, which will reduce the amount of WAL somewhat at the price of longer crash recovery. Commented Jun 10, 2022 at 1:13
  • @LaurenzAlbe checkpoint_timeout is not set. Based on the number of WAL files, I think none of them are empty, and none of them were generated because a checkpoint was reached. By the way, I ended up at cybertec-postgresql.com/en/… and enabled wal_compression=on. I am already using tar to keep them compressed; need to see the difference. Thank you! Commented Jun 10, 2022 at 5:27
  • A checkpoint does not cause a WAL switch. The intention of my suggestion is to get fewer full 8kB page images in the WAL. The first time a page is dirtied after a checkpoint, the whole page is written to WAL. Commented Jun 10, 2022 at 5:33
  • @LaurenzAlbe Got it. Is there any rule of thumb for setting a decent value for checkpoint_timeout? pg_waldump shows that around 70-90% of the data is FPI. Commented Jun 10, 2022 at 6:01

3 Answers


wal_compression=on

This may be counter-productive. This type of compression has to compress each WAL record in isolation, without the larger context, so it is not very effective. However, when you then recompress whole WAL files offline, where the compressor does have access to the larger context, the first round of compression interferes with the better-situated second attempt.

For example, if I take the WAL from 1,000,000 pgbench transactions, they occupy 889192448 raw bytes without wal_compression, and 637534208 with it.

But then after passing them through 'xz' (a very slow but very thorough compressor), the first set takes 129393020 bytes but the 2nd one takes 155769400. So turning on compression too soon cost me 20% more space.
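
If you want to repeat this comparison on your own archive, the measurement is just "total raw size vs. total size after xz"; a rough sketch, assuming GNU du and placeholder paths:

# Total size of the raw segments
du -cb /path/to/wal_archive/0000000100000498* | tail -n 1

# Compress copies with xz (-k keeps the originals) and total the results
xz -k -9 /path/to/wal_archive/0000000100000498*
du -cb /path/to/wal_archive/0000000100000498*.xz | tail -n 1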

You could use pg_waldump --stats ... on some WAL files to see what is actually in them. If it is mostly FPI, then you could try to make the checkpoints further apart to reduce the FPI frequency. But if you don't have much FPI to start with, that would be ineffective. If you can isolate what is causing so much WAL, maybe you can do something about it. For example, if you do a lot of degenerate updates where a column is set to the same value it already had, adding a WHERE clause to suppress those cases could spare you a lot of WAL generation.
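
To illustrate that last point, suppressing a degenerate update only needs an extra predicate; a hypothetical example (table and column names are made up):

-- Degenerate form: the row is rewritten and WAL-logged even when nothing changes
UPDATE accounts SET status = 'active' WHERE id = 42;

-- Suppressed form: the row is skipped entirely if it already has the value
UPDATE accounts SET status = 'active'
WHERE id = 42
  AND status IS DISTINCT FROM 'active';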


1 Comment

Thanks for pointing to pg_waldump, nice tool. As per pg_waldump, the FPI size in each WAL file is around 70%-90%. Does this mean checkpoints should be further apart, and that WAL files are being generated unnecessarily before enough data has changed in the DB?

The WAL files being generated are a reflection of your primary machine's activity. Increasing checkpoint_timeout will help reduce your overall machine activity, making it easier to process the WAL logs.

Standby archiving processes the logs as sent by the primary; they are binary identical. Is it a cold standby, or are you processing logs on the standby as they are sent?
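
If you want to verify the "binary identical" claim for yourself and both nodes happen to archive, comparing checksums of the same segment name from each archive is enough; a sketch with placeholder paths:

# The same segment archived on the primary and on the standby should hash identically
sha256sum /primary_archive/0000000100000498000000B2 \
          /standby_archive/0000000100000498000000B2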

2 Comments

It's a hot standby. As soon as any change appears on the primary, it is available on the standby as well. So are the archived logs I am getting newly created by the standby, or the same ones shipped by the primary?
They are the same ones shipped by the primary.

Since a high percentage of your WAL consists of full page images, you can reduce the amount of WAL considerably by having checkpoints less often. A full page image is written to WAL whenever a page becomes dirty for the first time after a checkpoint. The price you have to pay is a longer crash recovery time.

To reduce the rate of checkpoints, change these parameters:

  • checkpoint_timeout (default 5 minutes): set it to something high like 1 hour

  • max_wal_size (default 1GB): set it higher than the amount of WAL that is written within one hour, to match the checkpoint_timeout setting

These settings have to be made on the primary server, where WAL is generated, not on the standby. Best practice is to use the same settings on both servers.
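
As a concrete sketch of that change on the primary (both parameters take effect on a reload, no restart needed; the values are examples):

-- Run on the primary; size max_wal_size to at least one hour's worth of WAL
ALTER SYSTEM SET checkpoint_timeout = '1h';
ALTER SYSTEM SET max_wal_size = '8GB';
SELECT pg_reload_conf();

Editing postgresql.conf and reloading has the same effect if you prefer not to use ALTER SYSTEM.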

6 Comments

I configured checkpoint_timeout=3600 and max_wal_size=4G and restarted the Docker container running PostgreSQL. I still see multiple WAL files generated every minute, 3-4 files of 16MB each. Isn't this abnormal? Also, I used the command pg_waldump --stats 0000000100000385000000EF and got FPI as 70-90%. Should I specify an LSN instead?
Sorry, my bad. The .conf had the parameter set to 4GB, and the terminal shows max_wal_size with setting 4096 and unit MB (i.e. 4GB).
I have updated the question with more details after the changes. Still 3-4 WAL files every minute.
You have to change it on the primary. Try with max_wal_size = 10GB to be on the safe side. Use log_checkpoints = on to see how often you get a checkpoint. The number of full page images should decrease over time.
I have increased the values to max_wal_size = 8GB and checkpoint_timeout = 1800. I am still seeing multiple WAL files per minute, and an example WAL file shows FPI around 80%: I used the command pg_waldump --stats 00000001000003BE000000CB and got FPI size=14070852 [84.65%].
