Postgres and Netdata: applied to autovacuum and pg_stat_bgwriter
Some time ago, I discovered Netdata. This tool collects many metrics (CPU, network, etc.).
Netdata's strength is that it is quite lightweight and easy to use. It collects data every second and keeps everything in memory, so there is essentially no history. The goal is to have access to a set of metrics in real time. 1
I added several charts for PostgreSQL, taking a lot of ideas from check_pgactivity.
Here is a presentation of some of these charts, with explanations. Some of them highlight PostgreSQL's behavior.
NB: The charts presented here may not correspond to the final version. Several charts have since been split in order to distinguish the different operations (reads, writes) and processes (backend, checkpointer, bgwriter). 2
I would like to thank Guillaume Lelarge who reviewed this post :)
Autovacuum
This chart shows the activity of the processes called “autovacuum workers”, launched by the “autovacuum launcher”. Their role is to perform maintenance tasks in order to:
- Clean dead rows
- Update statistics (see Statistics, cardinality, selectivity)
- Freeze old rows to avoid wraparound (See Transaction Id Wraparound in Postgres)
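To relate the chart to what the server is actually doing, the workers can be observed directly. Here is a minimal sketch, assuming a reasonably recent PostgreSQL where pg_stat_activity exposes the query text: autovacuum workers report a query starting with "autovacuum:".

```sql
-- Autovacuum workers currently at work (their reported query starts with "autovacuum:").
SELECT pid,
       datname,
       now() - xact_start AS running_for,
       query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';
```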
Usually, there is no need to worry: the default configuration is enough in most situations. However, it can sometimes be necessary to fine-tune the autovacuum configuration.
For example, the activity may be very sustained and the number of workers launched may regularly reach autovacuum_max_workers. In such a case, it may be advisable:
- to increase the number of autovacuum workers (if there are many tables to process at one time)
- to make the processes more aggressive (when tables are large).
Indeed, their activity is throttled by the autovacuum_vacuum_cost_delay and autovacuum_vacuum_cost_limit parameters, so that this maintenance work stays in the background and has minimal impact on the backends serving queries.
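As an illustration only (the right values depend entirely on the workload, and the figures below are purely hypothetical), such tuning could be done with ALTER SYSTEM:

```sql
-- Hypothetical values: allow more concurrent workers and make each one more aggressive.
ALTER SYSTEM SET autovacuum_max_workers = 6;          -- needs a server restart to take effect
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = 2;    -- ms; a lower delay means more aggressive workers
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000; -- a higher limit means more work between pauses

-- The two cost parameters are picked up on reload:
SELECT pg_reload_conf();
```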
Bgwriter
This chart is called Bgwriter after the pg_stat_bgwriter view that provides these statistics.
The role of the “bgwriter” is to flush “dirty” blocks: blocks that have been modified in shared buffers but not yet written to the data files (don’t worry, they have already been written to disk in the WAL).
The “checkpointer” is also in charge of writing these blocks. Its writes are smoothed out in the background (this depends on the checkpoint_completion_target parameter). The bgwriter, by contrast, kicks in when backends need to free blocks in shared buffers, rather than at checkpoint time.
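The underlying counters make this split visible. Here is a minimal sketch assuming the classic layout of pg_stat_bgwriter (in very recent PostgreSQL versions some of these counters have moved to other views):

```sql
-- Who writes dirty buffers out of shared buffers?
SELECT buffers_checkpoint,  -- written by the checkpointer
       buffers_clean,       -- written by the bgwriter
       buffers_backend,     -- written directly by backends
       maxwritten_clean,    -- times the bgwriter stopped after writing too many buffers
       buffers_alloc        -- buffers allocated (the blue curve below)
FROM pg_stat_bgwriter;
```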
Going back to the chart, the blue curve corresponds to the number of blocks that had to be allocated in shared buffers. The chart has this shape because the screenshot was taken during a pgbench load test, just after the server had been started.
We can clearly see the effect of the cache gradually warming up. It is all the more visible because data is collected every second.
We can see this effect of loading the cache on several charts:
At the beginning of the chart, we can see very sustained read activity. The write peaks correspond to checkpoints.
On this graph, the interesting area is the pink one: at the beginning of the test, the CPUs were waiting for disk access.
Here we can see that Postgres had to read the blocks from disk. The curve decreases because subsequent reads are served from shared buffers.
We can also see that the number of transactions per second increases, then stabilizes once the data is loaded into shared buffers.
We observe the same phenomenon on this chart:
We can see that it took 1min30sec to load the cache.
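This warm-up can also be quantified from the statistics themselves. As a rough sketch, pg_stat_database gives a hit ratio per database (keeping in mind that blks_read may still be served from the operating system's page cache):

```sql
-- Approximate shared buffers hit ratio per database since the last statistics reset.
SELECT datname,
       blks_read,
       blks_hit,
       round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2) AS hit_ratio_pct
FROM pg_stat_database
WHERE datname IS NOT NULL
ORDER BY datname;
```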
1. See: Netdata Performance ↩︎
2. The Pull Request has been merged: Add charts for postgres ↩︎