Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/db/postgresql?at=release/6.4
PostgreSQL by Zabbix agent
Overview
This template is designed for the effortless deployment of PostgreSQL monitoring by Zabbix via Zabbix agent and doesn't require any external scripts.
Requirements
Zabbix version: 6.4 and higher.
Tested versions
This template has been tested on:
- PostgreSQL 10-15
Configuration
Zabbix should be configured according to instructions in the Templates out of the box section.
Setup
-
Install Zabbix agent and create a read-only
zbx_monitor
user with proper access to your PostgreSQL server.For PostgreSQL version 10 and above:
CREATE USER zbx_monitor WITH PASSWORD '<PASSWORD>' INHERIT; GRANT pg_monitor TO zbx_monitor;
For PostgreSQL version 9.6 and below:
CREATE USER zbx_monitor WITH PASSWORD '<PASSWORD>'; GRANT SELECT ON pg_stat_database TO zbx_monitor; -- To collect WAL metrics, the user must have a `superuser` role. ALTER USER zbx_monitor WITH SUPERUSER;
-
Copy
postgresql/
to Zabbix agent home directory/var/lib/zabbix/
. Thepostgresql/
directory contains the files needed to obtain metrics from PostgreSQL. -
Copy
template_db_postgresql.conf
to Zabbix agent configuration directory/etc/zabbix/zabbix_agentd.d/
and restart Zabbix agent service. -
Edit
pg_hba.conf
to allow connections from Zabbix agent https://www.postgresql.org/docs/current/auth-pg-hba-conf.html.Add rows (for example):
host all zbx_monitor 127.0.0.1/32 trust host all zbx_monitor 0.0.0.0/0 md5 host all zbx_monitor ::0/0 md5
-
Import template file to Zabbix and link it to the target host
-
Set {$PG.HOST}, {$PG.PORT}, {$PG.USER}, {$PG.PASSWORD} and {$PG.DB} macros values.
Macros used
Name | Description | Default |
---|---|---|
{$PG.CACHE_HITRATIO.MIN.WARN} | 90 |
|
{$PG.CHECKPOINTS_REQ.MAX.WARN} | 5 |
|
{$PG.CONFLICTS.MAX.WARN} | 0 |
|
{$PG.CONN_IDLE_IN_TRANS.MAX.WARN} | 5 |
|
{$PG.CONN_TOTAL_PCT.MAX.WARN} | 90 |
|
{$PG.CONN_WAIT.MAX.WARN} | 0 |
|
{$PG.DB} | postgres |
|
{$PG.DEADLOCKS.MAX.WARN} | 0 |
|
{$PG.FROZENXID_PCT_STOP.MIN.HIGH} | 75 |
|
{$PG.HOST} | 127.0.0.1 |
|
{$PG.LLD.FILTER.DBNAME} | (.*) |
|
{$PG.LOCKS.MAX.WARN} | 100 |
|
{$PG.PING_TIME.MAX.WARN} | 1s |
|
{$PG.PORT} | 5432 |
|
{$PG.QUERY_ETIME.MAX.WARN} | 30 |
|
{$PG.REPL_LAG.MAX.WARN} | 10m |
|
{$PG.SLOW_QUERIES.MAX.WARN} | 5 |
|
{$PG.TRANS_ACTIVE.MAX.WARN} | 30s |
|
{$PG.TRANS_IDLE.MAX.WARN} | 30s |
|
{$PG.TRANS_WAIT.MAX.WARN} | 30s |
|
{$PG.USER} | zbx_monitor |
|
{$PG.PASSWORD} | Please set user's password in this macro. |
Items
Name | Description | Type | Key and additional info |
---|---|---|---|
Bgwriter: Buffers allocated per second | Number of buffers allocated |
Dependent item | pgsql.bgwriter.buffers_alloc.rate Preprocessing
|
Bgwriter: Buffers written directly by a backend per second | Number of buffers written directly by a backend |
Dependent item | pgsql.bgwriter.buffers_backend.rate Preprocessing
|
Bgwriter: Buffers backend fsync per second | Number of times a backend had to execute its own fsync call (normally the background writer handles those even when the backend does its own write) |
Dependent item | pgsql.bgwriter.buffers_backend_fsync.rate Preprocessing
|
Bgwriter: Buffers written during checkpoints per second | Number of buffers written during checkpoints |
Dependent item | pgsql.bgwriter.buffers_checkpoint.rate Preprocessing
|
Bgwriter: Buffers written by the background writer per second | Number of buffers written by the background writer |
Dependent item | pgsql.bgwriter.buffers_clean.rate Preprocessing
|
Bgwriter: Requested checkpoints per second | Number of requested checkpoints that have been performed |
Dependent item | pgsql.bgwriter.checkpoints_req.rate Preprocessing
|
Bgwriter: Scheduled checkpoints per second | Number of scheduled checkpoints that have been performed |
Dependent item | pgsql.bgwriter.checkpoints_timed.rate Preprocessing
|
Bgwriter: Checkpoint sync time | Total amount of time that has been spent in the portion of checkpoint processing where files are synchronized to disk |
Dependent item | pgsql.bgwriter.checkpoint_sync_time Preprocessing
|
Bgwriter: Checkpoint write time | Total amount of time that has been spent in the portion of checkpoint processing where files are written to disk, in milliseconds |
Dependent item | pgsql.bgwriter.checkpoint_write_time Preprocessing
|
Bgwriter: Max written per second | Number of times the background writer stopped a cleaning scan because it had written too many buffers |
Dependent item | pgsql.bgwriter.maxwritten_clean.rate Preprocessing
|
PostgreSQL: Get bgwriter | Statistics about the background writer process's activity |
Zabbix agent | pgsql.bgwriter["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Status: Cache hit ratio % | Cache hit ratio |
Zabbix agent | pgsql.cache.hit["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Status: Config hash | PostgreSQL configuration hash |
Zabbix agent | pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] Preprocessing
|
Connections sum: Active | Total number of connections executing a query |
Dependent item | pgsql.connections.sum.active Preprocessing
|
Connections sum: Idle | Total number of connections waiting for a new client command |
Dependent item | pgsql.connections.sum.idle Preprocessing
|
Connections sum: Idle in transaction | Total number of connections in a transaction state, but not executing a query |
Dependent item | pgsql.connections.sum.idle_in_transaction Preprocessing
|
Connections sum: Prepared | Total number of prepared transactions https://www.postgresql.org/docs/current/sql-prepare-transaction.html |
Dependent item | pgsql.connections.sum.prepared Preprocessing
|
Connections sum: Total | Total number of connections |
Dependent item | pgsql.connections.sum.total Preprocessing
|
Connections sum: Total % | Total number of connections in percentage |
Dependent item | pgsql.connections.sum.total_pct Preprocessing
|
Connections sum: Waiting | Total number of waiting connections https://www.postgresql.org/docs/current/monitoring-stats.html#WAIT-EVENT-TABLE |
Dependent item | pgsql.connections.sum.waiting Preprocessing
|
PostgreSQL: Get connections sum | Collect all metrics from pg_stat_activity https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW |
Zabbix agent | pgsql.connections.sum["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
PostgreSQL: Get dbstat | Collect all metrics from pg_stat_database per database https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-DATABASE-VIEW |
Zabbix agent | pgsql.dbstat["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
PostgreSQL: Get locks | Collect all metrics from pg_locks per database https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-TABLES |
Zabbix agent | pgsql.locks["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Status: Ping time | Zabbix agent | pgsql.ping.time["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] Preprocessing
|
|
Status: Ping | Zabbix agent | pgsql.ping["{$PG.HOST}","{$PG.PORT}"] Preprocessing
|
|
PostgreSQL: Get queries | Collect all metrics by query execution time |
Zabbix agent | pgsql.queries["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}","{$PG.QUERY_ETIME.MAX.WARN}"] |
Replication: standby count | Number of standby servers |
Zabbix agent | pgsql.replication.count["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Replication: lag in seconds | Replication lag with Master in seconds |
Zabbix agent | pgsql.replication.lag.sec["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Replication: recovery role | Replication role: 1 — recovery is still in progress (standby mode), 0 — master mode. |
Zabbix agent | pgsql.replication.recovery_role["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Replication: status | Replication status: 0 — streaming is down, 1 — streaming is up, 2 — master mode |
Zabbix agent | pgsql.replication.status["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Transactions: Max active transaction time | Current max active transaction time |
Dependent item | pgsql.transactions.active Preprocessing
|
Transactions: Max idle transaction time | Current max idle transaction time |
Dependent item | pgsql.transactions.idle Preprocessing
|
Transactions: Max prepared transaction time | Current max prepared transaction time |
Dependent item | pgsql.transactions.prepared Preprocessing
|
Transactions: Max waiting transaction time | Current max waiting transaction time |
Dependent item | pgsql.transactions.waiting Preprocessing
|
PostgreSQL: Get transactions | Collect metrics by transaction execution time |
Zabbix agent | pgsql.transactions["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Status: Uptime | Zabbix agent | pgsql.uptime["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] | |
Status: Version | PostgreSQL version |
Zabbix agent | pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] Preprocessing
|
WAL: Segments count | Number of WAL segments |
Dependent item | pgsql.wal.count Preprocessing
|
PostgreSQL: Get WAL | Master item to collect WAL metrics |
Zabbix agent | pgsql.wal.stat["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
WAL: Bytes written | WAL write in bytes |
Dependent item | pgsql.wal.write Preprocessing
|
Triggers
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
PostgreSQL: Required checkpoints occurs too frequently | Checkpoints are points in the sequence of transactions at which it is guaranteed that the heap and index data files have been updated with all information written before that checkpoint. At checkpoint time, all dirty data pages are flushed to disk and a special checkpoint record is written to the log file.https://www.postgresql.org/docs/current/wal-configuration.html |
last(/PostgreSQL by Zabbix agent/pgsql.bgwriter.checkpoints_req.rate) > {$PG.CHECKPOINTS_REQ.MAX.WARN} |
Average | |
PostgreSQL: Failed to get items | Zabbix has not received any data for items for the last 30 minutes. |
nodata(/PostgreSQL by Zabbix agent/pgsql.bgwriter["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],30m) = 1 |
Warning | Depends on:
|
PostgreSQL: Cache hit ratio too low | max(/PostgreSQL by Zabbix agent/pgsql.cache.hit["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m) < {$PG.CACHE_HITRATIO.MIN.WARN} |
Warning | ||
PostgreSQL: Configuration has changed | last(/PostgreSQL by Zabbix agent/pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#1)<>last(/PostgreSQL by Zabbix agent/pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#2) and length(last(/PostgreSQL by Zabbix agent/pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]))>0 |
Info | ||
PostgreSQL: Total number of connections is too high | min(/PostgreSQL by Zabbix agent/pgsql.connections.sum.total_pct,5m) > {$PG.CONN_TOTAL_PCT.MAX.WARN} |
Average | ||
PostgreSQL: Response too long | min(/PostgreSQL by Zabbix agent/pgsql.ping.time["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m) > {$PG.PING_TIME.MAX.WARN} |
Average | Depends on:
|
|
PostgreSQL: Service is down | last(/PostgreSQL by Zabbix agent/pgsql.ping["{$PG.HOST}","{$PG.PORT}"]) = 0 |
High | ||
PostgreSQL: Streaming lag with {#MASTER} is too high | min(/PostgreSQL by Zabbix agent/pgsql.replication.lag.sec["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m) > {$PG.REPL_LAG.MAX.WARN} |
Average | ||
PostgreSQL: Replication is down | max(/PostgreSQL by Zabbix agent/pgsql.replication.status["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m)=0 |
Average | ||
PostgreSQL: Service has been restarted | PostgreSQL uptime is less than 10 minutes. |
last(/PostgreSQL by Zabbix agent/pgsql.uptime["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]) < 10m |
Info | |
PostgreSQL: Version has changed | last(/PostgreSQL by Zabbix agent/pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#1)<>last(/PostgreSQL by Zabbix agent/pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#2) and length(last(/PostgreSQL by Zabbix agent/pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]))>0 |
Info |
LLD rule Database discovery
Name | Description | Type | Key and additional info |
---|---|---|---|
Database discovery | Zabbix agent | pgsql.discovery.db["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
Item prototypes for Database discovery
Name | Description | Type | Key and additional info |
---|---|---|---|
DB {#DBNAME}: Database size | Database size |
Zabbix agent | pgsql.db.size["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}","{#DBNAME}"] |
DB {#DBNAME}: Blocks hit per second | Total number of times disk blocks were found already in the buffer cache, so that a read was not necessary |
Dependent item | pgsql.dbstat.blks_hit.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Disk blocks read per second | Total number of disk blocks read in this database |
Dependent item | pgsql.dbstat.blks_read.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Detected conflicts per second | Total number of queries canceled due to conflicts with recovery in this database |
Dependent item | pgsql.dbstat.conflicts.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Detected deadlocks per second | Total number of detected deadlocks in this database |
Dependent item | pgsql.dbstat.deadlocks.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Temp_bytes written per second | Total amount of data written to temporary files by queries in this database |
Dependent item | pgsql.dbstat.temp_bytes.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Temp_files created per second | Total number of temporary files created by queries in this database |
Dependent item | pgsql.dbstat.temp_files.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Tuples deleted per second | Total number of rows deleted by queries in this database |
Dependent item | pgsql.dbstat.tup_deleted.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Tuples fetched per second | Total number of rows fetched by queries in this database |
Dependent item | pgsql.dbstat.tup_fetched.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Tuples inserted per second | Total number of rows inserted by queries in this database |
Dependent item | pgsql.dbstat.tup_inserted.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Tuples returned per second | Total number of rows updated by queries in this database |
Dependent item | pgsql.dbstat.tup_returned.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Tuples updated per second | Total number of rows updated by queries in this database |
Dependent item | pgsql.dbstat.tup_updated.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Commits per second | Number of transactions in this database that have been committed |
Dependent item | pgsql.dbstat.xact_commit.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Rollbacks per second | Total number of transactions in this database that have been rolled back |
Dependent item | pgsql.dbstat.xact_rollback.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Frozen XID before avtovacuum % | Preventing Transaction ID Wraparound Failures https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND |
Dependent item | pgsql.frozenxid.prc_before_av["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Frozen XID before stop % | Preventing Transaction ID Wraparound Failures https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND |
Dependent item | pgsql.frozenxid.prc_before_stop["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Get frozen XID | Zabbix agent | pgsql.frozenxid["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"] | |
DB {#DBNAME}: Locks total | Total number of locks in the database |
Dependent item | pgsql.locks.total["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Queries slow maintenance count | Slow maintenance query count |
Dependent item | pgsql.queries.mro.slow_count["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Queries max maintenance time | Max maintenance query time |
Dependent item | pgsql.queries.mro.time_max["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Queries sum maintenance time | Sum maintenance query time |
Dependent item | pgsql.queries.mro.time_sum["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Queries slow query count | Slow query count |
Dependent item | pgsql.queries.query.slow_count["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Queries max query time | Max query time |
Dependent item | pgsql.queries.query.time_max["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Queries sum query time | Sum query time |
Dependent item | pgsql.queries.query.time_sum["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Queries slow transaction count | Slow transaction query count |
Dependent item | pgsql.queries.tx.slow_count["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Queries max transaction time | Max transaction query time |
Dependent item | pgsql.queries.tx.time_max["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Queries sum transaction time | Sum transaction query time |
Dependent item | pgsql.queries.tx.time_sum["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Index scans per second | Number of index scans in the database |
Dependent item | pgsql.scans.idx.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Sequential scans per second | Number of sequential scans in the database |
Dependent item | pgsql.scans.seq.rate["{#DBNAME}"] Preprocessing
|
DB {#DBNAME}: Get scans | Number of scans done for table/index in the database |
Zabbix agent | pgsql.scans["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"] |
Trigger prototypes for Database discovery
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
DB {#DBNAME}: Too many recovery conflicts | The primary and standby servers are in many ways loosely connected. Actions on the primary will have an effect on the standby. As a result, there is potential for negative interactions or conflicts between them.https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT |
min(/PostgreSQL by Zabbix agent/pgsql.dbstat.conflicts.rate["{#DBNAME}"],5m) > {$PG.CONFLICTS.MAX.WARN:"{#DBNAME}"} |
Average | |
DB {#DBNAME}: Deadlock occurred | min(/PostgreSQL by Zabbix agent/pgsql.dbstat.deadlocks.rate["{#DBNAME}"],5m) > {$PG.DEADLOCKS.MAX.WARN:"{#DBNAME}"} |
High | ||
DB {#DBNAME}: VACUUM FREEZE is required to prevent wraparound | Preventing Transaction ID Wraparound Failureshttps://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND |
last(/PostgreSQL by Zabbix agent/pgsql.frozenxid.prc_before_stop["{#DBNAME}"])<{$PG.FROZENXID_PCT_STOP.MIN.HIGH:"{#DBNAME}"} |
Average | |
DB {#DBNAME}: Number of locks is too high | min(/PostgreSQL by Zabbix agent/pgsql.locks.total["{#DBNAME}"],5m)>{$PG.LOCKS.MAX.WARN:"{#DBNAME}"} |
Warning | ||
DB {#DBNAME}: Too many slow queries | min(/PostgreSQL by Zabbix agent/pgsql.queries.query.slow_count["{#DBNAME}"],5m)>{$PG.SLOW_QUERIES.MAX.WARN:"{#DBNAME}"} |
Warning |
Feedback
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums