Hello!
DB size: 29 GB on disk
Required server performance, new values per second: 286.71
Number of items (enabled/disabled/not supported): 51521 / 75810 / 3488
Number of triggers (enabled/disabled [problem/ok]): 5090 / 7390 [23 / 5067]
Overall memory consumption:
# free -m
total used free shared buff/cache available
Mem: 11781 6954 161 2664 4665 1869
Swap: 0 0 0
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
24180b6e5435 zb-web-nginx 0.00% 506.9MiB / 1.266GiB 39.12% 2.34GB / 1.78GB 74.7MB / 24.6kB 38
8fc4483c721a zb-server 0.00% 750.2MiB / 1.841GiB 39.80% 0B / 0B 34.4MB / 56.8kB 96
1476713ee127 zb-postgres 0.00% 6.738GiB / 7.593GiB 88.74% 0B / 0B 7.59GB / 487GB 79
dfd6c5944431 zb-java-gw 0.00% 180.2MiB / 11.5GiB 1.53% 0B / 0B 125MB / 0B 25
c632fd9b5ef8 zb-snmptraps 0.00% 19.12MiB / 11.5GiB 0.16% 0B / 0B 33.1MB / 83.5kB 2
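Note that zb-postgres already sits at 88.74% of its 7.593 GiB limit. A quick sketch for confirming the limit from inside the container (the path assumes cgroup v1):
# cat /sys/fs/cgroup/memory/memory.limit_in_bytes   # cgroup v1 path; an assumption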
PostgreSQL version: PostgreSQL 13.1 (Debian 13.1-1.pgdg100+1), the official image from Docker Hub.
Zabbix version:
zabbix_server (Zabbix) 5.0.3
Revision 146855b 24 August 2020, compilation time: Sep 8 2020 17:49:26
While evaluating the DB's memory consumption I found (using netstat and ps) that the majority of the memory is consumed by the PostgreSQL backend processes bound to the LLD workers (I have the default StartLLDProcessors=2):
VIRT RES SHR CODE DATA %MEM COMMAND
4478.8m 2.4g 151.5m 5.4m 2328.0m 21.0 postgres: zabbix zabbix 10.139.2.14(53344) idle
4451.6m 2.4g 150.3m 5.4m 2300.8m 20.8 postgres: zabbix zabbix 10.139.2.14(53346) idle
2109.6m 1.9g 1.9g 5.4m 17.0m 16.9 postgres: zabbix zabbix 10.139.2.14(53292) idle
2111.4m 1.9g 1.9g 5.4m 18.8m 16.9 postgres: zabbix zabbix 10.139.2.14(53288) idle
2109.3m 1.9g 1.9g 5.4m 16.7m 16.9 postgres: zabbix zabbix 10.139.2.14(53280) idle
2105.4m 1.9g 1.9g 5.4m 12.8m 16.9 postgres: zabbix zabbix 10.139.2.14(53290) idle
2094.2m 1.9g 1.9g 5.4m 1.6m 16.8 postgres: checkpointer
2097.4m 97.0m 93.0m 5.4m 4.8m 0.8 postgres: zabbix zabbix 10.139.2.14(53276) idle
2096.4m 89.6m 86.0m 5.4m 3.8m 0.8 postgres: zabbix zabbix 10.139.2.14(53348) idle
2097.9m 88.2m 83.4m 5.4m 5.3m 0.7 postgres: zabbix zabbix 10.139.2.14(53240) idle
2094.0m 82.1m 80.4m 5.4m 1.3m 0.7 postgres
2094.2m 66.7m 65.0m 5.4m 1.5m 0.6 postgres: background writer
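Note that for a PostgreSQL backend RES includes every shared_buffers page the process has touched (which is why the checkpointer shows RES and SHR of about 1.9g, essentially mirroring shared_buffers = 1944MB), so for the two heavy backends it is the DATA column (~2.3 GB) that points at genuinely private memory. The private/shared split can be double-checked straight from /proc; a minimal sketch, assuming /proc/<pid>/smaps_rollup is available (kernel 4.14+), with 2320084 being one of the two LLD-bound backend PIDs identified via ss below:
# grep -E '^(Rss|Pss|Shared_|Private_)' /proc/2320084/smaps_rollup   # smaps_rollup needs kernel >= 4.14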
# ss -ntp | egrep '53344|53346'
ESTAB 0 0 10.139.2.14:53346 10.139.2.14:5432 users:(("zabbix_server",pid=2320050,fd=7))
ESTAB 0 0 10.139.2.14:5432 10.139.2.14:53346 users:(("postgres",pid=2320084,fd=9))
ESTAB 0 0 10.139.2.14:5432 10.139.2.14:53344 users:(("postgres",pid=2320083,fd=9))
ESTAB 0 0 10.139.2.14:53344 10.139.2.14:5432 users:(("zabbix_server",pid=2320049,fd=7))
# cat /proc/{2320050,2320049}/cmdline
/usr/sbin/zabbix_server: lld worker #2 [processed 1 LLD rules, idle 5.929359 sec during 5.944900 sec]
/usr/sbin/zabbix_server: lld worker #1 [processed 1 LLD rules, idle 5.349982 sec during 5.420240 sec]
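The same two backends can also be inspected from the PostgreSQL side; a minimal sketch (the psql invocation is an assumption, the backend PIDs are the ones reported by ss above):
# psql -U zabbix -d zabbix -c \
    "SELECT pid, state, backend_start, query
       FROM pg_stat_activity
      WHERE pid IN (2320083, 2320084);  -- backend PIDs from the ss output above"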
A number of installations deployed from the same template show the same issue: the memory used by the LLD connections does not look shared.
DB configuration:
max_connections = 200
shared_buffers = 1944MB
work_mem = 8MB
maintenance_work_mem = 505MB
effective_cache_size = 5870MB
idle_in_transaction_session_timeout = 60s
vacuum_cost_delay = 0
vacuum_cost_page_hit = 0
vacuum_cost_page_miss = 5
vacuum_cost_page_dirty = 5
vacuum_cost_limit = 200
autovacuum_max_workers = 4
autovacuum_naptime = 1s
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50
autovacuum_vacuum_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.05
autovacuum_vacuum_cost_delay = 5ms
autovacuum_vacuum_cost_limit = -1
bgwriter_lru_multiplier = 4
wal_level = logical
wal_buffers = 16MB
min_wal_size = 1GB
max_wal_size = 4GB
max_worker_processes = 4
max_parallel_workers_per_gather = 2
max_parallel_workers = 4
max_parallel_maintenance_workers = 2
listen_addresses = '*'
effective_io_concurrency = 200
default_statistics_target = 100
checkpoint_completion_target = 0.8
random_page_cost = 1.1
pg_partman_bgw.interval = 10800
shared_preload_libraries = 'pg_partman_bgw'
enable_partition_pruning = on
pg_partman_bgw.role = 'zabbix'
pg_partman_bgw.dbname = 'zabbix'
pg_partman_bgw.analyze = off
pg_partman_bgw.jobmon = on
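For reference, the memory-related values actually in effect inside the container can be cross-checked against this file; a quick sketch (same assumed psql invocation as above):
# psql -U zabbix -d zabbix -c \
    "SELECT name, setting, unit FROM pg_settings
      WHERE name IN ('shared_buffers', 'work_mem',
                     'maintenance_work_mem', 'max_connections');"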
At the moment I have no idea why PostgreSQL does not 'share' the memory of the LLD connections with the other connections. In practice this means that memory consumption grows by 0.5-1.5 GB per day until the OOM killer kicks in. I will stop one of the LLD workers just to see whether the leak rate changes.
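To quantify the leak rate while that experiment runs, the resident size of the two suspect backends can be sampled hourly; a minimal sketch (the PIDs are the ones identified above and will change if the workers reconnect):
# while sleep 3600; do date +%FT%T; \
    awk '/^VmRSS/ {print FILENAME, $2, $3}' \
        /proc/2320083/status /proc/2320084/status; done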
I appreciate any help.