Dear all,
We are running two Zabbix 7.0.4 servers in HA mode with 7 proxies in total (each with a local MariaDB) and a separate MariaDB (partitioned tables) on a different box. Everything runs on Red Hat 8 VMs, each with 32 CPUs and 96 GB of RAM.
On 2 of the 7 proxies we often see a high queue, with 300 - 800 items delayed for more than 10 minutes.
We have ~1.6k VPS in total, and the 2 proxies with the high queue have ~300 - 500 VPS each.
The proxy with the highest load (~600 VPS) does not have queue problems, which is a bit strange.
Number of hosts (enabled/disabled): 260 (232 / 28)
Number of templates: 433
Number of items (enabled/disabled/not supported): 176413 (171264 / 4789 / 360)
Number of triggers (enabled/disabled [problem/ok]): 61585 (59554 / 2031 [62 / 59492])
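As a rough sanity check on the figures above, here is a small Python sketch of the averages they imply. It only uses the numbers from this post (the ~1.6k VPS is rounded), and the "implied average interval" assumes every enabled item is polled at a fixed interval, which is a simplification.

```python
# Figures copied from the post; nothing here is measured separately.
total_vps = 1600          # ~1.6k values per second overall (rounded)
proxies = 7
enabled_hosts = 232
enabled_items = 171264

# Even split of the load across all proxies, for comparison only.
avg_vps_per_proxy = total_vps / proxies

# How many enabled items an average enabled host carries.
items_per_host = enabled_items / enabled_hosts

# Average update interval implied by item count and VPS
# (assumption: all enabled items are collected at a fixed interval).
avg_interval_s = enabled_items / total_vps

print(f"average VPS per proxy: {avg_vps_per_proxy:.0f}")
print(f"enabled items per enabled host: {items_per_host:.0f}")
print(f"implied average item interval: {avg_interval_s:.0f} s")
```

So the two affected proxies (~300 - 500 VPS each) sit above the per-proxy average, but still below the ~600 VPS proxy that has no queue.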
We mostly monitor network devices via SNMP, and the delayed items are primarily interface data from switches.
We have linked the proxy template to the proxies in the Zabbix frontend (and made sure each proxy is monitored by itself), but nothing unusual shows up and we do not see any high utilization.
Does anybody have an idea how to bring the queue down?
###########################################
Zabbix Server Config:
StartDiscoverers=15
StartHTTPPollers=5
StartTimers=2
StartAlerters=5
StartEscalators=2
StartPreprocessors=250
StartTrappers=50
StartPingers=50
StartPollers=50
ValueCacheSize=2G
TrendCacheSize=512M
HistoryIndexCacheSize=1024M
CacheSize=4G
VMwareCacheSize=128M
StartVMwareCollectors=25
Zabbix Proxies:
ProxyOfflineBuffer=24
ConfigFrequency=300
StartDBSyncers=8
StartPingers=5
StartPollers=80
StartPollersUnreachable=5
StartPreprocessors=10
StartSNMPTrapper=1
StartVMwareCollectors=10
Timeout=4
LogSlowQueries=3000
CacheSize=4G
HistoryCacheSize=1024M
HistoryIndexCacheSize=1024M
VMwareCacheSize=512M
ProxyBufferMode=hybrid
ProxyMemoryBufferSize=1G
I was thinking about using asynchronous pollers for the items discovered via LLD, because I have read that this may boost performance.
But this would mean adapting all the templates. Also, if I understood correctly, the item prototypes would then have the same interval as the master item.
BR