I'm seeing periodic spikes in my history syncer processes that cause slow queries and inserts, plus false alerts because data reaches the DB late. Even the Zabbix server itself gets reported as not reachable. What's also strange is that this seems to happen at the same time not only on the Zabbix server but on all my proxies. All monitored hosts are assigned to a proxy. It seems to occur every hour and causes the most trouble at night. Could an item or discovery rule be causing this? I'm at a loss.
All of these systems are on fast hardware with plenty of RAM and CPU and show no OS-level issues. The DB has 12 CPUs, 64 GB of RAM, and a fast storage array. Tests on the network interfaces and iostat on the DB show no issues.
My last occurrence, yesterday, was bad enough that the system was up to 12 hours behind in writing history to the DB, and I had to hard-kill the zabbix process (losing all the data in the cache). The only way I could get it to recover was to disable each proxy and bring them back up one at a time until each queue was empty. The system is running fine now, but I had another spike overnight, with a period of delayed queries and deadlocks in the Zabbix log.
Also strange: one of these proxies doesn't even have any hosts assigned to it; it's a standby.
Any ideas what to look for?
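One thing that might help confirm the "every hour" pattern is to bucket the slow-query lines in the server log by minute of the hour. The sketch below is only a rough idea: it assumes log lines look like `<pid>:<YYYYMMDD>:<HHMMSS.mmm> slow query: <seconds> sec, ...`, and the sample lines are made up in that assumed shape, not real output from this system.

```python
import re
from collections import Counter

# Assumed line shape (not verified against this installation):
#   <pid>:<YYYYMMDD>:<HHMMSS.mmm> slow query: <seconds> sec, "<sql>"
SLOW_QUERY = re.compile(
    r"^\s*\d+:\d{8}:\d{2}(\d{2})\d{2}\.\d{3} slow query: [\d.]+ sec"
)


def slow_queries_by_minute(lines):
    """Count slow-query log lines per minute of the hour (0..59)."""
    counts = Counter()
    for line in lines:
        match = SLOW_QUERY.match(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical sample lines, invented for illustration only.
sample_lines = [
    ' 12345:20240101:010003.123 slow query: 12.345678 sec, "..."',
    ' 12345:20240101:020004.456 slow query: 15.000000 sec, "..."',
    ' 12346:20240101:023015.789 slow query: 11.200000 sec, "..."',
    " 12347:20240101:023100.000 some unrelated log message",
]

counts = slow_queries_by_minute(sample_lines)
for minute, hits in sorted(counts.items()):
    print(f"minute {minute:02d}: {hits} slow queries")
```

To run it against the real log, the same function can be fed the open file, for example `slow_queries_by_minute(open("/var/log/zabbix/zabbix_server.log", errors="replace"))`. A pile-up in the first few minutes of each hour would point at something scheduled hourly rather than at a specific item.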
| Parameter | Value | Details |
| --- | --- | --- |
| Number of hosts (enabled/disabled/templates) | 913 | 822 / 2 / 89 |
| Number of items (enabled/disabled/not supported) | 355434 | 355036 / 196 / 202 |
| Number of triggers (enabled/disabled [problem/ok]) | 159955 | 159772 / 183 [158 / 159614] |
| Number of users (online) | 11 | 2 |
| Required server performance, new values per second | 1915.64 | |
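For a sense of scale, the figures in this table can be turned into a rough backlog estimate. This is back-of-the-envelope arithmetic only: it assumes the "required" rate is what actually arrives, and that every value was delayed for the full 12 hours, so the backlog number is an upper bound. The helper `catch_up_rate` and the 2-hour catch-up window are hypothetical, chosen just for illustration.

```python
# Figures copied from the system information table in this post.
enabled_items = 355_036
required_nvps = 1915.64  # "Required server performance, new values per second"
lag_hours = 12           # worst delay mentioned in this post

# Average update interval implied by the item count and the required rate.
avg_interval_s = enabled_items / required_nvps

# Upper bound on values waiting to be written after falling 12 hours behind.
backlog_values = required_nvps * lag_hours * 3600


def catch_up_rate(nvps, lag_h, catch_up_h):
    """Write rate needed to clear the backlog while new values keep arriving."""
    return nvps * (1 + lag_h / catch_up_h)


print(f"average update interval: {avg_interval_s:.0f} s")
print(f"backlog after {lag_hours} h: {backlog_values:,.0f} values")
print(f"rate to catch up in 2 h: {catch_up_rate(required_nvps, lag_hours, 2):,.0f} values/s")
```

The point of the last line is that recovering from a long delay needs a write rate several times the normal one, which may be part of why bringing the proxies back one at a time was the only thing that worked.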
I have those hosts all split across 4 proxies. My zabbix_server.conf:
LogFile=/var/log/zabbix/zabbix_server.log
LogFileSize=0
PidFile=/var/run/zabbix/zabbix_server.pid
SocketDir=/var/run/zabbix
StartPollers = 10
StartPollersUnreachable = 10
StartPingers = 10
StartDiscoverers = 10
SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
CacheSize = 2G
HistoryCacheSize = 256M
HistoryIndexCacheSize = 2G
TrendCacheSize = 1G
ValueCacheSize = 2G
Timeout=4
AlertScriptsPath=/usr/lib/zabbix/alertscripts
ExternalScripts=/usr/lib/zabbix/externalscripts
FpingLocation=/usr/bin/fping
Fping6Location=/usr/bin/fping6
LogSlowQueries=10000
StartProxyPollers=20
ProxyConfigFrequency = 90
ProxyDataFrequency = 1
StartDBSyncers = 4
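As a quick sanity check on the memory side, the cache settings above can be added up. The sketch below copies the relevant lines from this config into a string (the parsing helpers `parse_config` and `to_bytes` are hypothetical names, not part of Zabbix) and tolerates both the `Key=Value` and `Key = Value` spellings used in the file.

```python
# Cache-related lines copied from the config in this post.
CONFIG = """\
CacheSize = 2G
HistoryCacheSize = 256M
HistoryIndexCacheSize = 2G
TrendCacheSize = 1G
ValueCacheSize = 2G
StartDBSyncers = 4
"""

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_config(text):
    """Parse 'Key=Value' lines, ignoring blanks, comments and extra spaces."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def to_bytes(size):
    """Convert a size such as '256M' or '2G' to bytes."""
    suffix = size[-1].upper()
    if suffix in UNITS:
        return int(size[:-1]) * UNITS[suffix]
    return int(size)


settings = parse_config(CONFIG)
cache_keys = [key for key in settings if key.endswith("CacheSize")]
total_bytes = sum(to_bytes(settings[key]) for key in cache_keys)

print(f"{len(cache_keys)} cache settings, {total_bytes / 1024**3:.2f} GiB requested in total")
```

This only totals what the config asks for; it says nothing about how full each cache actually gets during a spike, which the internal cache items would show.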