Hi everyone!
I am rolling out Zabbix for a large environment at my company's request, and I am stuck on a high Zabbix queue on the server.
For testing purposes I installed the server and MariaDB on the same VM; later on I plan to separate them and use an HA database. Currently, all hosts are collected by a single proxy (also a VM) with the same specs as the server.
The server specifications I am using are:
OS: CentOS 7 (64-bit)
CPUs: 8
RAM: 8GB
Disk Provisioned Size: 200 GB SSD
The Zabbix proxy collects data every 5 minutes, so at those peaks there are ~600 NVPS. However, the Zabbix server queue reaches around 50k values, and I cannot work out what the problem could be.
I have another Zabbix server that collects 60% of those items without a proxy, and it works fine without any queue. I also checked NTP sync; there is no problem on that side either.
What is the highest NVPS you have collected through a single proxy in practice?
I hope for some help or pointers from this community.
Below I have pasted my zabbix_xx.conf files and MariaDB settings.
Thanks in advance!
| Zabbix server is running | Yes | xxxx |
| Number of hosts (enabled/disabled/templates) | 337 | 256 / 0 / 81 |
| Number of items (enabled/disabled/not supported) | 100729 | 98815 / 1 / 1913 |
| Number of triggers (enabled/disabled [problem/ok]) | 1893 | 1893 / 0 [19 / 1874] |
| Number of users (online) | 2 | 1 |
| Required server performance, new values per second | 80.94 | |
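As a rough sanity check on the figures above, here is a minimal Python sketch of the arithmetic. All inputs are copied from this post and the status table; the 300 s comparison case and the variable names are assumptions for illustration, and nothing here is measured on the actual system.

```python
# Figures copied from the post and the status table; nothing here is measured.
enabled_items = 98815     # enabled items from the status table
required_nvps = 80.94     # "Required server performance, new values per second"
queued_values = 50_000    # approximate server queue size from the post
peak_nvps = 600           # approximate burst rate at the 5-minute peaks

# Average update interval implied by the dashboard figure, assuming every
# enabled item contributes to the required NVPS.
implied_interval_s = enabled_items / required_nvps

# Hypothetical comparison: NVPS if every enabled item used a 300 s interval.
nvps_if_all_300s = enabled_items / 300

# Share of enabled items that would be waiting if each queued value belonged
# to a different item (an assumption about how the queue is counted).
queue_share = queued_values / enabled_items

# Time to drain the queue at the peak rate, assuming no new backlog arrives.
drain_time_s = queued_values / peak_nvps

print(f"implied average interval: {implied_interval_s:.0f} s")
print(f"NVPS if all items were on 300 s: {nvps_if_all_300s:.0f}")
print(f"queue as share of enabled items: {queue_share:.0%}")
print(f"drain time at peak rate: {drain_time_s:.0f} s")
```

If the numbers are right, the queue corresponds to roughly half of the enabled items, which looks like more than a short burst at the 5-minute mark would explain.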
| Server | Proxy | MySQL |
| LogFile=/var/log/zabbix/zabbix_server.log | Server=xxxx | [mysqld_safe] |
| LogFileSize=0 | Hostname=zabbix-proxy-1 | log-error=/var/log/mariadb/mariadb.log |
| PidFile=/var/run/zabbix/zabbix_server.pid | LogFile=/var/log/zabbix/zabbix_proxy.log | pid-file=/var/run/mariadb/mariadb.pid |
| SocketDir=/var/run/zabbix | LogFileSize=0 | [mysqld] |
| DBName=zabbix | PidFile=/var/run/zabbix/zabbix_proxy.pid | |
| DBUser=zabbix | SocketDir=/var/run/zabbix | |
| DBPassword=pass | DBName=zabbix_proxy | |
| StartPollers=32 | DBUser=zabbix | large-pages |
| StartPollersUnreachable=32 | DBPassword=xxxx | binlog-row-event-max-size = 8192 |
| SNMPTrapperFile=/var/log/snmptrap/snmptrap.log | StartPollers=500 | binlog-format = MIXED |
| HousekeepingFrequency=1 | StartPollersUnreachable=400 | character_set_server = utf8 |
| MaxHousekeeperDelete=500000 | SNMPTrapperFile=/var/log/snmptrap/snmptrap.log | collation_server = utf8_bin |
| CacheSize=2G | HousekeepingFrequency=1 | expire_logs_days = 1 |
| StartDBSyncers=30 | CacheSize=1G | join_buffer_size = 262144 |
| HistoryCacheSize=2G | StartDBSyncers=16 | max_allowed_packet = 32M |
| HistoryIndexCacheSize=2G | HistoryCacheSize=1G | max_connect_errors = 10000 |
| TrendCacheSize=2G | Timeout=30 | max_connections = 1500 |
| ValueCacheSize=2G | ExternalScripts=/usr/lib/zabbix/externalscripts | max_heap_table_size = 134217728 |
| Timeout=25 | LogSlowQueries=3000 | query_cache_size = 256M |
| AlertScriptsPath=/usr/lib/zabbix/alertscripts | | table_open_cache = 2048 |
| ExternalScripts=/usr/lib/zabbix/externalscripts | | thread_cache_size = 64 |
| LogSlowQueries=3000 | | wait_timeout = 86400 |
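To put the settings above next to the 8GB of RAM on each VM, here is a small Python sketch that totals the configured cache sizes and the explicitly configured process counts. It is only arithmetic on the pasted values. The `parse_size` helper is hypothetical, 8GB is treated as 8 GiB, and process types left at their defaults are not counted.

```python
GIB = 1024 ** 3
UNITS = {"K": 1024, "M": 1024 ** 2, "G": GIB}


def parse_size(value: str) -> int:
    """Convert a size such as '2G' or '256M' to bytes (hypothetical helper)."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in UNITS:
        return int(value[:-1]) * UNITS[suffix]
    return int(value)


# Cache settings copied from the Server and Proxy columns of the table.
server_caches = {
    "CacheSize": "2G",
    "HistoryCacheSize": "2G",
    "HistoryIndexCacheSize": "2G",
    "TrendCacheSize": "2G",
    "ValueCacheSize": "2G",
}
proxy_caches = {"CacheSize": "1G", "HistoryCacheSize": "1G"}

# Only the Start* values that appear in the table; defaults are not counted.
server_procs = {"StartPollers": 32, "StartPollersUnreachable": 32, "StartDBSyncers": 30}
proxy_procs = {"StartPollers": 500, "StartPollersUnreachable": 400, "StartDBSyncers": 16}

ram_bytes = 8 * GIB  # assumption: "8GB" means 8 GiB per VM

server_total = sum(parse_size(v) for v in server_caches.values())
proxy_total = sum(parse_size(v) for v in proxy_caches.values())
server_proc_total = sum(server_procs.values())
proxy_proc_total = sum(proxy_procs.values())

print(f"server caches requested: {server_total / GIB:.0f} GiB of {ram_bytes / GIB:.0f} GiB RAM")
print(f"proxy caches requested:  {proxy_total / GIB:.0f} GiB of {ram_bytes / GIB:.0f} GiB RAM")
print(f"configured processes: server {server_proc_total}, proxy {proxy_proc_total}")
```

On these figures the server's caches alone request more memory than the VM has, and that VM also hosts MariaDB, so the cache sizes and the proxy's poller counts may be worth revisiting before looking elsewhere.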