I am in the process of merging two separate Zabbix environments into a new, larger system. The server currently resides in AWS, has approximately 1,700 hosts, runs at about 1,900 NVPS, and is experiencing no performance problems whatsoever. There are 20 proxy servers, and they are generally not experiencing any performance issues either, except for one that is currently monitoring about 480 hosts. The proxy servers all run the following specifications:
CentOS 7
Zabbix v3.4.10
4 CPU (a couple have 8)
16GB memory (a couple have 24GB)
All of the proxy servers have a loopback network interface with the same IP address, which gets routed to the server, so that regardless of where you are in our network the agent config files can all use the same Server and ServerActive IP address. This works really well, by the way.
They all have the following configuration:
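In case anyone wants to replicate the shared-address trick, here's a minimal sketch. The address 10.100.0.170 is a placeholder for illustration only, not the real (redacted) one:

```shell
# Sketch of the shared-VIP loopback setup; 10.100.0.170 is a
# placeholder, not the real (redacted) address.
# On each proxy, as root, bind the shared address to a loopback alias:
#   ip addr add 10.100.0.170/32 dev lo label lo:zbx
# Every agent in the network can then carry the identical stanza:
cat <<'EOF'
Server=10.100.0.170
ServerActive=10.100.0.170
EOF
```

The payoff is that agent configs are identical everywhere; only the network routing decides which proxy actually receives the traffic.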
CacheSize=512M
ConfigFrequency=300
DBName=zabbix
DBPassword=zabbixaws
DBSocket=/var/lib/mysql/mysql.sock
DBUser=zabbix
DataSenderFrequency=5
DebugLevel=3
EnableRemoteCommands=1
ExternalScripts=/usr/lib/zabbix/externalscripts
HeartbeatFrequency=60
Hostname=--------zbprx01
HostnameItem=--------zbprx01
HousekeepingFrequency=1
ListenIP=xx.xxx.x.170,yy.yy.yyy.50
LogFile=/var/log/zabbix/zabbix_proxy.log
LogFileSize=512
LogSlowQueries=3000
PidFile=/var/run/zabbix/zabbix_proxy.pid
ProxyMode=0
ProxyOfflineBuffer=24
SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
Server=xx.xxx.x.xxx
SocketDir=/var/run/zabbix
SourceIP=xx.xxx.x.170
StartDBSyncers=4
StartDiscoverers=10
StartHTTPPollers=10
StartIPMIPollers=5
StartPingers=30
StartPollers=40
StartPollersUnreachable=30
StartTrappers=5
StartVMwareCollectors=1
Timeout=30
UnavailableDelay=30
UnreachableDelay=30
VMwareCacheSize=256M
xx.xxx.x.170 is the IP address of the loopback interface and yy.yy.yyy.50 is the actual IP address of the proxy server. I've also omitted the proxy's hostname for privacy's sake; that's what --------zbprx01 is.
The proxy server that I am experiencing issues with has StartPollers set to 50.
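As a back-of-envelope check (the 80% and 60% figures below are assumptions for illustration, not measurements): 50 pollers running at ~80% busy represent roughly 40 poller-equivalents of steady demand, so landing nearer a comfortable ~60% busy would take about demand / 0.60 pollers:

```shell
# Rough poller sizing; the 80%/60% busy figures are assumptions.
demand=40        # 50 pollers * 0.80 average busy
target_busy=60   # desired busy percentage
# ceiling division: pollers needed so that demand/pollers <= 60%
echo $(( (demand * 100 + target_busy - 1) / target_busy ))   # prints 67
```

This only holds if the CPU can actually drive that many pollers; past a point, adding pollers on a 4-CPU box just adds context switching.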
I can't seem to get the Zabbix poller busy percentage below about 80% on average. If I were done merging systems I wouldn't be too concerned, but I still have about 3,000 hosts from the two older systems to migrate, so I'm a bit worried about performance going forward and would really like to know how to scale this up further.
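For a quick on-box look at saturation, here's a sketch that relies on Zabbix workers rewriting their process titles to show their current state (a poller in "getting values" is mid-collection; "idle N sec" is waiting):

```shell
# Count busy vs. total poller processes by sampling their titles.
# Titles look like: "zabbix_proxy: poller #3 [got 2 values in 0.000123 sec, getting values]"
sample=$(ps -C zabbix_proxy -o args= 2>/dev/null)
busy=$(printf '%s\n' "$sample" | grep -c ': poller #.*getting values')
total=$(printf '%s\n' "$sample" | grep -c ': poller #')
echo "busy pollers: ${busy}/${total}"
```

The `: poller #` pattern deliberately excludes the http and unreachable poller workers, which have their own titles. Sampling this a few times during a busy period gives a sanity check against the internal `zabbix[process,poller,avg,busy]` item.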