Hi,
I have started working with Zabbix as an alternative to the commercial monitoring tools out there. I have installed the Server, Proxy, DB & Portal on separate VMs.
The server has four proxies connected to it, but the queue seems to be growing steadily. It seems as if the server can't keep up with the items sent to it. I need some advice on the following:
1) How does data flow from the monitored host to where it ultimately ends up in the DB? Understanding how Zabbix works would help me troubleshoot it.
2) Are my configs for the server, proxy and database correct? If not, what should I change so that the queue shrinks and the server is "faster"?
3) Some of my graphs, especially those with calculated items, have gaps or blanks in them. Why?
4) Triggers take a long time to send alerts via SMS/email. For example, if I stop an agent, I only get an alert via email/SMS after 20-30 minutes, and sometimes it takes hours.
Config:
VMware Server:
Dual Xeon CPU, 32GB memory, 3 x 1TB SATA drives in RAID5 config. This is the main server, which runs VMware ESXi 5.1.
Individual VMs:
- MySQL Database:
  - 16GB Memory
  - 500GB HD
  - 2 x vCPU
  - Database is not partitioned
  - O/S: CentOS 6.5 64-bit
  - MySQL 5.1 64-bit
- Zabbix Server:
  - 4GB Memory
  - 2 x vCPU
  - 60GB HD
  - O/S: CentOS 6.5 64-bit
- Zabbix Proxy:
  - 2GB Memory
  - 2 x vCPU
  - 40GB HD
  - O/S: CentOS 6.5 64-bit
  - MySQL 5.1 64-bit
- Zabbix Portal:
  - 2GB Memory
  - 1 x vCPU
  - 40GB HD
  - Apache 2.2
  - PHP 5.3 with php-xcache
MySQL Server my.cnf:
Code:
[mysqld]
bind-address=192.168.11.3
datadir=/app/mysql #/var/lib/mysql
socket=/app/mysql/mysql.sock #/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
tmpdir=/tmp

# Custom Settings
log_queries_not_using_indexes=1

# GENERAL #
default-storage-engine = InnoDB

# MyISAM #
key-buffer-size = 20M
myisam-recover = FORCE,BACKUP

# SAFETY #
max-allowed-packet = 64M
max-connect-errors = 1000000
innodb = FORCE

# BINARY LOGGING #
log-bin = /app/mysql/mysql-bin
expire-logs-days = 14
sync-binlog = 1

# CACHES AND LIMITS #
tmp-table-size = 128M #32M
max-heap-table-size = 128M #32M
query-cache-type = 1
query-cache-size = 128M #16M
query-cache-limit = 128M
max-connections = 500
thread-cache-size = 300
open-files-limit = 65535
table-definition-cache = 4096
table-open-cache = 4096
table-cache = 512 #new
join-buffer-size = 4M #2
read-buffer-size = 512k #new
read-rnd-buffer-size = 512k #new

# INNODB #
innodb-flush-method = O_DIRECT
innodb-log-files-in-group = 2
innodb-log-file-size = 256M
innodb-flush-log-at-trx-commit = 2 #1
innodb-file-per-table = 1
innodb-buffer-pool-size = 12G
innodb-log-buffer-size = 4M
innodb-thread-concurrency = 0 #16

# LOGGING #
log-error = /app/mysql/mysql-error.log
log-queries-not-using-indexes = 1
long_query_time = 1
slow-query-log = 1
Zabbix Server Config:
Code:
DBSocket=/var/lib/mysql/mysql.sock
StartPollers=80
StartIPMIPollers=5
StartPollersUnreachable=5
StartTrappers=40
StartPingers=5
StartDiscoverers=5
StartHTTPPollers=10
StartTimers=15
StartVMwareCollectors=5
VMwareFrequency=60
VMwareCacheSize=8M
HousekeepingFrequency=1
MaxHousekeeperDelete=500
SenderFrequency=15
CacheSize=256M
CacheUpdateFrequency=60
StartDBSyncers=16
HistoryCacheSize=128M
TrendCacheSize=64M
HistoryTextCacheSize=64M
ValueCacheSize=64M
Timeout=5
TrapperTimeout=120
UnreachablePeriod=45
UnavailableDelay=60
UnreachableDelay=15
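To put the server config above in perspective against the 2 vCPU / 4GB server VM, here is a minimal Python sketch that simply adds up the `Start*` process counts and the cache sizes. The values are copied from the config; the dictionary names are only illustrative, and the totals ignore the fixed processes Zabbix always starts as well as per-process memory, so treat them as a rough lower bound rather than an exact footprint.

```python
# Values copied from the Zabbix server config above.
start_processes = {
    "StartPollers": 80,
    "StartIPMIPollers": 5,
    "StartPollersUnreachable": 5,
    "StartTrappers": 40,
    "StartPingers": 5,
    "StartDiscoverers": 5,
    "StartHTTPPollers": 10,
    "StartTimers": 15,
    "StartVMwareCollectors": 5,
    "StartDBSyncers": 16,
}

# Cache sizes in megabytes, also copied from the config above.
cache_sizes_mb = {
    "CacheSize": 256,
    "HistoryCacheSize": 128,
    "TrendCacheSize": 64,
    "HistoryTextCacheSize": 64,
    "ValueCacheSize": 64,
    "VMwareCacheSize": 8,
}

# Simple sums; no attempt is made to model per-process overhead.
total_processes = sum(start_processes.values())
total_cache_mb = sum(cache_sizes_mb.values())

print(f"configured worker processes: {total_processes}")
print(f"configured cache memory: {total_cache_mb} MB")
```

That comes to 186 configured worker processes and 584 MB of configured caches.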
Zabbix Proxy Config:
Code:
ProxyMode=0
DBSocket=/app/mysql/mysql.sock
HeartbeatFrequency=60
ConfigFrequency=300
DataSenderFrequency=5
StartPollers=30
StartPollersUnreachable=5
StartTrappers=15
StartPingers=10
StartDiscoverers=5
StartHTTPPollers=5
HousekeepingFrequency=1
CacheSize=16M
StartDBSyncers=8
HistoryCacheSize=32M
HistoryTextCacheSize=16M
Timeout=15
TrapperTimeout=120
UnreachablePeriod=45
UnavailableDelay=60
UnreachableDelay=15
Zabbix Proxy my.cnf:
Code:
[mysqld]
# General #
datadir = /app/mysql
socket = /app/mysql/mysql.sock
user = mysql
symbolic-links = 0
#default-storage-engine=InnoDB
interactive_timeout = 12000
#wait_timeout=300

# Custom MYSQL settings for Zabbix
query_cache_size = 32M
query_cache_type = 1
query_cache_limit = 32M
thread_cache_size = 128
table_cache = 512
max_connections = 500
wait_timeout = 600
key_buffer_size = 10M
innodb_buffer_pool_size = 16M
slow-query-log = 1
slow-query-log-file = /app/mysql/mysql-slow.log
join_buffer_size = 512K
table_cache = 128
long_query_time = 2
innodb-flush-method = O_DIRECT
innodb-log-files-in-group = 2
innodb-log-file-size = 128M
innodb-log-buffer-size = 4M
innodb-flush-log-at-trx-commit = 1
innodb-file-per-table = 1
innodb-buffer-pool-size = 512M
innodb-thread-concurrency = 8

[mysqld_safe]
log-error = /var/log/mysqld.log
pid-file = /app/mysql/mysqld.pid
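Since the proxy my.cnf above mixes dashed and underscored option names, a small Python sketch like the following can help spot options that are set more than once. The option lines are copied from the `[mysqld]` section above; the function name `find_repeated_options` is only illustrative. As far as I understand MySQL option files, dashes and underscores in option names are interchangeable and the last occurrence of an option wins, so the sketch normalizes names and reports the last value as the effective one. That behaviour is an assumption worth checking against the running server with `SHOW VARIABLES`.

```python
from collections import defaultdict

# Option lines copied from the [mysqld] section of the proxy my.cnf above.
PROXY_MYSQLD = """
datadir = /app/mysql
socket = /app/mysql/mysql.sock
user = mysql
symbolic-links = 0
interactive_timeout = 12000
query_cache_size = 32M
query_cache_type = 1
query_cache_limit = 32M
thread_cache_size = 128
table_cache = 512
max_connections = 500
wait_timeout = 600
key_buffer_size = 10M
innodb_buffer_pool_size = 16M
slow-query-log = 1
slow-query-log-file = /app/mysql/mysql-slow.log
join_buffer_size = 512K
table_cache = 128
long_query_time = 2
innodb-flush-method = O_DIRECT
innodb-log-files-in-group = 2
innodb-log-file-size = 128M
innodb-log-buffer-size = 4M
innodb-flush-log-at-trx-commit = 1
innodb-file-per-table = 1
innodb-buffer-pool-size = 512M
innodb-thread-concurrency = 8
"""


def find_repeated_options(text):
    """Return {normalized_name: [values in file order]} for repeated options."""
    seen = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a key = value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        # Treat innodb-buffer-pool-size and innodb_buffer_pool_size as one name.
        seen[name.replace("-", "_")].append(value)
    return {name: values for name, values in seen.items() if len(values) > 1}


repeated = find_repeated_options(PROXY_MYSQLD)
for name, values in repeated.items():
    print(f"{name}: set to {values}, last value is {values[-1]}")
```

For this file it flags `table_cache` (512, then 128) and `innodb_buffer_pool_size` (16M, then 512M).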
Some general info:
- The Proxy server is a remote proxy
- The Proxy server monitors SNMP devices, agents and Web Response Time for a couple of servers
- The Server has a total of 102 hosts, 2353 items and 740 triggers, and an NVPS (new values per second) of 29.48
See attached screenshots for the cache, queue, busy process, etc.
I would really appreciate any input from the gurus and developers of Zabbix as to where I'm going wrong here.
We intend to add 3000+ hosts in the near future, but if the system already seems under pressure from 102 hosts then I need to re-evaluate. In all fairness, it might all be down to my misunderstanding of the settings in the config files.
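As a rough back-of-the-envelope check of what these numbers imply, here is a short Python sketch. It assumes every item is enabled and delivers one value per update interval, and that the planned hosts would carry the same item mix as the current 102. Both are assumptions, and 3000 is only the lower bound of the "3000+" figure, so the result is an order-of-magnitude estimate and not a sizing recommendation.

```python
# Figures taken from the post above.
hosts = 102
items = 2353
nvps = 29.48           # new values per second reported by the server
planned_hosts = 3000   # lower bound of the planned "3000+" hosts

# Average update interval implied by the current item count and NVPS.
avg_interval_s = items / nvps

# Linear scaling: assumes new hosts look like the existing ones.
items_per_host = items / hosts
projected_items = items_per_host * planned_hosts
projected_nvps = nvps * planned_hosts / hosts

print(f"average update interval: {avg_interval_s:.1f} s")
print(f"items per host: {items_per_host:.1f}")
print(f"projected items at {planned_hosts} hosts: {projected_items:.0f}")
print(f"projected NVPS at {planned_hosts} hosts: {projected_nvps:.0f}")
```

Under those assumptions the current load works out to roughly one value per item every 80 seconds, and 3000 hosts would mean on the order of 69,000 items and about 870 new values per second.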
Your help will be appreciated.
Regards,
Dawid




