I have a fairly new installation that we're using as a POC for Zabbix. I typically didn't see more than 95 nvps on the master server and ~120 nvps on a proxy server.
Suddenly, yesterday morning, most of the Zabbix processes were at or near 100%. I don't quite know what else to do to bring them down. I tried increasing the pollers, trappers, etc. (they were higher than what I have listed below). I also tried tweaking MySQL. I even removed the proxy and disabled monitoring of all hosts except the Zabbix server, without any change.
The MySQL database is housed on another VM that currently only hosts MySQL for Zabbix.
Any troubleshooting tips? I have been seeing this in the zabbix_server.log file occasionally:
The relevant zabbix_server.conf settings and the possibly relevant my.conf settings follow the log excerpt below.
Code:
[Z3005] query failed: [2013] Lost connection to MySQL server during query [begin;]
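To see how often this happens, the log could be tallied with something like the sketch below. It is only a rough idea, assuming every failure ends with the same "[Z3005] query failed: [code] message [query]" text as the entry above; the function name count_query_failures is made up for illustration and is not part of Zabbix.

```python
import re
from collections import Counter

# Matches the tail of a log line such as:
# [Z3005] query failed: [2013] Lost connection to MySQL server during query [begin;]
QUERY_FAILED = re.compile(r"\[Z3005\] query failed: \[(\d+)\] (.*?) \[(.*)\]\s*$")


def count_query_failures(lines):
    """Count Z3005 entries, grouped by (MySQL error code, error message)."""
    counts = Counter()
    for line in lines:
        match = QUERY_FAILED.search(line)
        if match:
            code, message, _query = match.groups()
            counts[(int(code), message)] += 1
    return counts


# Sample input: the entry from the post plus one unrelated line.
sample = [
    "[Z3005] query failed: [2013] Lost connection to MySQL server during query [begin;]",
    "some unrelated log line",
]
failures = count_query_failures(sample)
for (code, message), count in failures.items():
    print(code, message, count)
```

In practice the lines would come from reading zabbix_server.log instead of the sample list.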
Code:
# Number of pre-forked instances of pollers. (DEFAULT=5)
StartPollers=5

# Number of pre-forked instances of pollers for unreachable hosts (including IPMI). (DEFAULT=1)
StartPollersUnreachable=2

# Number of pre-forked instances of ICMP pingers. (DEFAULT=1)
StartPingers=2

# Number of pre-forked instances of discoverers. (DEFAULT=1)
StartDiscoverers=1

# Number of pre-forked instances of DB Syncers. (DEFAULT=4)
StartDBSyncers=4

# Timers process time-based trigger functions and maintenance periods. (DEFAULT=1)
StartTimers=2

# Trappers accept incoming connections from Zabbix sender, active agents, active proxies and child nodes. (DEFAULT=5)
StartTrappers=2

# Shared memory size for storing host, item and trigger data. (DEFAULT=8M)
CacheSize=1G

# Shared memory size for storing history data. (DEFAULT=8M)
HistoryCacheSize=512M

# Shared memory size for storing trends data. (DEFAULT=8M)
TrendCacheSize=512M

# Shared memory size for caching item history data requests. (DEFAULT=8M)
ValueCacheSize=512M

# Specifies how long we wait for agent, SNMP device or external check (in seconds). (DEFAULT=3)
Timeout=10

SNMPTrapperFile=/run/zabbix/traps.tmp
StartSNMPTrapper=1

# How often Zabbix will perform housekeeping procedure (in hours). Housekeeping is removing outdated information from the database
HousekeepingFrequency=1

# Maximum number of rows to be deleted per task in each housekeeping cycle
MaxHousekeeperDelete=100000
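As a quick sanity check on the cache settings above, here is a small sketch that adds up the shared memory the four cache parameters request. The helper names parse_size and parse_conf are made up for illustration, and the sketch assumes the values use only the K, M and G suffixes seen in the config.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}
CACHE_KEYS = ("CacheSize", "HistoryCacheSize", "TrendCacheSize", "ValueCacheSize")


def parse_size(value):
    """Convert a size such as '512M' or '1G' to bytes."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in UNITS:
        return int(value[:-1]) * UNITS[suffix]
    return int(value)


def parse_conf(text):
    """Collect key=value pairs, skipping blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# The four cache values from the zabbix_server.conf excerpt.
conf_text = """
CacheSize=1G
HistoryCacheSize=512M
TrendCacheSize=512M
ValueCacheSize=512M
"""
settings = parse_conf(conf_text)
total = sum(parse_size(settings[key]) for key in CACHE_KEYS)
print(total // 1024**2, "MiB requested for caches")
```

With the values listed, the four caches together ask for 2560 MiB of shared memory, which may be worth comparing against the RAM available on the Zabbix server VM.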
Code:
# open_files_limit = 65535
# wait_timeout = 600
# interactive_timeout = 1024
#
# Maximum size of one packet or any generated/intermediate string
max_allowed_packet = 64M
#
# Number of threads the server should cache for reuse
thread_cache_size = 64
#
# Maximum allowed number of simultaneous client connections
max_connections = 256
#
# Do not cache results that are larger than this number of bytes
query_cache_limit = 16M
#
# Amount of memory allocated for caching query results
query_cache_size = 1024M
#
# Minimum size (in bytes) for blocks allocated by the query cache
query_cache_min_res_unit = 512
#
# 0: do not cache
#
# 1: cache all cacheable query results except for those that begin with SELECT SQL_NO_CACHE
#
# 2: cache results only for cacheable queries that begin with SELECT SQL_CACHE
query_cache_type = 1
#
# If a query takes longer than this value (seconds), the server logs the query
long_query_time = 5
#
# Queries that are expected to retrieve all rows are logged
log-queries-not-using-indexes
#
# Size in bytes of the memory buffer that InnoDB uses to cache data and indexes of its tables
innodb_buffer_pool_size = 4096M