Good afternoon!
Here is the problem. There is a server with Zabbix Agent installed, and data is collected via active checks. Under high load on the server, data is lost:
Under this load we see gaps like the ones in the screenshot. Zabbix Agent is started with nice:
OS:
Zabbix Agent settings:
So what can be done in this situation? The thing is, we have not seen similar problems with munin, so maybe there is a way to optimize how Zabbix works?
Code:
CPU | sys 1014% | user 498% | irq 33% | | | idle 860% | wait 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 40% | user 36% | irq 0% | | | idle 24% | cpu017 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 45% | user 30% | irq 0% | | | idle 25% | cpu013 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 40% | user 32% | irq 0% | | | idle 28% | cpu022 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 44% | user 27% | irq 0% | | | idle 29% | cpu018 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 43% | user 26% | irq 0% | | | idle 31% | cpu019 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 43% | user 26% | irq 0% | | | idle 31% | cpu014 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 43% | user 24% | irq 0% | | | idle 33% | cpu021 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 39% | user 27% | irq 0% | | | idle 33% | cpu020 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 43% | user 23% | irq 0% | | | idle 34% | cpu023 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 40% | user 25% | irq 0% | | | idle 35% | cpu016 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 46% | user 19% | irq 0% | | | idle 35% | cpu009 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 44% | user 20% | irq 0% | | | idle 35% | cpu008 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 44% | user 20% | irq 0% | | | idle 36% | cpu015 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 45% | user 19% | irq 0% | | | idle 36% | cpu010 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 45% | user 17% | irq 0% | | | idle 38% | cpu011 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 37% | user 24% | irq 0% | | | idle 39% | cpu012 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 44% | user 11% | irq 5% | | | idle 40% | cpu002 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 40% | user 16% | irq 4% | | | idle 40% | cpu007 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 42% | user 11% | irq 5% | | | idle 42% | cpu000 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 40% | user 13% | irq 5% | | | idle 42% | cpu004 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 39% | user 15% | irq 4% | | | idle 42% | cpu005 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 41% | user 12% | irq 4% | | | idle 43% | cpu001 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 42% | user 11% | irq 4% | | | idle 43% | cpu003 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
cpu | sys 42% | user 12% | irq 2% | | | idle 44% | cpu006 w 0% | | steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |
CPL | avg1 20.19 | | avg5 13.27 | avg15 9.08 | | | | csw 373478 | | intr 85539 | | | numcpu 24 |
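To put a number on "high load", here is a rough sketch (not part of the original post) that reads the system-wide `CPU` row of the atop output above and works out how much of the machine's total CPU capacity is busy. The function names `parse_percent_fields` and `busy_fraction` are made up for illustration, and the CPU count is taken from `numcpu 24` in the `CPL` row.

```python
import re

# System-wide CPU row copied verbatim from the atop output above.
ATOP_CPU_LINE = (
    "CPU | sys 1014% | user 498% | irq 33% | | | idle 860% | wait 0% | "
    "| steal 0% | | guest 0% | curf 2.50GHz | curscal 99% |"
)
NUM_CPU = 24  # from "numcpu 24" in the CPL row


def parse_percent_fields(line):
    """Return a dict such as {'sys': 1014, 'user': 498, ...}.

    Only fields of the form 'name N%' are picked up, so 'curf 2.50GHz'
    is skipped.
    """
    return {name: int(value) for name, value in re.findall(r"(\w+)\s+(\d+)%", line)}


def busy_fraction(fields, num_cpu):
    """Share of total CPU capacity (num_cpu * 100%) spent in sys + user + irq."""
    busy = fields["sys"] + fields["user"] + fields["irq"]
    return busy / (num_cpu * 100)


fields = parse_percent_fields(ATOP_CPU_LINE)
busy = fields["sys"] + fields["user"] + fields["irq"]

print(f"busy: {busy_fraction(fields, NUM_CPU):.1%} of {NUM_CPU} CPUs")
print(f"kernel (sys) share of busy time: {fields['sys'] / busy:.1%}")
```

With the figures from the post this gives about 64% of total capacity busy, with roughly two thirds of that busy time spent in the kernel (`sys`) rather than in user space.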
Code:
root@server:/var/log/atop # ps -ax -o nice,command | grep zabbix
-16 /usr/local/sbin/zabbix_agentd
-16 zabbix_agentd: collector [idle 1 sec] (zabbix_agentd)
-16 zabbix_agentd: listener #1 [waiting for connection] (zabbix_agentd)
-16 zabbix_agentd: listener #2 [waiting for connection] (zabbix_agentd)
-16 zabbix_agentd: listener #3 [waiting for connection] (zabbix_agentd)
-16 zabbix_agentd: listener #4 [waiting for connection] (zabbix_agentd)
-16 zabbix_agentd: listener #5 [waiting for connection] (zabbix_agentd)
-16 zabbix_agentd: listener #6 [waiting for connection] (zabbix_agentd)
-16 zabbix_agentd: listener #7 [waiting for connection] (zabbix_agentd)
-16 zabbix_agentd: listener #8 [waiting for connection] (zabbix_agentd)
-16 zabbix_agentd: active checks #1 [processing active checks] (zabbix_agentd)
-16 zabbix_agentd: active checks #2 [processing active checks] (zabbix_agentd)
Code:
root@server:/var/log/atop # uname -a
FreeBSD server10.freeteam.org 9.3-RELEASE-p33 FreeBSD 9.3-RELEASE-p33 #0: Wed Jan 13 17:55:39 UTC 2016 [email protected]:/usr/obj/usr/src/sys/GENERIC amd64
Code:
root@server:/usr/local/etc/zabbix24 # grep -vE '^#|^$' zabbix_agentd.conf
LogFile=/var/log/zabbix/zabbix_agentd.log
LogFileSize=0
DebugLevel=4
EnableRemoteCommands=1
LogRemoteCommands=1
Server=XXX.XXX.XXX.XXX
StartAgents=8
ServerActive=XXX.XXX.XXX.XXX
HostnameItem=system.hostname
HostMetadataItem=system.uname
BufferSize=4096
MaxLinesPerSecond=1000
Timeout=30
Include=/usr/local/etc/zabbix24/zabbix_agentd.conf.d/*_freebsd.conf