I am monitoring about 50 servers with a Zabbix 1.5.3 beta server and 1.4.4 agents (yes, I know it is a beta).
All servers are being monitored fine with the exception of one. It is a Server 2003 machine running MS SQL Server, an extremely busy (and not altogether healthy) system. About every 5 or 6 hours it stops reporting back to the Zabbix server and throws this error: ZBX_TCP_READ() failed [Interrupted system call]
When we have the DBA check the status of the Zabbix agent, it is still running on the host, but it needs to be restarted to re-establish communication with the Zabbix server.
Is there a way to work around this error through a timeout setting? If the agent is still running, why doesn't it resume communication once the load on the host allows?