Version:
Zabbix 1.1 (server and agent)
Server OS:
SuSE Enterprise Linux 9 SP3 (9.3)
Client OS:
SuSE Enterprise Linux 9 SP3 (9.3)
Problem:
Zabbix server loses all connectivity to the Zabbix agent when SNMP-monitored items do not respond (i.e. the application which reports status via SNMP is not running).
Details:
- Server and agent were working fine with a mix of SNMP and agent items
- Server was rebooted to apply changes to the monitored app; the app failed to restart, so SNMP data were no longer available
- Zabbix server could no longer connect to the agent. Message: "Timeout while connecting to [host.ourdomain.com]:nnn" (where nnn is the port number we use for SNMP monitoring)
- Telnet to the host on port 10050 worked, and returned a ZBX error
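For anyone who wants to repeat the telnet check without doing it by hand, here is a minimal sketch in Python. It only tests whether a TCP connection to the agent port can be opened; it does not speak the Zabbix protocol, and the function name tcp_port_reachable is my own, not part of Zabbix. Since SNMP normally runs over UDP, this check says nothing about the SNMP port itself.

```python
import socket


def tcp_port_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    This is only a connectivity probe (the same thing the telnet test shows);
    it does not send or parse any Zabbix agent request.
    """
    try:
        # create_connection resolves the name and connects, raising OSError
        # (including timeouts and DNS failures) if it cannot.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Example use, with the host name from the error message above standing in for the real one: tcp_port_reachable("host.ourdomain.com", 10050). If this returns True while the server still logs timeouts, the agent port itself is reachable, which matches what the telnet test showed.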
Workaround:
Disabling all SNMP items for the monitored host allowed the Zabbix server to connect to the agent again immediately and resume collecting data.
Is anyone else experiencing this bug?
-Rob
PS: Here's a link to my response to another post about this problem: