Just noticed an issue on our Zabbix box: it seems to have trouble retrieving data from any host while one other host is unavailable.
Starting at 12:39pm today one host went unavailable with errors in the server log of "Getting value of...failed". This continued until 12:50 when it started reporting "Timeout while connecting to [....]\nHost [...] will be checked after [60] seconds." The Zabbix agent log reported "Timeout while answering request" at 12:39 with no other entries after that.
During that same time period Zabbix collected no data from any other host. It looks like the unavailability of one host was blocking requests for items from the other hosts.
I'll be happy to provide logs and assist in troubleshooting as this seems like a pretty big issue in my opinion.
-cameron
My solution was the same: disable the offending host, let Zabbix catch up, and then reinstate the offending host once its error has been corrected. I haven't tried a version of Zabbix past beta1, but I do hope this is resolved before 1.1 is released, since Zabbix should be able to handle a host breaking down without failing itself.
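To picture why one hung host could stall everything else, here is a minimal sketch. It is not Zabbix's actual code. It assumes a single poller that checks hosts strictly one after another and waits for a full timeout on a host that never answers. The host names, the 0.1 s response time, and the 30 s timeout are all made up for illustration.

```python
def serial_poll_schedule(response_times, timeout):
    """Return the time (in seconds) at which each host's check finishes
    when one poller checks the hosts one after another.

    response_times maps a hypothetical host name to how long it takes to
    answer; None means the host never answers, so the poller waits for
    the full timeout before moving on.
    """
    clock = 0.0
    finished_at = {}
    for host, response in response_times.items():
        # A host that hangs (or is slower than the timeout) costs the
        # poller the whole timeout before the next host is even tried.
        if response is None or response > timeout:
            clock += timeout
        else:
            clock += response
        finished_at[host] = clock
    return finished_at


# Hypothetical hosts: "db1" hangs, the others answer in 0.1 s.
hosts = {"db1": None, "web1": 0.1, "web2": 0.1}
schedule = serial_poll_schedule(hosts, timeout=30)
for host, finished in schedule.items():
    print(f"{host}: check finished after {finished:.1f} s")
```

Under these assumptions the healthy hosts are only reached after the hung host has used up its whole timeout, and if the hung host is retried often enough the others can fall further and further behind. That would match the symptom reported above, though the real cause may well be different.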