YA "Zabbix Agent Unreachable" false alarm question

wsanders

Junior Member

Joined: Feb 2014

Posts: 7
#1

20-02-2014, 03:43

Is it the configuration parameter "Timeout" or "UnreachablePeriod" that triggers the "no ping" condition that throws this alert?

We have about 1 or 2 instances per day of "Zabbix Agent unreachable" alerts on random servers. During this time, Zabbix logs lines like this every two minutes:

19636:20140219:154735.205 cannot send list of active checks to [10.0.4.172]: host [blahblah] not monitored

No other log entries are produced for the host. The host is up but busy, perhaps with high CPU utilization (we can't tell, because Zabbix doesn't get data during this period, which can last up to 30 minutes). For some reason, we get the "unreachable" alert and the "OK" recovery at the same time, only after Zabbix is able to poll the host again, so it's hard to catch the server "in the act" with top or ps.
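Since the alert and recovery arrive together, the log timestamps may be the best record of when the problem actually started. A minimal Python sketch for pulling the timestamp, IP and host name out of lines like the one quoted above; the regex is built from that single example line and the usual `PID:YYYYMMDD:HHMMSS.mmm` log prefix, and the function name `parse_not_monitored` is made up for illustration:

```python
import re
from datetime import datetime

# Pattern derived from the one example line in this post; other Zabbix
# versions may word the message differently (assumption).
LINE_RE = re.compile(
    r"^\s*(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6})\.(?P<ms>\d{3})\s+"
    r"cannot send list of active checks to \[(?P<ip>[^\]]+)\]: "
    r"host \[(?P<host>[^\]]+)\] not monitored"
)

def parse_not_monitored(line):
    """Return (timestamp, ip, host) for a matching log line, else None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m["date"] + m["time"], "%Y%m%d%H%M%S")
    ts = ts.replace(microsecond=int(m["ms"]) * 1000)
    return ts, m["ip"], m["host"]

sample = ("19636:20140219:154735.205 cannot send list of active checks "
          "to [10.0.4.172]: host [blahblah] not monitored")
print(parse_not_monitored(sample))
```

Running this over the server log would give a list of timestamps per host that can be compared against whatever else was happening at the time.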

Both Timeout and UnreachablePeriod are set to their defaults: Timeout=3 and UnreachablePeriod=45. I think I will tune Timeout up to 6 and UnreachablePeriod up to 120. Does that sound like a good approach?

Thanks
w
Tags: unreachable tuning
wsanders

Junior Member

Joined: Feb 2014

Posts: 7
#2

26-02-2014, 22:47

Solution: YA "Zabbix Agent Unreachable" false alarm question

We found these false alarms were occurring at the end of maintenance periods during which data was not collected. Once we began collecting data during the maintenance period, the false alarms went away.

We also increased the server-side Timeout parameter from 3 to 10, but not collecting data during maintenance seems to have been the main cause of our false alarms.
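For anyone wanting to check whether their own alerts follow the same pattern, here is a rough Python sketch that flags alerts firing shortly after a maintenance window ends. The timestamps are made-up sample data, the function name is hypothetical, and the 30-minute slack is just an assumption based on how long our outages appeared to last:

```python
from datetime import datetime, timedelta

def alerts_near_maintenance_end(alerts, windows, slack=timedelta(minutes=30)):
    """Return the alerts that fired within `slack` after any window's end."""
    hits = []
    for alert in alerts:
        for _start, end in windows:
            if end <= alert <= end + slack:
                hits.append(alert)
                break  # one matching window is enough
    return hits

# Made-up example data: one maintenance window and two alert times.
windows = [(datetime(2014, 2, 19, 15, 0), datetime(2014, 2, 19, 15, 45))]
alerts = [
    datetime(2014, 2, 19, 15, 47, 35),  # just after the window ends
    datetime(2014, 2, 19, 9, 12, 0),    # unrelated to any window
]
print(alerts_near_maintenance_end(alerts, windows))
```

If most "unreachable" alerts land in the flagged list, the maintenance configuration is a more likely culprit than Timeout.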