Hi all.
I'm trying to get around a fairly 'classic' problem: how to avoid or reduce false positives caused by network topology.
To limit the 'router down' => 'whole network is down' => alert storm problem, I've set up trigger dependencies.
What I have right now is:
* a bunch of hosts behind a flaky VPN line
* each host trigger depends on that host's ping trigger
* each host ping trigger depends on the remote VPN router's ping trigger
* the ping update interval is 120s, other host items 300s
In this setup, when the VPN line goes down, I still get some triggers for host items or host pings. As I understand it, these are the triggers that fire just before Zabbix learns that the VPN itself is down.
When I first set the system up, pings and other items had the same update interval, and almost half of the host triggers fired whenever the VPN had a hiccup. Shortening the ping interval reduced the false alerts.
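As a rough sanity check on those numbers, here is a minimal sketch of the timing race. It assumes every item is polled at a fixed interval with an independent, uniformly random phase, and it ignores timeouts and retries. This is a simplified model, not how Zabbix actually schedules checks, and the function name is made up for illustration.

```python
import random

def false_alert_probability(router_interval, item_interval, trials=100_000, seed=0):
    """Estimate how often a host item is polled before the router ping
    after an outage starts (simplified model, not Zabbix scheduling)."""
    rng = random.Random(seed)
    early = 0
    for _ in range(trials):
        # Time from the start of the outage until each check next runs.
        router_wait = rng.uniform(0, router_interval)
        item_wait = rng.uniform(0, item_interval)
        if item_wait < router_wait:
            # The host item sees the failure while the router still looks fine.
            early += 1
    return early / trials

# Same interval for pings and items: about half the items win the race.
print(false_alert_probability(300, 300))
# Router ping every 120s, items every 300s: about 120 / (2 * 300) = 0.2.
print(false_alert_probability(120, 300))
```

Under these assumptions the equal-interval case lands near one half, which is in line with what I saw, and a shorter ping interval shrinks the window without closing it.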
Now I would like to avoid those too.
So I tried using the escalation system: wait for 120s (~1 ping update) before sending an alert, hoping to sidestep the dependency timing problem by making sure Zabbix already knows the router is down.
But it seems that the trigger dependency is evaluated only when the trigger fires and is not re-evaluated when the alert is sent. So I still get alerts for host item triggers even though Zabbix knows that the router they depend on is down.
Is my interpretation correct? Is this the intended behaviour?
By the way, this is on Zabbix 1.8.5.
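To make the question concrete, here is a small sketch of the behaviour I was hoping for: re-checking the dependency chain at send time instead of only at fire time. It is purely illustrative. The function names, the trigger names and the dictionaries are hypothetical, and none of it is Zabbix code or a Zabbix API.

```python
def is_suppressed(trigger, depends_on, in_problem):
    """Return True if any trigger upstream of `trigger` is in PROBLEM state."""
    seen = set()
    stack = list(depends_on.get(trigger, []))
    while stack:
        parent = stack.pop()
        if parent in seen:
            continue
        seen.add(parent)
        if in_problem.get(parent, False):
            return True
        # Keep walking up: host item -> host ping -> router ping.
        stack.extend(depends_on.get(parent, []))
    return False

def should_send_alert(trigger, depends_on, in_problem):
    """Decide at send time, using the trigger states known at that moment."""
    return not is_suppressed(trigger, depends_on, in_problem)

# Hypothetical dependency chain matching the setup described above.
depends_on = {"host1 disk": ["host1 ping"], "host1 ping": ["router ping"]}

# States when the host trigger fires: the router ping has not been polled yet.
at_fire = {"host1 disk": True, "host1 ping": False, "router ping": False}
# States 120s later, when the escalation step would send the alert.
at_send = {"host1 disk": True, "host1 ping": True, "router ping": True}

print(should_send_alert("host1 disk", depends_on, at_fire))  # True
print(should_send_alert("host1 disk", depends_on, at_send))  # False
```

With a check like this at send time, the delayed escalation step would drop the alert once the router is known to be down, which is what I expected the 120s wait to achieve.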