In WhatsUp Gold, there's a concept of a "critical" monitor (= item + trigger).
For example, if you are monitoring a bunch of services, you can set ping to be critical, and if ping is down, none of the others are tested, and none of the others are reported.
That's almost the same as a trigger dependency in Zabbix, but not quite, and I'm wondering if there is a way to implement it.
As an example, for a device monitored by Zabbix agent, the default template includes a "Zabbix agent is unreachable for 5 minutes" trigger. I made that trigger depend on the ICMP ping trigger, so if the device cannot be pinged, it doesn't raise an error on the agent.
If the ping has been down for a while, then sometimes when I power the device back up, ICMP comes up before the Zabbix agent is polled. The result is that the agent's previous status of "down 5 minutes" is reported before the agent poll can register it as up, so I get a "problem" followed immediately by an "OK".
Now yes, for the agent perhaps I should not use ping, but this seems to apply to all sorts of other triggers -- I don't want to get 5 messages that 5 services are down if the host won't ping; I just want to know the host is offline (which is what I presume when there is no ping).
Note that in this particular case both checks are on the same 60 second interval, but some services I might want to check only every 5 minutes, or 60 minutes -- in those cases it becomes more likely that the ICMP status clears well before these services are checked, and the notification then goes out.
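To put rough numbers on that window, here is a minimal sketch in Python. It is only a model of the timing, not anything Zabbix itself runs, and it assumes each item is polled on a fixed schedule; the function names and the `offset` parameter are made up for illustration.

```python
def next_poll(t: int, interval: int, offset: int = 0) -> int:
    """First poll time strictly after t for an item polled at offset + k * interval."""
    k = (t - offset) // interval + 1
    return offset + k * interval


def stale_window(ping_ok_at: int, service_interval: int, service_offset: int = 0) -> int:
    """Seconds during which ping is already OK but the service still shows its old status."""
    return next_poll(ping_ok_at, service_interval, service_offset) - ping_ok_at


if __name__ == "__main__":
    # Ping recovers 100 seconds into the schedule; compare three service intervals.
    for interval in (60, 300, 3600):
        print(interval, stale_window(100, interval))
```

With ping recovering at t=100, a 60 second service leaves a 20 second window, a 5 minute service leaves 200 seconds, and a 60 minute service leaves 3500 seconds in which the stale "down" status is no longer masked by the dependency.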
Is there a proper way to do this? I was busy setting up ICMP dependencies when I ran into this race condition.
It really seems like what you want is for the dependent trigger not to fire until that item has been polled again after ping recovers?
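A minimal sketch of that rule, in Python purely for illustration. This is not Zabbix configuration or API code; `ServiceState`, `should_notify`, and the timestamp fields are hypothetical names, and the sketch assumes the time at which ping last recovered is known.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    is_down: bool    # result of the most recent check of the dependent service
    checked_at: int  # timestamp (seconds) of that check


def should_notify(ping_up: bool, ping_up_since: int, service: ServiceState) -> bool:
    """Decide whether a problem notification for the dependent service should go out."""
    if not ping_up:
        # Host does not answer ping: suppress everything that depends on it.
        return False
    if service.checked_at < ping_up_since:
        # The last result predates ping recovery, so it is stale; wait for the next poll.
        return False
    return service.is_down
```

For example, with ping back up since t=1000, a "down" result recorded at t=950 is ignored, while a "down" result recorded at t=1010 is reported.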
Maybe there's a much simpler approach for all this -- is there?
Fundamental desire -- if I can't ping it, stop notifications for other services from going out, period.