Alert on 2 or 3 failed attempts on Web Scenario

  • devinacosta
    Junior Member
    • Mar 2016
    • 6

    #1

    Alert on 2 or 3 failed attempts on Web Scenario

    I am running the latest Zabbix, and I am trying to figure out the best trigger to use so that it alerts me only if a web check fails 2 or 3 times in a row. For some reason, some sites fail the 1st attempt but then the 2nd check succeeds, so I'm getting tons of false alerts.

    Would a trigger like this be what I should use?
    {Zabbix server:web.test.fail[domain.com].min(#2)}>0

    Also, I know in the past it was suggested to do stuff like TRIGGER.VALUE=0 xxxx or TRIGGER.VALUE=1. Do you really need to do that?
  • guzzijason
    Senior Member
    • Dec 2015
    • 106

    #2
    Originally posted by devinacosta
    I am running the latest Zabbix, and I am trying to figure out the best trigger to use so that it alerts me only if a web check fails 2 or 3 times in a row. For some reason, some sites fail the 1st attempt but then the 2nd check succeeds, so I'm getting tons of false alerts.

    Would a trigger like this be what I should use?
    {Zabbix server:web.test.fail[domain.com].min(#2)}>0
    I think that should work - checks the last 2 values, and if the lowest value is greater than 0, then fire. Seems reasonable.
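To see why min(#2) filters out a single failed check, here is a rough Python sketch that replays a series of web.test.fail values (0 means all steps passed, non-zero is the number of the failed step) through the same rule. The function names and the sample history are made up for illustration, and treating "fewer than 2 values collected" as "do not fire" is an assumption of this sketch, not a statement about how Zabbix handles that case.

```python
def min_last_n(values, n):
    """Return the minimum of the last n values, or None if fewer than n exist."""
    if len(values) < n:
        return None  # assumption of this sketch: not enough data, so no result
    return min(values[-n:])


def fires(history, n=2):
    """Replay history one check at a time and report whether min(#n) > 0 holds."""
    states = []
    for i in range(1, len(history) + 1):
        lowest = min_last_n(history[:i], n)
        states.append(lowest is not None and lowest > 0)
    return states


# Hypothetical check results: isolated failures, then two failures in a row.
history = [0, 1, 0, 1, 1, 0]
print(fires(history))
```

Only the fifth check trips the rule, because it is the only point where both of the last two values are non-zero. The isolated failures at the second and fourth checks are ignored.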

    Originally posted by devinacosta
    Also, I know in the past it was suggested to do stuff like TRIGGER.VALUE=0 xxxx or TRIGGER.VALUE=1. Do you really need to do that?
    This is what they refer to as "hysteresis" (https://www.zabbix.com/documentation...ion#hysteresis). It is useful if you have a check that is always flapping up and down. To use your example, you could do something like this:


    ({TRIGGER.VALUE}=0 and {Zabbix server:web.test.fail[domain.com].count(#10,0,"gt")}>5) or
    ({TRIGGER.VALUE}=1 and {Zabbix server:web.test.fail[domain.com].count(#10,0,"gt")}>0)


    The intent here is that if the trigger is not currently active AND more than 5 of the last 10 checks have a value greater than 0, then activate the trigger. OR, if the trigger is already active AND more than 0 of the last 10 checks have a value greater than 0, then keep the trigger active.

    To put that another way, this expression would activate the trigger if more than 5 of the last 10 checks failed, but it would wait to reset the trigger until there are NO failures over the last 10 checks. This would help prevent excessive notifications if your number of failures over this period keeps fluctuating around the initial threshold.
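If it helps to see the two thresholds in action, below is a small Python simulation of the same logic: turn on when more than 5 of the last 10 values are non-zero, and turn off only once none of the last 10 are. The function names, parameter names, and sample data are invented for this sketch, and it simply uses whatever values exist when fewer than 10 have been collected, which is an assumption and may not match Zabbix exactly.

```python
def count_gt_zero(window):
    """Count values greater than 0, mirroring count(#10,0,"gt")."""
    return sum(1 for v in window if v > 0)


def hysteresis_states(history, window=10, on_threshold=5, off_threshold=0):
    """Replay check results and return the trigger state after each check."""
    active = False  # plays the role of {TRIGGER.VALUE}
    states = []
    for i in range(1, len(history) + 1):
        recent = history[max(0, i - window):i]
        failures = count_gt_zero(recent)
        if not active:
            # Trigger is OK: need more than on_threshold failures to fire.
            active = failures > on_threshold
        else:
            # Trigger is in PROBLEM: stay there while any failure remains.
            active = failures > off_threshold
        states.append(active)
    return states


# Hypothetical data: six failed checks followed by ten clean ones.
history = [1] * 6 + [0] * 10
states = hysteresis_states(history)
print(states.index(True) + 1)   # check number at which the trigger fires
print(len(states) - states[::-1].index(True))  # last check still in PROBLEM
```

With this data the trigger fires on the 6th check and stays active through the 15th, clearing only on the 16th check, when the last failure has dropped out of the 10-check window.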

    __Jason
