Make Web Check Triggers Less Sensitive

  • kahlis
    Junior Member
    • Jul 2015
    • 4

    #1

    Make Web Check Triggers Less Sensitive

    Hey Everyone,

    I've been fighting with how to make our web checks a little less sensitive.

    We've been having issues with flapping on a 200 status check (Timeout, HTTP Status 0). Even when I see the alert hit on the web interface, I can go straight to the site and there are no issues. Nothing is reported in the logs on the web server or in Zabbix. It goes back to normal once the next check comes in. The timeout was 15 sec; I've increased it to 30 sec.

    I'd like to write a trigger that says: "If the last 3 checks are all not 200, then trigger"

    I've been using a simple trigger like this previously:

    {[SERVERNAME]:web.test.rspcode[(WEBNAME)].last()}<>200
  • eav
    Junior Member
    • Jul 2015
    • 4

    #2
    Hi,

    I think this should work:
    ({[SERVERNAME]:web.test.rspcode[(WEBNAME)].delta(#3)}=0) and ({[SERVERNAME]:web.test.rspcode[(WEBNAME)].last()}<>200)

    I didn't get around to testing it though.


    • kahlis
      Junior Member
      • Jul 2015
      • 4

      #3
      Thanks for the reply! Can you explain what the .delta(#3) does? I looked it up in the documentation and I don't think I get it.

      I've made the trigger much less sensitive by using this for now:

      ({[SERVERNAME]:web.test.rspcode[(WEBNAME)].last(#1)}<>200) and ({[SERVERNAME]:web.test.rspcode[(WEBNAME)].last(#2)}<>200)


      • eav
        Junior Member
        • Jul 2015
        • 4

        #4
        It calculates the difference between the minimum and maximum value in the evaluated period, in this case the last three checks.

        So if all of your last three checks came back as "403", the delta would be 0; if they were "200" "403" "200", the delta would be 203.

        Together with the last()<>200 check, this will trigger the alarm only when you get the same error code, other than 200, three times in a row.

        If you decide to give it a try, keep in mind that if the page you're monitoring is flapping and you never get the same error code three times in a row, the trigger won't fire, even if the status has not been 200 in a while.
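
To make the difference concrete, here is a rough Python sketch (not Zabbix syntax) that simulates the two rules on a list of response codes. The function names are made up for illustration, and requiring a full window of three values is a simplification of how Zabbix evaluates `#3`.

```python
def delta_rule(codes, n=3):
    """Simulates delta(#n)=0 and last()<>200: the last n codes are
    identical and the newest one is not 200."""
    window = codes[-n:]
    if len(window) < n:  # simplification: wait for a full window
        return False
    return max(window) - min(window) == 0 and window[-1] != 200


def all_failed_rule(codes, n=3):
    """Simulates 'the last n checks are all not 200', whatever the codes are."""
    window = codes[-n:]
    if len(window) < n:  # simplification: wait for a full window
        return False
    return all(code != 200 for code in window)


if __name__ == "__main__":
    samples = [
        [403, 403, 403],  # same error three times
        [200, 403, 200],  # one-off failure
        [403, 500, 403],  # failing, but with mixed error codes
    ]
    for codes in samples:
        print(codes, "delta:", delta_rule(codes), "all failed:", all_failed_rule(codes))
```

The mixed-code sample is the case described above: every check failed, but the delta is 97 rather than 0, so only the second rule would fire.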

        Also, both with your current definition and using delta, the trigger would recover as soon as you get a "200" reply, even if it was a one-off. You may want to have a look at this: http://blog.zabbix.com/no-more-flapp...mart-way/1488/ to decrease the recovery sensitivity too.
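
As a rough illustration of a less sensitive recovery, the Python sketch below (again a simulation, not Zabbix syntax) uses one condition to enter the problem state and a stricter one to leave it. The choice of three consecutive "200" replies for recovery is an assumption made for this example, not something taken from the linked post.

```python
def next_state(in_problem, codes, n=3):
    """Return the new trigger state given the current state and the codes seen so far."""
    window = codes[-n:]
    if len(window) < n:
        return in_problem  # not enough data yet, keep the current state
    if not in_problem:
        # Fire only when the last n checks all failed.
        return all(code != 200 for code in window)
    # Already in problem: recover only after n consecutive 200 replies (assumed rule).
    return not all(code == 200 for code in window)


def replay(codes, n=3):
    """Feed the codes in one at a time and record the state after each check."""
    state = False
    history = []
    for i in range(1, len(codes) + 1):
        state = next_state(state, codes[:i], n)
        history.append(state)
    return history


if __name__ == "__main__":
    codes = [403, 403, 403, 200, 403, 200, 200, 200]
    for code, state in zip(codes, replay(codes)):
        print(code, "PROBLEM" if state else "OK")
```

With this sequence the single "200" in the fourth position does not clear the problem; the state only returns to OK after the final run of three "200" replies.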

        You may
