Only alert if 2 out of 3 tests result in failure?

  • geoffncm
    Junior Member
    • Apr 2008
    • 4

    #1

    Only alert if 2 out of 3 tests result in failure?

    We've been using Zabbix for quite some time to monitor 30+ servers. I'd like to figure out how to make Zabbix only alert if two out of three tests result in a failure mode.

    For example, we perform a PING against various servers. If two out of three attempts to ping a server fail, raise an alarm; otherwise, ignore the error condition.

    This should help eliminate erroneous alarms at 0300 hours.

    Thanks!
  • geoffncm
    Junior Member
    • Apr 2008
    • 4

    #2
    We're still struggling with this. Does anyone have any suggestions on how to do this? Thanks!!!

    • antani
      Member
      • Apr 2008
      • 50

      #3
      In Boolean algebra terms, it would be something like:
      (A || B) && (A||C)

      Try:
      (({HostA:agent.ping.nodata(300)}=1) | ({HostB:agent.ping.nodata(300)}=1 ))&(({HostA:agent.ping.nodata(300)}=1) | ({HostC:agent.ping.nodata(300)}=1 ))
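
A quick way to check what the expression above actually fires on is to enumerate its truth table. The following is only a minimal Python sketch: the function names are hypothetical, and A, B, C stand for the three "no data" conditions. It suggests that (A || B) && (A || C) reduces to A || (B && C), so it also fires when A alone is true, which is slightly broader than a strict two-out-of-three rule.

```python
from itertools import product


def posted_expression(a: bool, b: bool, c: bool) -> bool:
    """The form suggested in the post: (A || B) && (A || C)."""
    return (a or b) and (a or c)


def two_of_three(a: bool, b: bool, c: bool) -> bool:
    """Strict majority: true when at least two of the three inputs are true."""
    return (a and b) or (a and c) or (b and c)


def differing_inputs() -> list:
    """Return every (A, B, C) combination where the two forms disagree."""
    return [
        combo
        for combo in product([False, True], repeat=3)
        if posted_expression(*combo) != two_of_three(*combo)
    ]


if __name__ == "__main__":
    print(differing_inputs())
```

If a strict two-out-of-three rule across hosts were wanted, the majority form (A && B) || (A && C) || (B && C) would be the one to translate into trigger syntax.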

      • geoffncm
        Junior Member
        • Apr 2008
        • 4

        #4
        That would work if we were talking about multiple servers. But what we're trying to do is alert if 2 out of 3 pings to a single server have failed.

        For instance, if we ping HostA once every 60 seconds, then we want an alert triggered if we get fewer than 2 successes in the last 180 seconds.

        I understand the boolean algebra, but I don't know if the function we're looking for is even within Zabbix's capabilities. Perhaps there's a different way to go about this that we're completely overlooking. Any ideas?

        Thanks!

        • scalft
          Junior Member
          • Apr 2008
          • 12

          #5
          Try using the avg() function

          We had this problem with several alerts initially. Try something like this.

          {HostA:icmpping.avg(180)}>0.6

          That should give you the average of the last 3 polls, and if at least 2 of them are 1 rather than 0, it will trigger.

          • geoffncm
            Junior Member
            • Apr 2008
            • 4

            #6
            Thanks! We played around with your expression and found it would work perfectly, except that the > needed to be changed to < because of the return values for Succeed and Fail. Here is the exact expression we used (with our pings occurring every 60 seconds):

            {Template_Linux:icmpping.avg(180)}<0.5

            Our erroneous alarms have effectively been silenced.
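
To see why the < 0.5 threshold matches the "two out of three" rule, here is a minimal Python sketch rather than anything Zabbix itself runs. It assumes the ping item returns 1 on success and 0 on failure, and that the 180-second window holds exactly three samples (in practice the window may catch more or fewer values depending on timing). The function names are hypothetical.

```python
from itertools import product


def avg_trigger(samples, threshold=0.5):
    """Mimic the trigger: fire when the average ping result is below the threshold."""
    return sum(samples) / len(samples) < threshold


def at_least_two_failures(samples):
    """The intended rule: fire when two or more of the samples are failures (0)."""
    return list(samples).count(0) >= 2


def rule_matches_everywhere():
    """Compare both rules over every possible outcome of three pings."""
    return all(
        avg_trigger(s) == at_least_two_failures(s)
        for s in product([0, 1], repeat=3)
    )


if __name__ == "__main__":
    print(rule_matches_everywhere())
```

With three samples the average can only be 0, 1/3, 2/3, or 1, so any threshold strictly between 1/3 and 2/3 separates "at most one success" from "at least two successes".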

            • scalft
              Junior Member
              • Apr 2008
              • 12

              #7
              Excellent. I am glad it is working for you.
