Actions: Faulty Escalation / Delayed Recovery

  • nms_user
    Member
    • Feb 2009
    • 43

    #1

    Actions: Faulty Escalation / Delayed Recovery

    Hi all,

    We have planned daily reboots on some servers, and I have noticed strange, seemingly random behavior of actions:

    Day1:
    1) Host goes down (unreachable) --> the first step of the action is executed as expected (an external script in this case)
    2) Host comes up again (.status AND the trigger/event both report "OK") --> the action escalation nevertheless runs through all steps defined in the action

    So internally the action still thinks the server is down, while everything else tells me that the machine is up.

    Day2 (next reboot):
    3) Host goes down again (unreachable) --> the first step of the action is executed as expected again (the same external script)
    4) Host comes up again (.status AND the trigger/event both report "OK") --> the action immediately clears the unreachable state and sends a recovery message to all the recipients who got the 'wrong' escalation in step 2.

    Can anybody confirm this?
    We are on v1.6.5...
    Last edited by nms_user; 15-07-2009, 11:57.
  • untergeek
    Senior Member
    Zabbix Certified Specialist
    • Jun 2009
    • 512

    #2
    I had the exact same thing happen today. A service went down for a scheduled reboot and came back up within 10 minutes. Zabbix shows that each item is running properly, and the triggers are even back to OK. Yet the actions continue to generate false positives.

    We are also running 1.6.5.


    • antoniogmyo
      Junior Member
      • Aug 2009
      • 9

      #3
      Same problem here (v1.6.6)

      Hi,

      We're having the same problem. We've defined escalation steps, 2-2 in our case, to send an email.

      As I understand how escalations work, that means that when the trigger has fired twice, the action will be executed; in this case, an email is sent.

      But this is what's happening:
      1. There is a problem with an item and the trigger fires once. No action is executed. That's correct.

      2. The item becomes OK again, so no trigger should fire, but the escalation seems to advance to step 2 and an email is sent saying there's a problem with the item. The email body includes the last value of the item, and it shows that the value is fine.

      Are we missing the purpose of escalations? The idea behind this is that we don't want flapping alerts sending emails every time a trigger fires, only when it has fired at least twice.

      Thanks in advance.


      • clubbing80s
        Senior Member
        • Sep 2005
        • 109

        #4
        I have something similar: I continue to get notifications for a trigger that has recovered. I have tried a number of different configurations. v1.6.6


        • antoniogmyo
          Junior Member
          • Aug 2009
          • 9

          #5
          Nothing new about this?

          Hi,

          Sorry for bumping this thread, but it would be nice to get some suggestions about this issue.

          cya!


          • richlv
            Senior Member
            Zabbix Certified Trainer
            Zabbix Certified Specialist
            Zabbix Certified Professional
            • Oct 2005
            • 3112

            #6
            Ahh, yes, this is a very elusive and nasty problem.
            I'd encourage everybody to add their details to https://support.zabbix.com/browse/ZBX-1308 so that it can finally be dealt with. Please make sure to include relevant detail (look at the details already reported) and try to keep the data provided by each reporter distinguishable so that correlating it is easier. Thanks.


            • NOB
              Senior Member
              Zabbix Certified Specialist
              • Mar 2007
              • 469

              #7
              Originally posted by antoniogmyo
              Hi,

              We're having the same problem. We've defined escalation steps, 2-2 in our case, to send an email.

              As I understand how escalations work, that means that when the trigger has fired twice, the action will be executed; in this case, an email is sent.

              But this is what's happening:
              1. There is a problem with an item and the trigger fires once. No action is executed. That's correct.

              2. The item becomes OK again, so no trigger should fire, but the escalation seems to advance to step 2 and an email is sent saying there's a problem with the item. The email body includes the last value of the item, and it shows that the value is fine.

              Are we missing the purpose of escalations? The idea behind this is that we don't want flapping alerts sending emails every time a trigger fires, only when it has fired at least twice.

              Thanks in advance.
              Hi antoniogmyo

              If you just want to suppress flapping triggers, add the flapping detection
              to the trigger expression instead of to the actions.

              ZABBIX has one of the best correlation engines around.
              See ZABBIX 1.6 rev. 17 manual page 125:

              Code:
              count(#10,12,"gt")
              will return exact number of values which are more than ‘12’ stored in
              So if you want to check whether the last usage in percent of a filesystem
              is more than 90 for the last three checks you could use an expression like count(#3,90,"gt")=3 instead of the usual last(0)>90.
              The syntax of the complete expression will differ, but you'll get the idea.
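
The difference between the two conditions can be illustrated outside Zabbix. The following is only a rough Python sketch of the idea, not Zabbix code: the helper names (`count_gt`, `fires_last`, `fires_count`) and the sample values are made up for illustration, and the real trigger functions operate on stored item history.

```python
def count_gt(values, n, threshold):
    """Count how many of the last n values are strictly greater than threshold."""
    return sum(1 for v in values[-n:] if v > threshold)


def fires_last(values, threshold=90):
    """Analogue of last(0)>90: only the newest value matters."""
    return bool(values) and values[-1] > threshold


def fires_count(values, n=3, threshold=90):
    """Analogue of count(#3,90,"gt")=3: all of the last n values must exceed threshold."""
    return count_gt(values, n, threshold) == n


# Hypothetical filesystem usage readings in percent, one per check.
samples = [85, 95, 88, 92, 93, 96, 80]

history = []
results = []
for s in samples:
    history.append(s)
    results.append((s, fires_last(history), fires_count(history)))

for value, by_last, by_count in results:
    print(f"value={value:3d}  last>90: {by_last!s:5}  count(#3)=3: {by_count}")
```

With these sample values the `last`-style check flips on and off as the readings cross 90, while the `count`-style check only becomes true once three consecutive readings are above 90.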

              HTH,

              Norbert.

