Trigger with dependency fires after parent trigger recovers

  • aeb
    Junior Member
    • Feb 2026
    • 3

    #1

    Hi,

    I am trying to debug an issue where an ICMP down trigger that depends on another host's ICMP down trigger fires at the moment the parent trigger recovers, even though the dependent trigger's condition has not been met for a couple of iterations.
    I've also tried to implement the solution from here: https://www.zabbix.com/forum/zabbix-...886#post481886
    But that only made the behavior stranger.

    Given this trigger action:
    Code:
    Details                                                           Start in      Duration
    Send message to user groups: Zabbix administrators via all media  00:01:30      Default
    Parent device "switch" has a trigger with a problem and a recovery expression:
    Code:
    Problem: max(/switch/icmpping,#3)=0
    Recovery: min(/switch/icmpping,#3)=1
    Child device "server" has a trigger with only a problem expression, and it depends on the switch's ICMP down trigger:
    Code:
    Problem: max(/server/icmpping,#3)=0
    Now the switch goes down and Zabbix fires a trigger action for the problem and also for the recovery:
    Code:
    Parent device "switch":
    2026-02-22 01:53:36 PM  Up (1)
    2026-02-22 01:54:36 PM  Up (1)
    2026-02-22 01:55:36 PM  Down (0)
    2026-02-22 01:56:36 PM  Down (0)
    2026-02-22 01:57:36 PM  Down (0)
    2026-02-22 01:58:36 PM  Down (0)
    2026-02-22 01:59:36 PM  Down (0) -> Problem Trigger Action fires: "Problem started at 01:57:36"
    2026-02-22 02:00:36 PM  Down (0)
    2026-02-22 02:01:36 PM  Up (1)
    2026-02-22 02:02:36 PM  Up (1)
    2026-02-22 02:03:36 PM  Up (1)  -> Recovery Trigger Action fires: "Problem has been resolved at 02:03:36"
    2026-02-22 02:04:36 PM  Up (1)
    2026-02-22 02:05:36 PM  Up (1)
    A problem trigger action is not fired for the dependent server. This is what I want, and up to this point everything behaves as I expected.
    But after the parent device recovers, the problem trigger action does fire for the child device, only to recover shortly afterwards, even though the child has been up again for a few check iterations:

    Code:
    Child device "server"
    2026-02-22 01:53:27 PM Up (1)
    2026-02-22 01:54:27 PM Up (1)
    2026-02-22 01:55:27 PM Down (0)
    2026-02-22 01:56:27 PM Down (0)
    2026-02-22 01:57:27 PM Down (0)
    2026-02-22 01:58:27 PM Down (0)
    2026-02-22 01:59:27 PM Down (0)
    2026-02-22 02:00:27 PM Down (0)
    2026-02-22 02:01:27 PM Up (1)
    2026-02-22 02:02:27 PM Up (1)
    2026-02-22 02:03:28 PM Up (1) -> Problem Trigger Action fires: "Problem started at 01:57:27"
    2026-02-22 02:04:27 PM Up (1) -> Recovery Trigger Action fires: "Problem has been resolved at 02:04:27"
    2026-02-22 02:05:27 PM Up (1)
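To make the expected behavior concrete, here is a minimal sketch that evaluates the three expressions above on the listed values while ignoring the dependency entirely. It is a toy model, not Zabbix code: the function name `transitions`, the helper `series`, and the rule that a full window of 3 values is required are assumptions made for illustration, and the single 02:03:28 sample is treated as :27.

```python
def series(second, values):
    """Attach the thread's minute-by-minute timestamps (01:53 .. 02:05) to values."""
    minutes = list(range(53, 60)) + list(range(0, 6))
    stamps = [f"{'01' if m >= 53 else '02'}:{m:02d}:{second}" for m in minutes]
    return list(zip(stamps, values))


def transitions(samples, n=3, use_recovery_expr=False):
    """Return (time, state) for every state change of a toy trigger.

    Problem expression:  max(last n values) = 0
    Recovery expression: min(last n values) = 1 (only if use_recovery_expr),
    otherwise the trigger recovers as soon as the problem expression is false.
    """
    state = "OK"
    changes = []
    for i in range(n - 1, len(samples)):
        window = [value for _, value in samples[i - n + 1 : i + 1]]
        problem = max(window) == 0
        recovered = (min(window) == 1) if use_recovery_expr else (not problem)
        if state == "OK" and problem:
            state = "PROBLEM"
            changes.append((samples[i][0], state))
        elif state == "PROBLEM" and recovered:
            state = "OK"
            changes.append((samples[i][0], state))
    return changes


# Both hosts show the same pattern: 2x Up, 6x Down, 5x Up.
pattern = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
switch = series("36", pattern)
server = series("27", pattern)

print(transitions(switch, use_recovery_expr=True))
print(transitions(server))
```

Under these assumptions the switch goes to PROBLEM at 01:57:36 and back to OK at 02:03:36, which matches the messages above. The server expression becomes true at 01:57:27 and is already false again at 02:01:27, so without the dependency no problem would be expected at 02:03.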
    ​
    Edit: I'm on Zabbix 7.4.2.
    Does anyone have an idea why that's happening?

    Thank you and all the best,
    aeb
    Last edited by aeb; 22-02-2026, 18:44.
  • kyus
    Senior Member
    • Feb 2024
    • 187

    #2
    Your trigger on "server" fired before the trigger on "switch". The way dependencies work is: If trigger A depends on trigger B, when trigger B is in PROBLEM state, trigger A won't be evaluated nor will any actions be executed for the dependent trigger (A).

    So in "server" Problem started at 01:57:27. Then in "switch" Problem started at 01:57:36.
    Since your action has a 01:30 delay to send notifications, the dependency of "server" made it not start the action.
    After your "switch" trigger was back in "OK" state at 02:03:36, the action from the "server" alert was evaluated and the notification was sent. Right after that, the trigger itself was re-evaluated when the "server" ICMP Ping item got a new value at 02:04:27, and the issue was resolved.

    P.S.: I feel like this text is a bit confusing, but I hope it helps anyway. For more detailed information you can check: https://www.zabbix.com/documentation...s/dependencies


    • aeb
      Junior Member
      • Feb 2026
      • 3

      #3
      Hi kyus,

      Thank you for your answer.
      So the trigger for server fired at 01:57:27, then Zabbix started the notification action, which has a delay of 01:30. But after that delay passed, no message was sent. From my understanding Zabbix would evaluate the trigger again after that delay of 01:30 to see if the state is still "problem". Since it did not send the message, the trigger state was not problem anymore, due to the dependency. Is that correct so far?
      Or did Zabbix put the notification action on some kind of "hold" instead of discarding it after 01:30?

      All the best,
      aeb


      • kyus
        Senior Member
        • Feb 2024
        • 187

        #4
        Originally posted by aeb
        Since it did not send the message, the trigger state was not problem anymore, due to the dependency.
        In fact, the trigger was still in problem state, but because the trigger on which it depends (switch) was also in problem state, no actions were executed.

        A quick example in the docs page I linked:
        If both the server and the router are down and dependency is there, Zabbix will not execute actions for the dependent trigger.
        So yes, I guess it's some kind of "on hold".

        If you want to, you can increase the number of checks needed for your server trigger to fire, for example:
        Code:
        max(/server/icmpping,#4)=0
        This should prevent this scenario.
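A quick way to check the effect of `#4` on this particular timeline is to compute when the server expression first becomes true and compare it with the switch problem time (01:57:36). This is only an illustrative sketch: `first_problem` is a hypothetical helper, not a Zabbix function, and it assumes a full window of values is needed.

```python
def first_problem(stamps, values, n):
    """Return the first timestamp at which the last n values are all 0."""
    for i in range(n - 1, len(values)):
        if max(values[i - n + 1 : i + 1]) == 0:
            return stamps[i]
    return None


# Server samples from the first post, up to the end of the outage.
server_stamps = ["01:53:27", "01:54:27", "01:55:27", "01:56:27",
                 "01:57:27", "01:58:27", "01:59:27", "02:00:27"]
server_values = [1, 1, 0, 0, 0, 0, 0, 0]
switch_problem = "01:57:36"  # start of the parent problem, from the first post

for n in (3, 4):
    start = first_problem(server_stamps, server_values, n)
    order = "before" if start < switch_problem else "after"
    print(f"#{n}: server expression true at {start}, {order} the switch problem")
```

With `#3` the server expression is true at 01:57:27, nine seconds before the switch trigger. With `#4` it is true at 01:58:27, when the switch trigger is already in problem state and the dependency applies. This only holds as long as the server's polling does not run a full cycle ahead of the switch's.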


        • aeb
          Junior Member
          • Feb 2026
          • 3

          #5
          Hi again,

          I think I'm beginning to understand what's happening.
          The server trigger fires about 10 seconds before the switch trigger and goes into the "problem" state. The corresponding action is started, but because it has a delay of 90 seconds, it doesn't send the notification message yet.
          Then the switch trigger fires and "invalidates" the server trigger, which makes Zabbix not send the notification message even after the 90 seconds have passed.
          The crucial point is that, while the server trigger has a parent trigger in "problem" state, Zabbix doesn't update the dependent trigger at all, as per the docs page you linked:
          In all of the cases mentioned above, the dependent trigger (server) will be re-evaluated only when a new metric for it is received. This means that the dependent trigger may not be updated immediately.
          Then the switch comes back up, Zabbix looks at its action queue and finds the alert message for the server. Because it did not receive a new metric for the server in those few seconds, the server trigger is still in "problem" and the message is sent.
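The sequence described above can be sketched as a small simulation. This is a toy model of the behavior as described in this thread, not Zabbix's actual implementation: the names `simulate`, `to_seconds`, and `fmt` are made up for illustration, and the rules (state changes of the child are dropped while the parent is in problem, a delayed message is held until the parent recovers) are assumptions taken from the explanation above.

```python
def to_seconds(stamp):
    """Convert 'HH:MM:SS' to seconds."""
    h, m, s = (int(part) for part in stamp.split(":"))
    return h * 3600 + m * 60 + s


def fmt(t):
    """Convert seconds back to 'HH:MM:SS'."""
    return f"{t // 3600:02d}:{t % 3600 // 60:02d}:{t % 60:02d}"


def simulate(samples, parent_problem, parent_ok, delay=90, n=3):
    """Return the (time, message) pairs the toy model would send for the child."""
    p_start, p_end = to_seconds(parent_problem), to_seconds(parent_ok)

    def parent_down(t):
        return p_start <= t < p_end

    state, problem_at, recovered_at = "OK", None, None
    for i in range(n - 1, len(samples)):
        t = to_seconds(samples[i][0])
        expr_true = max(v for _, v in samples[i - n + 1 : i + 1]) == 0
        if parent_down(t):
            continue  # assumed: dependency active, child state change is dropped
        if state == "OK" and expr_true:
            state, problem_at = "PROBLEM", t
        elif state == "PROBLEM" and not expr_true:
            state, recovered_at = "OK", t
            break  # one problem cycle is enough for this sketch

    if problem_at is None:
        return []
    due = problem_at + delay
    # assumed: a message that comes due while the parent is down is held
    send_at = p_end if parent_down(due) else due
    log = []
    if recovered_at is None or send_at < recovered_at:
        log.append((fmt(send_at), "problem message"))
        if recovered_at is not None:
            log.append((fmt(recovered_at), "recovery message"))
    return log


stamps = ["01:53:27", "01:54:27", "01:55:27", "01:56:27", "01:57:27",
          "01:58:27", "01:59:27", "02:00:27", "02:01:27", "02:02:27",
          "02:03:28", "02:04:27", "02:05:27"]
server = list(zip(stamps, [1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]))

print(simulate(server, "01:57:36", "02:03:36"))
print(simulate(server, "01:57:36", "02:03:36", n=4))
```

In this model the problem message for the server goes out at 02:03:36, when the switch recovers, and the recovery follows at 02:04:27 with the next server value. With `n=4` the model sends nothing, because the server expression only becomes true while the switch is already down.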

          I would much prefer that Zabbix either re-evaluate the trigger right before attempting to send the message or, when the parent trigger returns to "ok", re-evaluate the corresponding child triggers immediately.

          When I remove the event suppression in the code, it actually works as I expected. But of course a lot of unnecessary events are then generated (e.g. sounds in the web UI):
          Code:
          --- a/src/zabbix_server/events/events.c    2026-02-24 22:54:38.806716541 +0100
          +++ b/src/zabbix_server/events/events.c    2026-02-24 22:54:49.358835568 +0100
          @@ -2830,9 +2830,9 @@
                   if (FAIL == (event_check_dependency(event, &deps, trigger_diff)))
                   {
                       /* reset event data/trigger changeset if dependency check failed */
          -            event->flags = ZBX_FLAGS_DB_EVENT_UNSET;
          +            //event->flags = ZBX_FLAGS_DB_EVENT_UNSET;
                       diff->flags = ZBX_FLAGS_TRIGGER_DIFF_UNSET;
          -            continue;
          +            //continue;
                   }
           
                   if (TRIGGER_VALUE_PROBLEM == event->value)
          So I will not keep it like that.

          As workarounds, I currently see two options:
          1. Increase the number of checks for the child device, as you mentioned.
          2. Increase the update interval for the parent device.
          Or are there other options?

          All the best,
          aeb
