Host status trigger with time delay?

  • mauibay
    Junior Member
    • Jan 2008
    • 23

    #1

    I'm fuzzy on why this trigger doesn't seem to work (v1.4.4):

    {Template_Windows:status.min(180)}=2

    My intent is for the trigger to only be "ON" when the status for the last 3 minutes is 2, in other words, when the host is unreachable for at least 3 minutes.

    Is the special nature of status the issue here? If this is the wrong approach, what is the recommended method to achieve the result I want?

    Thanks for any advice!
  • xs-
    Senior Member
    Zabbix Certified Specialist
    • Dec 2007
    • 393

    #2
    you want max()


    • mauibay
      Junior Member
      • Jan 2008
      • 23

      #3
      Can you explain how I would use max()?

      If the status value has been 2 for the last 180 seconds, this trigger should be true:

      {Template_Windows:status.min(180)}=2

      If I used max() wouldn't the trigger be true even if there was a status value lower than 2 in the last 180 seconds?

      I'm trying to prevent false positives by not triggering immediately. Since the status value is 2 when the host is unreachable, I thought it would make sense to only trigger if status has been nothing but 2 for three minutes.

      I originally tried using max() to do the same thing by inverting the logic:

      {Template_Windows:status.max(180)}=0

      But it seems the status value returns "nodata" when the host is up, so this doesn't work. The only time status seems to return a value is when the host is unreachable, so I inverted the logic and used min().

      If I did this:

      {Template_Windows:status.max(180)}=2

      The trigger should be true as long as the value was 2 at least once in the last 180 seconds, even if it wasn't for the entire time, which is not what I want. I only want the trigger to be "on" if the status was 2 for the entire 180 seconds.
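The difference between the two expressions can be sketched in a few lines of Python. This is only a toy model of the reasoning above, not how Zabbix evaluates triggers: the function names are hypothetical, the sample lists stand in for whatever status values were collected in the last 180 seconds, and treating an empty window as "not firing" is an assumption.

```python
# Toy model of min(180)=2 versus max(180)=2 over the samples in one window.
# Function names and sample data are hypothetical illustrations.

def status_min_trigger(window, value=2):
    """Fires only if the lowest sample in the window equals `value`."""
    # Assumption: a window with no samples does not fire.
    return bool(window) and min(window) == value

def status_max_trigger(window, value=2):
    """Fires if the highest sample in the window equals `value`."""
    return bool(window) and max(window) == value

always_unreachable = [2, 2, 2, 2, 2, 2]   # status was 2 for the whole window
briefly_unreachable = [0, 0, 2, 0, 0, 0]  # status was 2 only once

print(status_min_trigger(always_unreachable))   # fires
print(status_min_trigger(briefly_unreachable))  # does not fire
print(status_max_trigger(briefly_unreachable))  # fires on a single 2
```

With these made-up samples, the min() form fires only for the window that is 2 throughout, while the max() form also fires when 2 appears just once.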

      Am I completely missing some point here? If so, what is it? I feel like I'm overcomplicating it somehow.


      • xs-
        Senior Member
        Zabbix Certified Specialist
        • Dec 2007
        • 393

        #4
        Hmm, my bad, I didn't read your post correctly, just the host up/down part.
        I use ping for host up/down, 1 is up 0 is down.
        max(180)=0 for host down trigger, in my case.


        Your case would be min(180)=2
        This would mean that over a period of 3 minutes, the lowest value must be 2 for the trigger to be true. Keep in mind, though, that this will not detect a flapping service!
        I.e. if you ping once per 30 secs (6 pings in 3 mins), then 5 failed pings and one successful ping will not set off the trigger.
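The ping arithmetic above (one ping every 30 seconds, six pings in a 180-second window, 1 for up and 0 for down, trigger on max(180)=0) can be checked with a small Python sketch. It is an illustration under stated assumptions rather than Zabbix's actual evaluation: the helper names, the timestamps, and the choice to include the sample taken exactly at the evaluation time while excluding the one exactly 180 seconds old are all hypothetical.

```python
# Toy model of a max(180)=0 "host down" trigger on ping results.
# Samples are (timestamp_in_seconds, value) pairs; 1 = up, 0 = down.

def values_in_window(samples, now, period):
    """Values with a timestamp in the last `period` seconds: (now - period, now]."""
    return [value for ts, value in samples if now - period < ts <= now]

def host_down_trigger(samples, now, period=180):
    """Fires only if every ping in the window failed (highest value is 0)."""
    window = values_in_window(samples, now, period)
    # Assumption: an empty window does not fire.
    return bool(window) and max(window) == 0

# One ping every 30 seconds, six pings in the 180-second window.
all_failed = [(t, 0) for t in range(30, 181, 30)]
five_failed_one_ok = [(30, 0), (60, 0), (90, 1), (120, 0), (150, 0), (180, 0)]

print(host_down_trigger(all_failed, now=180))          # fires
print(host_down_trigger(five_failed_one_ok, now=180))  # one success blocks it
```

A single successful ping anywhere in the window raises the maximum to 1, so the trigger stays off; that is the flapping case described above.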


        • mauibay
          Junior Member
          • Jan 2008
          • 23

          #5
          Thanks, I was thinking I was missing something obvious!

          I know this won't detect a flapping trigger, but that's really the whole point in this case. Since the host status isn't actually a direct test of any service it's exactly what I want. I have many other triggers for various services in the host with lower severity, and I only want the high severity status trigger to fire if it's a sure thing.

          Basically, in the past year I've seen network issues make some network service triggers flap and cause the host status trigger to fire for a single test when in fact the host was never unreachable. I recently started adding some smartphone recipients to receive actions and am working on bulletproofing the disaster-severity triggers.

          The status value for unreachable hosts is the only one that's given me any issues, probably because it's different from every other value I use, being a "special" generated value and not an actual test result.

          I need to learn more about that. I think it's the key to understanding what I'm seeing now, which is that my original trigger using min(180)=2 now seems to be working. I had no data for two days and my attempts to force a trigger change were ineffective. I wish I knew of a way to force a "trigger reset" of some kind, since it might have prevented my confusion in thinking the trigger did not work. I'm not totally convinced it is working, but after a state change to 2 and then back to nodata, the trigger displays properly in the overview and then functions as I expected.

          Today I had 3 hosts go unreachable for a short period and then come back online, and now those 3 triggers display green in the overview and the action worked when one of them went unreachable for a second time. So I'm thinking now that I was wrong and my trigger is defined correctly, but that the special nature of the host status value causes confusion until the next actual state change.

          Thanks for your reply though, it made me rethink some assumptions so I could figure out what I need to understand better.
