Retry check interval

  • ralf
    Junior Member
    • Jun 2009
    • 8

    #1

    Retry check interval

    Is there such a concept in Zabbix as a 're-check interval'? What I mean is this: you set up a Zabbix check to run once every 10 minutes. You don't want Zabbix to trigger a notification on the very first failed check. Instead, you want Zabbix to re-run the check a set number of times at a much shorter interval than the default 10 minutes; let's say three retries, one every 60 seconds. If the service still returns a non-OK status after the three re-checks, then by all means trigger the notification alarms. The way checks work right now, if I understand them correctly, is that when a check returns a non-OK status, Zabbix waits the full default interval before running the check again. I've used the .count(), .max(), and .avg() functions, but all of them seem to depend on the length of the default interval.

    Looking forward to any replies that offer any help on this.

    Thanks;

    Al.
  • zabbix_zen
    Senior Member
    • Jul 2009
    • 426

    #2
    So far, the refresh interval for collected values is static. Thus, if a non-OK value is returned, you have to wait another X seconds to recheck it.

    "You don't want zabbix to trigger a notification on the very first failed check"
    You can do this already. For example, if you want to be alerted when a metric is below 98 for 2 consecutive periods:
    {HOST_NAME:ITEM_NAME.last(0)}<98 & {HOST_NAME:ITEM_NAME.prev(0)}<98

    or use variations to fit your needs like:
    last(#N)
    sum(#N)
    sum(num)
    avg(num)
    ...
    where #N refers to collected values (for last(), the Nth most recent value; for sum(), the last N values) and num is a number of seconds in the past. That way you'd be alerted not on the first occurrence but over a period you define.
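
As a rough illustration of what the "two consecutive periods" condition evaluates, here is a minimal Python sketch. It is not Zabbix code: the function name `fires` and the sample values are made up for the example, and it only mirrors the idea of combining the latest value with the previous one.

```python
from typing import Sequence


def fires(values: Sequence[float], threshold: float = 98.0) -> bool:
    """Return True when the two most recent values are both below threshold.

    Mirrors "last value < threshold AND previous value < threshold".
    `values` is ordered oldest to newest (hypothetical sample data).
    """
    if len(values) < 2:
        return False  # not enough history to compare two periods
    last, prev = values[-1], values[-2]
    return last < threshold and prev < threshold


print(fires([99, 97, 96]))  # two low values in a row
print(fires([97, 99, 96]))  # a single low value after a good one
```

With a 10-minute item interval, the second low value only arrives 10 minutes after the first, so the alert is delayed by a full interval.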

    • ralf
      Junior Member
      • Jun 2009
      • 8

      #3
      zabbix_zen,

      Thank you for your prompt reply. I am familiar with the functions you reference. However, they do not address the issue I raised. I already know how to avoid triggering notifications on the first failed check. The problem with the approach you mention, which is probably the only one available, is that you are forced to wait until the next scheduled check to get another reading of the metric or service you're monitoring. If Zabbix happened to check the service during a moment of network latency while the service was actually up, it would have gotten a non-OK status. With the check interval set to, say, 10 minutes, you would have to wait another 10 minutes to find out whether the service was really in trouble. Whereas if you could give Zabbix an alternate check interval, say every minute, and a maximum number of rechecks, you wouldn't have to wait 10 minutes to recheck the service. That way you would probably get a more accurate view of how the service is behaving.

      Thank you again;

      Al.
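
The behaviour requested in this post can be sketched in a few lines of Python. This is only a model of the idea, not a Zabbix feature or API: the function `check_with_retries`, its parameters, and the `check` callback are all hypothetical names chosen for the illustration.

```python
from typing import Callable, List, Tuple


def check_with_retries(
    check: Callable[[int], bool],
    start: int = 0,
    retries: int = 3,
    retry_interval: int = 60,
) -> Tuple[bool, List[int]]:
    """Run one scheduled check at `start`; on failure, re-check up to
    `retries` times, `retry_interval` seconds apart.

    `check(t)` returns True when the service is OK at time t (in seconds).
    Returns (alert, times_checked).
    """
    times = [start]
    if check(start):
        return False, times  # scheduled check passed, nothing to do
    for attempt in range(1, retries + 1):
        t = start + attempt * retry_interval
        times.append(t)
        if check(t):
            return False, times  # recovered: treat as a transient failure
    return True, times  # every re-check failed: raise the alert


# A blip that only affects the scheduled check at t=0.
print(check_with_retries(lambda t: t != 0))
# A service that stays down through all re-checks.
print(check_with_retries(lambda t: False))
```

In this model a transient failure is cleared after 60 seconds instead of after the full 10-minute interval, and a real outage is confirmed within 3 minutes.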

      • zabbix_zen
        Senior Member
        • Jul 2009
        • 426

        #4
        AFAIK that's the standard and only way.

        If you think it's an important improvement for Zabbix,
        first search here to see whether anyone has already opened something similar,
        and open a new request if that's not the case.

        • mattsmith
          Member
          Zabbix Certified Specialist
          • Aug 2010
          • 33

          #5
          Hi All,

          Did this ever get resolved? I would like to see the same feature.

          Matt

          • nima0102
            Senior Member
            • May 2010
            • 106

            #6
            Hi
            I think this feature is useful.
            I would be happy to see it added in a future release.

            Thanks in advance

            • walterheck
              Senior Member
              • Jul 2009
              • 153

              #7
              Mentioning it in the forums is not a proper way of getting this added at any point in the future. Please add it to the support tracker and post a link here; that will be much more effective.
              Free and Open Source Zabbix Templates Repository | Hosted Zabbix @ Tribily (http://tribily.com)

              • nima0102
                Senior Member
                • May 2010
                • 106

                #8
                Dear ralf,
                Since you started this topic, it would be best if you created the corresponding request in the support tracker yourself.

                Thanks in advance

                • f.koch
                  Member
                  Zabbix Certified Specialist
                  • Feb 2010
                  • 85

                  #9
                  I am not the author of this thread, but here is the link



                  regards flo

                  • vienna
                    Junior Member
                    • Mar 2013
                    • 7

                    #10
                    Hi all,

                    Since the support tracker item is unresolved and shows no activity, does anybody know whether anything is happening with this request?

                    We really want to recheck immediately after an error to be sure we are reporting a real problem.

                    Kind regards
                    Last edited by vienna; 25-03-2013, 10:49.

                    • daclarke
                      Junior Member
                      • Nov 2014
                      • 1

                      #11
                      I stumbled upon this thread while looking for an answer to the same question everyone else was asking.

                      I've created a workaround that others might find useful. In the Zabbix MySQL database, we added a MySQL trigger on the Zabbix "triggers" table: when the "value" column changes, it updates the corresponding row in the "items" table and changes its delay value. The link between "triggers" and "items" goes through the "functions" table. We also created a new table to store two delay values per item: an active delay and the default delay.

                      Hope this helps!
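
The post does not include the actual trigger, so the following is only a sketch of the mechanism under stated assumptions. It uses Python's built-in `sqlite3` with an in-memory database instead of MySQL, the table definitions are heavily simplified stand-ins for the real Zabbix schema, and the `item_delays` table, the `adjust_delay` trigger name, and the assumption that a trigger value of 1 means "problem" and 0 means "OK" are all illustrative choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins for the Zabbix tables named in the post.
CREATE TABLE items     (itemid INTEGER PRIMARY KEY, delay INTEGER);
CREATE TABLE triggers  (triggerid INTEGER PRIMARY KEY, value INTEGER);
CREATE TABLE functions (functionid INTEGER PRIMARY KEY,
                        itemid INTEGER, triggerid INTEGER);
-- Hypothetical extra table holding both delays per item.
CREATE TABLE item_delays (itemid INTEGER PRIMARY KEY,
                          default_delay INTEGER, active_delay INTEGER);

-- When a trigger's value changes, switch the delay of every linked item.
CREATE TRIGGER adjust_delay AFTER UPDATE OF value ON triggers
WHEN NEW.value <> OLD.value
BEGIN
    UPDATE items
    SET delay = (SELECT CASE WHEN NEW.value = 1 THEN d.active_delay
                             ELSE d.default_delay END
                 FROM item_delays d WHERE d.itemid = items.itemid)
    WHERE itemid IN (SELECT f.itemid FROM functions f
                     WHERE f.triggerid = NEW.triggerid)
      AND itemid IN (SELECT itemid FROM item_delays);
END;

-- Sample data: one item checked every 600 s, re-checked every 60 s on failure.
INSERT INTO items VALUES (1, 600);
INSERT INTO triggers VALUES (10, 0);
INSERT INTO functions VALUES (100, 1, 10);
INSERT INTO item_delays VALUES (1, 600, 60);
""")


def item_delay(itemid: int) -> int:
    """Read the current delay of an item from the simplified items table."""
    row = conn.execute(
        "SELECT delay FROM items WHERE itemid = ?", (itemid,)
    ).fetchone()
    return row[0]


conn.execute("UPDATE triggers SET value = 1 WHERE triggerid = 10")  # problem
delay_in_problem = item_delay(1)
conn.execute("UPDATE triggers SET value = 0 WHERE triggerid = 10")  # recovered
delay_after_recovery = item_delay(1)
print(delay_in_problem, delay_after_recovery)
```

A MySQL version would follow the same shape (an AFTER UPDATE trigger on "triggers" joining through "functions" to "items"), but its exact syntax and the real column types are not given in the post.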
