Trigger alert after 10 consecutive values other than "ok"

  • Flagathor
    Junior Member
    • Jan 2024
    • 8

    #1


    Hi,

    I want to create a trigger that raises an alert when the value is something other than "ok" 10 times in a row.

    So I created this:
    [check_aliveness].count(#10,ok,regexp)}=0
    or this:
    [check_aliveness].regexp(ok,#12)}=0

    It works except in one case: when the 1st value is not "ok".
    Example:
    2024-01-19 09:43:46 ok
    2024-01-19 09:40:52 ok
    2024-01-19 09:37:29 ok
    2024-01-19 09:35:06 ok
    2024-01-19 09:32:01 failed
    It's as if Zabbix treats what came before the 1st value as infinite, so more than 10, and raises an alert.
    I don't want this false alert.

    Do you have any idea how to help me, please?

  • Flagathor
    Junior Member
    • Jan 2024
    • 8

    #2
    Bumping this thread.


    • cyber
      Senior Member
      Zabbix Certified Specialist, Zabbix Certified Professional
      • Dec 2006
      • 4807

      #3
      Which version of Zabbix? Judging by the trigger format, something older?

      count (/host/key,(sec|#num)<:time shift>,<operator>,<pattern>)
      count(/host/check_aliveness,#10,"like","ok")=0
      That is: among the last 10 checks, the count of values containing "ok" is 0.


      • Flagathor
        Junior Member
        • Jan 2024
        • 8

        #4
        I use Zabbix 3.4

        and my expression looks like this:
        {Template RabbitMQ:rabbitmq[check_aliveness].count(#10,ok,regexp)}=0

        I followed the instructions given on this website: 1 Supported trigger functions (zabbix.com)
        count (sec|#num,<pattern>,<operator>,<time_shift>)

        I tried "like" instead of "regexp", but the result is the same.

        When the 1st value, and only the 1st one, does not match the pattern, Zabbix seems to treat what came before it as infinite, so more than 10, and raises an alert.
        As a result I get a large number of false alerts.
        It's really weird.


        • cyber
          Senior Member
          Zabbix Certified Specialist, Zabbix Certified Professional
          • Dec 2006
          • 4807

          #5
          Data handling may well have changed over time... 3.4 is ancient... You should consider using something newer...


          • ISiroshtan
            Senior Member
            • Nov 2019
            • 324

            #6
            Hey there.

            You are just doing the evaluation with reversed logic. Instead of counting how many "problem" messages there are, you are counting how many "ok" messages there are, and raising an alert if there are none.
            Now, if you have only 1 value, and that value is "problem", how many "ok" values do you have? None. 0. And if you have 0 "ok" values, what did you ask Zabbix to do? To fire the trigger. And it does just that: it fires the trigger.
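
The reasoning above can be checked with a small simulation. This is only a sketch in Python, not Zabbix code: `count_matching` and `trigger_fires` are hypothetical helpers, and they assume that `#10` means "at most the last 10 values", so fewer are examined when fewer exist.

```python
def count_matching(history, n, pattern):
    """Count values containing `pattern` among the last `n` values.

    Hypothetical stand-in for count(#n,pattern,like). Assumption: when
    fewer than n values exist, only the available ones are examined.
    """
    window = history[-n:]
    return sum(1 for value in window if pattern in value)


def trigger_fires(history):
    # Mirrors the expression count(#10,ok,like)=0
    return count_matching(history, 10, "ok") == 0


# A single "failed" value: zero "ok" values, so the expression is true.
print(trigger_fires(["failed"]))

# Once at least one "ok" is inside the window, the expression is false.
print(trigger_fires(["failed", "ok", "ok", "ok", "ok"]))
```

Under that assumption the first call reports a firing trigger and the second does not, which matches the false alert described in this thread.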


            • Flagathor
              Junior Member
              • Jan 2024
              • 8

              #7
              Yes, OK, but how would you write it with the count function?

              I want to raise an alert only if "ok" has been absent 10 times in a row.


              • ISiroshtan
                Senior Member
                • Nov 2019
                • 324

                #8
                I honestly don't have a lab Zabbix within reach at the moment, so I can't really test anything (let alone on Zabbix 3.*). But off the top of my head, you can try one of the following:

                1. Actually count the "not ok" messages. Like:

                [check_aliveness].count(#10,ok,ne)}=10
                2. Add a check to ensure there are 10 values in history before evaluating (the period depends on your collection interval; if data is collected every minute, set it to a little over 10m), something like:

                [check_aliveness].count(#10,ok,regexp)}=0 and [check_aliveness].count(11m)>=10
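
To compare the two suggestions with the original expression, here is a rough Python sketch, not Zabbix code. All function names are hypothetical. It assumes `#10` uses at most the last 10 values, that `ne` is an exact-mismatch test, and that one value arrives per minute, so `count(11m)` is approximated by the number of values among the last 11.

```python
def count_last(history, n, pattern, operator):
    """Hypothetical model of count(#n,pattern,operator) over a list of strings."""
    window = history[-n:]  # fewer than n values: use what is available
    if operator == "ne":
        return sum(1 for v in window if v != pattern)
    if operator == "like":
        return sum(1 for v in window if pattern in v)
    raise ValueError(f"unsupported operator: {operator}")


def original(history):
    # count(#10,ok,like)=0
    return count_last(history, 10, "ok", "like") == 0


def option_1(history):
    # count(#10,ok,ne)=10
    return count_last(history, 10, "ok", "ne") == 10


def option_2(history):
    # count(#10,ok,like)=0 and count(11m)>=10, with 11m approximated as 11 values
    return original(history) and len(history[-11:]) >= 10


def first_firing(expression, values):
    """Return the 1-based check number at which `expression` first fires, or None."""
    history = []
    for check_number, value in enumerate(values, start=1):
        history.append(value)
        if expression(history):
            return check_number
    return None


stream = ["failed"] * 12  # a new item whose very first value is already "failed"
for name, expr in [("original", original), ("option 1", option_1), ("option 2", option_2)]:
    print(name, first_firing(expr, stream))
```

In this model the original expression fires on the first check, while both alternatives wait until 10 non-"ok" values have accumulated. Real Zabbix behaviour may differ in details such as how the time window is evaluated.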



                • Flagathor
                  Junior Member
                  • Jan 2024
                  • 8

                  #9
                  [check_aliveness].count(#10,ok,ne)}=10 --> I have already tested it. It changes nothing about my initial issue with the 1st value: it considers that infinity is more than 10, so it raises an alert even if there is only 1 "failed".
                  and [check_aliveness].count(11m)>=10 --> I will test with this condition and let you know.



                  • ISiroshtan
                    ISiroshtan commented
                    For [check_aliveness].count(#10,ok,ne)}=10:
                    Are you sure you set "ne" as the function operator and =10 as the result of the evaluation?

                    Also, are you sure you gave Zabbix enough time to update the configuration cache before you started sending data?

                  • Flagathor
                    Flagathor commented
                    I will try again ;-)

                  • Flagathor
                    Flagathor commented
                    So it considers the 1st value to be the 10th and raises an alert. After that, it works as long as there has been an "ok".
                    But if the 1st value is not "ok", Zabbix raises an alert.
                • ISiroshtan
                  Senior Member
                  • Nov 2019
                  • 324

                  #10


                  Eeeeehm... no. When you say count(#10), it takes the last 10 values, if there are 10 values. If there are fewer, it should just take as many as there are.
                  Next it applies a filter to those values based on your rules. If we say count(#10,ok,ne), it should compare each of the 10 values (or however many there are) against our pattern (ok) and return how many of them are not an exact match (ne). If we say =10, it cannot fire because it treats a single value as infinity or as the 10th or whatever, since we ask it to fire only when the whole count operation returns exactly 10.

                  Can you add screenshots of your configuration? Can you also check the value of CacheUpdateFrequency in your zabbix_server.conf?


                  • Flagathor
                    Junior Member
                    • Jan 2024
                    • 8

                    #11
                    I tested [check_aliveness].count(#10,ok,ne)}=10 for 2 days and it works.
                    Thanks a lot ;-)
