Stopping false alarms for simple check

  • predatorz
    Senior Member
    • Mar 2007
    • 109

    #1

    Stopping false alarms for simple check

    Hi,

    I am using the following expressions for simple checks:
    {Template:icmpping.sum(300)}=0
    {Template:http_perf.sum(300)}=0
    {Template:smtp_perf.sum(300)}=0
    and similar ones for the other simple checks (ssh_perf, tcp, http).

    I have tried many other expressions, hoping to stop the false alarms, but to no avail.
    Ping interval = 60s
    {Template:icmpping.max(#5)}#1
    {Template:icmpping.max(#5)}<1 | {Template:icmpping.max(#5)}>1
    and others.

    The trigger works well for notifying us by email when a service goes down (it becomes true when the data collected over 300s sums to 0). However, when Zabbix gets back a value of 2 (a timeout from the check), it sets the trigger back to false even though the service is still down, and emails us again. After 5 minutes of 0 values, the trigger fires and we are emailed once more. The cycle repeats over and over, producing a lot of false alerts and emails.
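
To make the flapping described above concrete, here is a minimal Python sketch of a "sum of the last N values = 0" rule. It is only a simulation under stated assumptions, not Zabbix code: the function name `sum_trigger_states`, the 5-sample window (standing in for 300s at a 60s interval) and the sample readings are all made up for illustration.

```python
from collections import deque

def sum_trigger_states(samples, window=5):
    """Return the trigger state after each sample for a 'sum of last N = 0' rule."""
    recent = deque(maxlen=window)
    states = []
    for value in samples:
        recent.append(value)
        # The rule is true only once the window is full and every value in it is 0.
        states.append(len(recent) == window and sum(recent) == 0)
    return states

# Hypothetical readings, one per 60 s: 1 = up, 0 = down, 2 = timeout.
samples = [1, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0]
states = sum_trigger_states(samples)

# Every change of state would correspond to one notification email.
flips = sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)
print(states)
print("state changes:", flips)
```

A single 2 in the middle of an outage pushes the sum above 0, so the rule goes false and needs five more zeros before it is true again. That is why one outage can generate several emails.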

    Is there any way I can change the trigger expression to stop the false alerts, or at least keep them to a minimum? How does Zabbix determine a timeout? Is there anywhere in Zabbix to modify it?

    Hope to find a solution.
  • predatorz
    Senior Member
    • Mar 2007
    • 109

    #2
    Can anyone help?
    My brain is drained.

    • predatorz
      Senior Member
      • Mar 2007
      • 109

      #3
      Bump.. Help

      • predatorz
        Senior Member
        • Mar 2007
        • 109

        #4
        Any hints/help???

        • predatorz
          Senior Member
          • Mar 2007
          • 109

          #5
          Does anyone know whether simple check functions like pop_perf, smtp_perf, http_perf, ssh_perf and others return a value of 2 on timeout?

          • Alexei
            Founder, CEO
            Zabbix Certified Trainer
            Zabbix Certified Specialist
            Zabbix Certified Professional
            • Sep 2004
            • 5654

            #6
            It seems it is currently not possible to create a trigger that would be true if all values are either 0 or 2 for a period of time. A new function, valcount() or count(period,value), will be introduced soon (in 1.3.x, pre 1.4) to address this issue.
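
As a rough illustration of why counting values would help, the Python sketch below treats the service as down when a window contains no successful reading (1), so both 0 and 2 count as failures. The names `count_value` and `down_states`, the 5-sample window and the sample data are hypothetical. In trigger terms this might correspond to something like {Template:icmpping.count(300,1)}=0, but that exact syntax is an assumption and is not confirmed anywhere in this thread.

```python
def count_value(window, value):
    """Count how many samples in the window equal the given value."""
    return sum(1 for sample in window if sample == value)

def down_states(samples, window=5):
    """Return, after each sample, whether the last `window` samples hold no success (1)."""
    states = []
    for end in range(1, len(samples) + 1):
        recent = samples[max(0, end - window):end]
        # Only decide once a full window of readings is available.
        full = len(recent) == window
        states.append(full and count_value(recent, 1) == 0)
    return states

# Hypothetical readings, one per 60 s: 1 = up, 0 = down, 2 = timeout.
samples = [1, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0]
states = down_states(samples)
print(states)
```

With this rule a timeout value of 2 in the middle of an outage does not reset the state, because the rule only asks whether any success was seen in the window.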
            Alexei Vladishev
            Creator of Zabbix, Product manager
            New York | Tokyo | Riga
            My Twitter

            • predatorz
              Senior Member
              • Mar 2007
              • 109

              #7
              Thanks for the information, but does the new function help to reduce false alerts?
              Last edited by predatorz; 16-04-2007, 12:11.

              • predatorz
                Senior Member
                • Mar 2007
                • 109

                #8
                Using {template:icmpping.sum(#5)}=0 and similar expressions for the other simple checks, the problem is that once the trigger has fired, even a single value of 2 (timeout) sets it back to false although the service is still down. Are there any modifications I can make to the expression to reduce the false alerts?

                • predatorz
                  Senior Member
                  • Mar 2007
                  • 109

                  #9
                  Does anyone care to shed some light? Help...

                  Do those perf items (pop, smtp, http and others) return a value of 2 on timeout?

                  • predatorz
                    Senior Member
                    • Mar 2007
                    • 109

                    #10
                    Will it work if I write a trigger that checks for the host being up and make it depend on the trigger that checks whether the host has gone down?

                    I doubt it will work. LOL. Most probably, the first trigger would not be able to reset back to false if I write it like that.

                    True?

                    • predatorz
                      Senior Member
                      • Mar 2007
                      • 109

                      #11
                      Originally posted by predatorz
                      Will it work if I write a trigger that checks for the host being up and make it depend on the trigger that checks whether the host has gone down?

                      I doubt it will work. LOL. Most probably, the first trigger would not be able to reset back to false if I write it like that.

                      True?
                      Am I correct in this?
                      It seems that for those *_perf items, if the item times out, the value returned is >5. Is this correct?

                      • Alexei
                        Founder, CEO
                        Zabbix Certified Trainer
                        Zabbix Certified Specialist
                        Zabbix Certified Professional
                        • Sep 2004
                        • 5654

                        #12
                        FYI, the new function, count(period<,value>), has been implemented. It is already in the latest code. It works with items of any type.
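
For readers unsure what an optional second parameter means in count(period<,value>), here is a small Python sketch of the idea: count everything received in the last `period` seconds, or only the readings equal to `value` when one is given. This is not the Zabbix implementation. The function signature, the `(timestamp, value)` history format and the inclusive/exclusive window boundaries are assumptions made for illustration.

```python
def count(history, now, period, value=None):
    """Count readings from the last `period` seconds, optionally only those equal to `value`."""
    # Assumed window: strictly newer than (now - period), up to and including now.
    recent = [v for ts, v in history if now - period < ts <= now]
    if value is None:
        return len(recent)
    return sum(1 for v in recent if v == value)

# Hypothetical numeric history as (timestamp in seconds, value) pairs.
history = [(0, 1), (60, 0), (120, 2), (180, 0), (240, 0), (300, 2)]
total_recent = count(history, now=300, period=300)
timeouts = count(history, now=300, period=300, value=2)

# The same helper works for non-numeric values, e.g. text items.
text_history = [(0, "OK"), (60, "FAIL"), (120, "FAIL")]
failures = count(text_history, now=120, period=300, value="FAIL")

print(total_recent, timeouts, failures)
```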

                        • predatorz
                          Senior Member
                          • Mar 2007
                          • 109

                          #13
                          Yup, saw that in the latest manual.

                          • Alexei
                            Founder, CEO
                            Zabbix Certified Trainer
                            Zabbix Certified Specialist
                            Zabbix Certified Professional
                            • Sep 2004
                            • 5654

                            #14
                            Originally posted by predatorz
                            Yup, saw that in the latest manual.
                            I don't think so. The latest manual is on my laptop. Function 'count' has existed since 1.0; now we have just added a second parameter.

                            • rodneyalamarboys
                              Junior Member
                              • Apr 2009
                              • 1

                              #15
                              hello

                              I have the same problem on my LAN because of the timeout. I don't know how to resolve this false alerts problem.
