Availability trigger for Active Checks

  • safl
    Senior Member
    • May 2005
    • 103

    #1

    Availability trigger for Active Checks

    Hello!

    I have all my hosts running with active checks.
    Problem is, with active checks the item 'status' is not supported. And pinging the host is not an option either.
    So do you people have any idea how to monitor server availability with active checks?
    I've tried adding a trigger with the 'nodata' expression, but the trigger stops evaluating as soon as the data stops coming in, so it simply switches to UNKNOWN instead of "going off".

    All help is highly appreciated!

    regards,

    Simon
  • James Wells
    Senior Member
    • Jun 2005
    • 664

    #2
    Greetings,

    When dealing with active checks only, you create various triggers that alert when values don't change when they should, or when there are no updates when there should be. For example, assume you wanted to alert if an active check server failed to send in its data within a specified time frame.

    First, you would create a user-defined check, say:
    Code:
    UserParameter=active[timestamp],/bin/date +%s
    Now, the returned value will change every second, so if you set your delay to, say, 30, this value should never be the same across two passes. From there, you would create a trigger similar to the following:
    Code:
    ({_Template_Linux_SVR:active[timestamp].last(0)}={_Template_Linux_SVR:active[timestamp].prev(0)})|{_Template_Linux_SVR:active[timestamp].nodata(35)}
    What this means is that if the current returned value is the same as the previously returned value, then something is wrong. Additionally, if there is no recent data, it will alert as well. You want to be careful here that your active check periodicity is shorter than your server check or else you will get a lot of false positives.
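To make the logic of that trigger concrete, here is a minimal Python sketch of the two conditions it combines. It is only a model of the reasoning described above, not Zabbix code: the function name, the sample layout, and the way the 35-second window is compared are all assumptions made for illustration.

```python
from typing import Sequence, Tuple

# Each sample is (time the server received the value, timestamp the agent reported).
# This layout is a hypothetical stand-in for the item's history.
Sample = Tuple[float, int]


def availability_problem(samples: Sequence[Sample], now: float,
                         nodata_window: float = 35.0) -> bool:
    """Return True when the trigger described above would signal a problem."""
    if not samples:
        # Nothing was ever received, so there is certainly no recent data.
        return True

    last_received, last_value = samples[-1]

    # Models nodata(35): no value arrived within the last `nodata_window` seconds.
    no_recent_data = (now - last_received) > nodata_window

    # Models last(0) = prev(0): the two most recent values are identical,
    # which should never happen for a timestamp sampled every 30 seconds.
    stale_value = len(samples) >= 2 and samples[-2][1] == last_value

    return stale_value or no_recent_data
```

With a delay of 30 seconds and a window of 35 seconds, a single late update is enough to raise the alert, which is why the check interval has to stay shorter than the window.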
    Unofficial Zabbix Developer

    • safl
      Senior Member
      • May 2005
      • 103

      #3
      I thank you very much for your suggestion!

      My big problem was that active checks weren't evaluated when Zabbix wasn't receiving any data. So no matter how I checked it, it would never trigger.
      But in the latest alpha12 this has changed.

      I now use a nodata(300) check on a CPU item. Then I do not need to add anything to the agent's conf file, and since all hosts have a CPU, the check will always be usable, regardless of platform.

      If anybody has come up with better ways to check availability of hosts that only utilize active checks, please share your methods!

      • James Wells
        Senior Member
        • Jun 2005
        • 664

        #4
        Originally posted by safl
        I now use a nodata(300) check on a CPU item. Then I do not need to add anything to the agent's conf file, and since all hosts have a CPU, the check will always be usable, regardless of platform.
        LOL!!! Tell that to my zabbix_agentd. For some strange reason none of my systems are returning CPU data... All of them return ZBX_NOTSUPPORTED. LOL!!! I am beginning to get a complex.
        Unofficial Zabbix Developer

        • safl
          Senior Member
          • May 2005
          • 103

          #5
          Hehe, well, on Linux cpu_util is not supported. Maybe you are using the wrong item?

          • bbrendon
            Senior Member
            • Sep 2005
            • 870

            #6
            Can we merge this thread with the following thread?
            Unofficial Zabbix Expert
            Blog, Corporate Site

            • Alexei
              Founder, CEO
              Zabbix Certified Trainer
              Zabbix Certified Specialist, Zabbix Certified Professional
              • Sep 2004
              • 5654

              #7
              Combination of agent.ping and nodata() should work perfectly for all ZABBIX agents regardless of underlying platform.
              Alexei Vladishev
              Creator of Zabbix, Product manager
              New York | Tokyo | Riga
              My Twitter

              • bbrendon
                Senior Member
                • Sep 2005
                • 870

                #8
                The problem I have is that if there are a lot of events being grabbed from the Windows event log, the agent doesn't do ANYTHING until it's done downloading the event logs. This could take 5, 10, 15 or 60 minutes. During that time, the server gets marked as DOWN in Zabbix. [This happens with other items as well, but mostly the event log.]

                Is there a solution to this?

                It would be GREAT if I could determine the availability of all the enabled items relating to a server. Is this somehow possible? I can't think of a way.
                Alexei- I don't see how that solves the above described problem. Combining nodata with agent.ping seems to do the same as combining nodata with the agent grabbing an item value. Is there a way to do something like server.<all items>.nodata?
                Unofficial Zabbix Expert
                Blog, Corporate Site

                • Alexei
                  Founder, CEO
                  Zabbix Certified Trainer
                  Zabbix Certified Specialist, Zabbix Certified Professional
                  • Sep 2004
                  • 5654

                  #9
                  Originally posted by infinity005
                  Alexei- I don't see how that solves the above described problem. Combining nodata with agent.ping seems to do the same as combining nodata with the agent grabbing an item value. Is there a way to do something like server.<all items>.nodata?
                  Why? You have items refreshed every 5 seconds, some items are refreshed every hour. How would you define the nodata() for ALL items?

                  My suggestion is very simple: use agent.ping on all systems with a reasonable refresh rate (say, 30 seconds), and define a trigger which would fire alerts if there is no data coming from agent.ping within 2 minutes.

                  Simple and efficient.
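As a rough illustration of this suggestion, the Python sketch below models what a nodata check over agent.ping decides. It is a simulation under stated assumptions, not Zabbix's implementation: the function names are hypothetical, and the 30-second refresh and 120-second window are simply the figures quoted in the post. Modeled on the nodata(35) expression from post #2, the real trigger would presumably apply nodata(120) to the agent.ping item.

```python
from typing import Optional


def nodata(last_received: Optional[float], now: float, window: float) -> int:
    """Return 1 if no value arrived within `window` seconds before `now`, else 0.

    `last_received` is the time the server last stored a value for the item,
    or None if it never received one.
    """
    if last_received is None:
        return 1
    return 1 if (now - last_received) > window else 0


def host_unavailable(last_ping_received: Optional[float], now: float,
                     window: float = 120.0) -> bool:
    """Treat the host as unavailable when agent.ping has been silent for `window` seconds."""
    return nodata(last_ping_received, now, window) == 1
```

With a 30-second refresh and a 2-minute window, roughly three consecutive updates can be lost before the alert fires, which leaves some slack for short network hiccups.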
                  Alexei Vladishev
                  Creator of Zabbix, Product manager
                  New York | Tokyo | Riga
                  My Twitter

                  • bbrendon
                    Senior Member
                    • Sep 2005
                    • 870

                    #10
                    I completely agree except what if there are MANY MANY events in the eventlog?

                    Zabbix will spend forever and ever trying to download the events and will do nothing else except download events. By that time, it has been 30 minutes and the agent.ping item hasn't run, and thus the server generates a false positive action of being down.
                    Unofficial Zabbix Expert
                    Blog, Corporate Site

                    • Alexei
                      Founder, CEO
                      Zabbix Certified Trainer
                      Zabbix Certified Specialist, Zabbix Certified Professional
                      • Sep 2004
                      • 5654

                      #11
                      Originally posted by infinity005
                      I completely agree except what if there are MANY MANY events in the eventlog?

                      Zabbix will spend forever and ever trying to download the events and will do nothing else except download events. By that time, it has been 30 minutes and the agent.ping item hasn't run, and thus the server generates a false positive action of being down.
                      What events are you talking about?! Calculation of nodata() is very efficient and it does not use events!
                      Alexei Vladishev
                      Creator of Zabbix, Product manager
                      New York | Tokyo | Riga
                      My Twitter

                      • bbrendon
                        Senior Member
                        • Sep 2005
                        • 870

                        #12
                        When using a Windows agent, there is an eventlog key (e.g. eventlog[Application]) that collects the event logs from the Windows server and dumps them into the Zabbix database.

                        Utilizing this feature will cause the Zabbix agent to appear to hang. Does that make it clear?

                        On another note...
                        I was actually hoping other people would chime in with their experience. I have also seen in the past where agents reporting to the zabbix server from the internet will sometimes not report on the item I have assigned to associate with nodata for determining the host availability. Other items do populate, but one or two don't for a short time. This is strange behavior, which I've never quite understood, but would be eliminated if there was a way to do nodata for all items associated with a host.

                        Solutions that would solve all mentioned quirks:
                        • Don't monitor hosts over the internet & don't use eventlog key - not happening
                        • Write a daemon (perl?) that creates/simulates a nodata function that applies to all items on a host. This daemon would run at the database level.
                        • Alexei gets creative
                        • Other suggestions...
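The second idea in the list above, a nodata check that spans every item on a host, could be sketched roughly as follows. This is a hypothetical illustration only: the `ItemState` structure, the `grace_factor` parameter, and the rule of scaling each item's threshold by its own refresh interval are assumptions, and nothing here reflects the actual Zabbix database schema.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ItemState:
    key: str           # item key, e.g. "agent.ping"
    delay: float       # configured refresh interval in seconds
    last_clock: float  # unix time of the most recent stored value


def item_is_stale(item: ItemState, now: float, grace_factor: float = 3.0) -> bool:
    """An item is stale once it has missed about `grace_factor` refresh intervals."""
    return (now - item.last_clock) > grace_factor * item.delay


def host_has_no_data(items: Iterable[ItemState], now: float,
                     grace_factor: float = 3.0) -> bool:
    """Report the host as silent only when every one of its items is stale."""
    items = list(items)
    # A host without items gives no evidence either way, so it is not flagged.
    return bool(items) and all(item_is_stale(i, now, grace_factor) for i in items)
```

Scaling the threshold by each item's own delay is one way to handle items that refresh at very different rates, and requiring all items to be stale means a single delayed item would not mark the host as down.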


                        Does anyone else experience this stuff?!?!
                        Unofficial Zabbix Expert
                        Blog, Corporate Site

                        • Alexei
                          Founder, CEO
                          Zabbix Certified Trainer
                          Zabbix Certified Specialist, Zabbix Certified Professional
                          • Sep 2004
                          • 5654

                          #13
                          Originally posted by infinity005
                          Alexei gets creative
                          I seem to be very creative when it comes to avoiding fixing stuff.

                          Seriously, if the agent hangs for whatever reason, the nodata() function will tell you about this almost immediately. Calculation of nodata()-related triggers does not depend on availability of the agent.
                          Alexei Vladishev
                          Creator of Zabbix, Product manager
                          New York | Tokyo | Riga
                          My Twitter

                          • bbrendon
                            Senior Member
                            • Sep 2005
                            • 870

                            #14
                            Originally posted by Alexei
                            Seriously, if the agent hangs for whatever reason, the nodata() function will tell you about this almost immediately. Calculation of nodata()-related triggers does not depend on availability of the agent.
                            The agent doesn't actually hang. It just spends all its time downloading event logs (appearing to hang at first), and the other items for the host don't get data, causing nodata to trigger! Does that make sense?

                            Do we have ideas on a great solution?
                            Unofficial Zabbix Expert
                            Blog, Corporate Site

                            • Alexei
                              Founder, CEO
                              Zabbix Certified Trainer
                              Zabbix Certified Specialist, Zabbix Certified Professional
                              • Sep 2004
                              • 5654

                              #15
                              Originally posted by infinity005
                              The agent doesn't actually hang. It just spends all its time downloading event logs (appearing to hang at first), and the other items for the host don't get data, causing nodata to trigger! Does that make sense?

                              Do we have ideas on a great solution?
                              I would suggest opening a new thread to report and discuss this issue.
                              Alexei Vladishev
                              Creator of Zabbix, Product manager
                              New York | Tokyo | Riga
                              My Twitter
