Zabbix didn't page me when host died!

  • drose12
    Junior Member
    • Apr 2007
    • 27

    #1

    Zabbix didn't page me when host died!

    I'm running 1.4.2 CVS and the 1.4.1 agent.
    This weekend a host of ours died, but I didn't get any pages. Checking the Events shows that all my items were returning UNKNOWN, but none of my triggers are set for that.
    Once I brought the host back online, I did get some pages, and with it up and me manually bringing down key services I do get the pages. What gives?

    I have the following test scenario right now: a virtual machine with the Zabbix agent on it, monitored by the Zabbix server. If I just 'Pause' the VM, which basically drops it off the face of the earth, Zabbix does not tell me. Under Configuration -> Hosts, the Availability status shows:

    Not available Cannot connect to [xxx.yyy:10050] [Interrupted system call]

    But my pager is all quiet...

    Do I need an agent.ping == unknown -> disaster trigger?
  • drose12
    Junior Member
    • Apr 2007
    • 27

    #2
    I just upgraded to 1.4.2, and I have the same issue.
    Can anyone else reproduce this?


    • bbrendon
      Senior Member
      • Sep 2005
      • 870

      #3
      I don't think you found a bug. It sounds like you don't have a trigger set to monitor the availability of the host. I don't use active agents, so I'm not sure how to best get the availability for them. You should find something in the forums.
      Unofficial Zabbix Expert
      Blog, Corporate Site


      • drose12
        Junior Member
        • Apr 2007
        • 27

        #4
        Ok, with a little reading and searching I figured out my problem.

        zabbix_server shows the status as UNKNOWN when it can’t get a value from the agent, but it doesn’t actually store anything in the tables, and our triggers only look at values in the tables. The fix is pretty simple, yet surprisingly non-obvious, and it isn’t a default trigger.

        Right now we have an item, Ping Agent, which does an agent.ping every 30 seconds and returns 1 if the agent is working.
        What I needed was a trigger that says: if you get no data (UNKNOWN) for 2 minutes, put up a Disaster page. This is accomplished by:

        Host Down {HOSTNAME}
        {Unix_t:agent.ping.nodata(120)}

        So now if a box just stops responding for 2 minutes straight we’ll get a page.
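
To illustrate why this works where value-based triggers stay silent, here is a minimal Python sketch of the idea behind nodata(120). It is not Zabbix code: the function name, the timestamp list, and the boundary handling are assumptions made purely for illustration. The point is that the check looks at when the last value was stored, not at what the value was.

```python
def nodata(value_timestamps, now, period):
    """Return 1 if no value was stored in the last `period` seconds, else 0.

    Hypothetical model of the trigger function, not the server's real code.
    """
    cutoff = now - period
    return 0 if any(ts > cutoff for ts in value_timestamps) else 1


# agent.ping is polled every 30 s; the host dies right after the poll at t=90,
# so nothing further is stored (the server only reports UNKNOWN).
stored = [0, 30, 60, 90]

print(nodata(stored, now=150, period=120))  # recent values still exist
print(nodata(stored, now=240, period=120))  # silent for more than 2 minutes
```

A trigger that compares stored values never sees the outage because no new value arrives; a check on the age of the last value does.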


        • bbrendon
          Senior Member
          • Sep 2005
          • 870

          #5
          That'll work. Very nice

          • alj
            Senior Member
            • Aug 2006
            • 188

            #6
            Will it create a storm of pages when you shut down your server for 3 minutes?


            • drose12
              Junior Member
              • Apr 2007
              • 27

              #7
              Originally posted by alj
              Will it create storm of pages when you shut down your server for 3 minutes?
              I don't think so ...
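
A rough way to reason about this, assuming pages are tied to a trigger changing state rather than to every evaluation (an assumption about the setup, not something verified in this thread): a single outage flips the trigger from OK to PROBLEM once, so it should produce one page. The Python sketch below models only that assumption; the function name and the sample data are hypothetical.

```python
def pages_sent(trigger_values):
    """Count pages when a page goes out only on an OK -> PROBLEM transition."""
    pages = 0
    previous = 0  # assume the trigger starts in the OK state
    for current in trigger_values:
        if previous == 0 and current == 1:
            pages += 1
        previous = current
    return pages


# Trigger result at successive evaluations: host up, then silent long enough
# for the no-data condition to hold over several evaluations, then back up.
evaluations = [0, 0, 1, 1, 1, 1, 0, 0]

print(pages_sent(evaluations))  # 1
```

Whether a recovery message is also sent when the host comes back depends on how the action is configured.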
