Trigger stays at "unreachable" but the host isn't.

  • Tristan
    Senior Member
    • Feb 2008
    • 110

    #1

    Since upgrading my Zabbix install to 1.6 I have a strange problem. Yesterday one of my servers rebooted, so Zabbix triggered "unreachable". Perfect, but the message didn't go away.
    Restarting the agent doesn't help. Restarting the Zabbix server process doesn't either.

    When I stop the Zabbix agent on one of my other servers, Zabbix triggers "unreachable", and a few seconds after I start the agent again the message goes away.

    Any ideas? It happens with only one server. When I look at Latest Data I see data coming in.
  • Tristan
    Senior Member
    • Feb 2008
    • 110

    #2
    Fixed

    Fixed! Stop the Zabbix agent for a few minutes. After starting it again, the host status is correct.

    • erozen
      Junior Member
      Zabbix Certified Specialist
      • Apr 2007
      • 18

      #3
      Whenever I restart the agent, I stop it, check to make sure it has actually died, and then start it.

      It often doesn't die straight away, and if you try to start it while it's still running, things don't go quite as planned.

      I never use the restart option of the init script. I always run 'stop', then 'ps -afe | grep zabbix', then 'start'.
      My automated scripts run 'stop', 'sleep 5', 'start' (and I still sometimes see one or two agents not come back up nicely).
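
Editor's sketch of the stop / check / start sequence described above. It polls until the old process has really gone instead of relying on a fixed 'sleep 5'. The function name and the three callables (`stop`, `is_running`, `start`) are hypothetical hooks, not part of Zabbix; on a real system they would wrap whatever init script and process check that host uses.

```python
import time


def restart_when_dead(stop, is_running, start, timeout=30.0, poll=1.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Stop a service, wait until it has really exited, then start it.

    stop, is_running and start are caller-supplied callables (hypothetical
    hooks, e.g. wrappers around an init script and a process check).
    Returns True if the service was started, False if the old process
    was still alive when the timeout expired.
    """
    stop()
    deadline = clock() + timeout
    while is_running():
        if clock() >= deadline:
            # Old process never exited: do not start a second copy.
            return False
        sleep(poll)
    start()
    return True
```

Returning False instead of starting anyway keeps the "tried to start it while it's still running" case visible to the calling script, which can then log it or escalate.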

      • Tristan
        Senior Member
        • Feb 2008
        • 110

        #4
        Problem still exists

        You are right about the process, but I have a server that reboots every night. Because we are using a SAN, the server is up within 1 minute. Zabbix triggers "has just been restarted", but the server stays at unreachable.

        I notice the same thing with a test server. If I kill the Zabbix service (Windows) and start it again, the server stays unreachable. I need to stop the zabbix_server process, and then the problem is gone.

        The disadvantage is that all of my triggers are reset, so I cannot see how long my server has been unreachable.

        The server is 1.6 and the agent is 1.4.5, because I read about a lot of problems with the 1.6 agent.

        • skullone
          Member
          • Mar 2007
          • 46

          #5
          Confirming the issue here as well. Zabbix server 1.6 running on FreeBSD 6.2,
          monitoring Windows servers with Zabbix agentd 1.4.4.
          If a Windows server reboots, or otherwise has a moment when it's not reachable, it stays unreachable forever.

          Likewise, if a server sometimes IS unreachable, the Latest Data tab says "Host Status - Unreachable", yet the Unreachable trigger never fires and doesn't send me an e-mail about it, wtf? So when a host ISN'T down, we get e-mails and Zabbix says it's down, even though it's up and shows as reachable.

          If a server IS down, nearly the opposite occurs =/

          My unreachable trigger:
          ({Template_Windows:status.max(#2)}=2)|(({TRIGGER.VALUE}=1)&({Template_Windows:status.min(#5)}=2))

          Used to work in 1.4, no longer in 1.6
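
Editor's sketch of the boolean logic in the expression above, to make it easier to reason about when it should fire. This is only a model of the expression as written, not Zabbix's own evaluator: the function name is hypothetical, and when fewer than 2 or 5 values exist it simply uses what is available, which is a simplification.

```python
def unreachable_trigger(history, previous_value, problem=2):
    """Model of: (max(#2) = 2) | ((TRIGGER.VALUE = 1) & (min(#5) = 2)).

    history: non-empty list of status values, oldest first.
    previous_value: 1 if the trigger is currently in PROBLEM state, else 0.
    """
    # First clause: the largest of the last two values equals 2.
    fires = max(history[-2:]) == problem
    # Second clause: already in PROBLEM and the smallest of the last five equals 2.
    holds = previous_value == 1 and min(history[-5:]) == problem
    return 1 if (fires or holds) else 0
```

One thing the model makes visible: if 2 is the highest value the status item can take (an assumption), then whenever the second clause is true the first one is true as well, so the second clause cannot keep the trigger in PROBLEM any longer than the first already does.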

          • skullone
            Member
            • Mar 2007
            • 46

            #6
            More data on my issue.

            In the case of a server actually being down, Latest Data shows the status as unreachable, but the trigger still says *UNKNOWN* and will not send an alert.
            No items are getting data either; it is fully down. I made sure of that (unplugged the network cable).

            When the server comes back up, Latest Data shows 1 (Up), which is good, but the trigger still shows Unknown status.

            I've reverted the trigger back to default:
            {Template_Windows:status.last(0)}=2

            I see other threads mentioning this as far back as the 1.1 series. Is there anything I can do to reliably monitor a host's true status and reliably send alerts for it?

            • Tristan
              Senior Member
              • Feb 2008
              • 110

              #7
              I have a lot more info about my problem.

              A few hours ago one of my ESX servers got "fucked in the head" and I needed to reboot all the running VMs on that server.

              So I got a lot of e-mails about problems, but something went wrong.

              I read the following in my events for a particular server:
              server01 triggers "unreachable" at 20:04, but at 20:14 it reports "just been restarted" (problem).
              At 20:24 it reports "just been restarted" again, with a status of OK.

              The problem is that when I look at Triggers it says Disaster since 20:04. This message doesn't go away.

              I have the following trigger dependency:
              Template_Windows:Server server01 is unreachable
              Depends on:
              server01 has just been restarted
              Expression: {dcdata:status.last(0)}=2

              The reason for the dependency is that I don't want to get unreachable mails when I reboot a server.
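
Editor's sketch of the effect the dependency above is meant to have: no "unreachable" mail while "just been restarted" is in PROBLEM. It models only the intent stated in the post, not how Zabbix 1.6 handles dependencies internally; the `Trigger` class and `should_notify` function are hypothetical names.

```python
from dataclasses import dataclass, field

PROBLEM, OK = 1, 0


@dataclass
class Trigger:
    name: str
    value: int = OK
    depends_on: list = field(default_factory=list)


def should_notify(trigger):
    """Notify only if the trigger is in PROBLEM and no trigger it depends on is."""
    if trigger.value != PROBLEM:
        return False
    return all(parent.value != PROBLEM for parent in trigger.depends_on)


# Usage: the two triggers named in the post.
restarted = Trigger("server01 has just been restarted", PROBLEM)
unreachable = Trigger("Server server01 is unreachable", PROBLEM, [restarted])
print(should_notify(unreachable))  # suppressed while the restart trigger is in PROBLEM
```

Under this model the "unreachable" trigger should start notifying again as soon as the restart trigger returns to OK, which is the step that appears not to happen in the events listed above.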

              I restarted 25 servers and the problem occurred on 8 of them.

              Zabbix version 1.6
              Agent version 1.4.5 (because of known troubles with 1.6 on Windows?)

              Does anybody have the same problem, or a nice tip?

              The problem never occurred with Zabbix 1.4.5.

              Thanks for any help.

              PS: When I look at Latest Data I see that everything is OK:

              Server Status 09 Oct 20:14:52 Up (0)
              Ping to the server (TCP) 10 Oct 08:38:22 Up (1)
              Last edited by Tristan; 10-10-2008, 08:45.

              • Tristan
                Senior Member
                • Feb 2008
                • 110

                #8
                Today I installed the Zabbix proxy, so for a few moments I got a lot of unreachable errors. Well, I found out that when I removed the trigger dependency the problem was gone.

                I need to test it more to be sure, but I have made some progress now.
