Gap in graphs no data from host received

  • Michael0
    Member
    • Jan 2013
    • 70

    #1

    Gap in graphs no data from host received

    Hi!

    Sometimes a Zabbix agent delivers no data to the Zabbix server.
    What could be the root cause of this behaviour?
    The monitored server was up and running the whole time, and so far I can find no error entries in the log file on the affected server.
    The log file on the Zabbix server itself shows no related error entries either.
  • Fullmetal8ender
    Member
    • Nov 2012
    • 81

    #2
    Check Administration - Queue.
    Run tcpdump -i any host monitoredserverIP port 10050
    Then wait and see whether the server and agent communicate with each other within the "Update interval".
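As a complement to the tcpdump capture, here is a minimal Python sketch that repeatedly tries a TCP connect to the agent port and logs whether it succeeded. The helper names `probe_port` and `watch` are hypothetical, and port 10050 is taken from the tcpdump command above. It only checks TCP reachability and does not speak the Zabbix protocol, so a successful connect does not prove the agent is answering item requests.

```python
import socket
import time


def probe_port(host: str, port: int = 10050, timeout: float = 3.0):
    """Try one TCP connect; return (reachable, seconds_elapsed)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False, time.monotonic() - start


def watch(host: str, port: int = 10050, interval: float = 60.0, rounds: int = 5):
    """Probe repeatedly and print a timestamped line for each attempt."""
    results = []
    for i in range(rounds):
        ok, elapsed = probe_port(host, port)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} {host}:{port} reachable={ok} connect_time={elapsed:.3f}s")
        results.append(ok)
        if i < rounds - 1:
            time.sleep(interval)
    return results
```

Running `watch("monitoredserverIP")` from the Zabbix server overnight would show whether plain TCP connectivity also drops during the gaps, assuming the interval is set close to the item's update interval.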


    • Michael0
      Member
      • Jan 2013
      • 70

      #3
      Hi!

      Yes, the servers are communicating with each other, but the problem is that the agent only sometimes fails to deliver data to the Zabbix server.
      It mostly happens during the night, so I only see the Zabbix message the next morning saying that the agent was unavailable for 10 minutes.

      The queue is also empty, so that should not be the issue.
      Last edited by Michael0; 28-01-2013, 12:37.


      • Yello
        Senior Member
        • Apr 2011
        • 309

        #4
        Hi,
        Do the gaps tend to occur during a common time window?

        Regards,
        Dave
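One way to answer the time-window question is to take the timestamps of the values that did arrive and look for holes. The following is a rough sketch only: `find_gaps` and `gaps_by_hour` are hypothetical helpers, the timestamps are assumed to have been exported from the item history by some other means, and the tolerance factor of 2 is an arbitrary assumption.

```python
from collections import Counter
from datetime import datetime, timedelta


def find_gaps(timestamps, update_interval, tolerance=2.0):
    """Return (last_seen, next_seen) pairs where two consecutive samples
    are further apart than tolerance * update_interval."""
    ordered = sorted(timestamps)
    limit = update_interval * tolerance
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


def gaps_by_hour(gaps):
    """Count how many gaps start in each hour of the day (0-23)."""
    return Counter(start.hour for start, _ in gaps)
```

If most gaps fall into the same hour across several hosts, that points towards something scheduled or shared (server load, network) rather than a single agent.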


        • Michael0
          Member
          • Jan 2013
          • 70

          #5
          Hi Dave!

          The outage happened in the same time window.
          But there is no backup window configured during this time.
          But only one server could re-establish the connection to the zabbix client.
          The other servers were not able to do this


          • Yello
            Senior Member
            • Apr 2011
            • 309

            #6
            Originally posted by Michael0
            Hi Dave!

            The outage happened in the same time window.
            But there is no backup window configured during this time.
            But only one server could re-establish the connection to the zabbix client.
            The other servers were not able to do this
            Do you mean only one agent could re-establish connectivity to the zabbix server? What you say above doesn't make complete sense.

            Also, does your Zabbix server run on a VM? Do you have any checks that test for response times? How do they behave in the lead-up to these events?


            Regards,
            Dave


            • Michael0
              Member
              • Jan 2013
              • 70

              #7
              Let me explain it with a screenshot:



              As you can see from the gap in the screenshot, no data was received from the Zabbix agent, so I got a notification from the server that the Zabbix agent on this host was unreachable for 10 minutes.
              But the server was up and running, so it's quite strange to me that the Zabbix agent stopped reporting to the server.


              • Yello
                Senior Member
                • Apr 2011
                • 309

                #8
                Originally posted by Michael0
                Let me explain it with a screenshot:



                As you can see from the gap in the screenshot, no data was received from the Zabbix agent, so I got a notification from the server that the Zabbix agent on this host was unreachable for 10 minutes.
                But the server was up and running, so it's quite strange to me that the Zabbix agent stopped reporting to the server.
                Yes, that's what I thought you meant. It wasn't what you said though. I don't think you've answered all of my questions either.

                Anyway, here are some options for what I think might be going on, based on my experience:

                1. Load or resource constraints on the Zabbix server host - If the Zabbix server daemons become resource starved, for whatever reason, they might lose data. There can be many causes. You'll tend to see the Zabbix queue rise dramatically when something like this is going on, and large numbers of host items might go unsupported.
                2. Network load - Data can't get from the agent to the server.
                3. Load or resource constraints on the agent side - In this scenario you won't see many hosts losing data.

                I mention resource constraints because ESX server environments, in my experience, can have a detrimental effect on data gathering. With the way VMs are shoehorned onto an ESX host and VMotioned around, data collection can be adversely impacted given the real-time nature of the Zabbix server itself.

                I'm not saying this is your problem but it should get you thinking in a mode that may take you towards an answer.


                Regards,
                David
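For option 1, a quick first check on the Zabbix server host is to compare the load average with the number of CPUs. This is a minimal sketch with hypothetical helper names and an assumed threshold of 1.0 runnable tasks per CPU. It is Unix-only, and on a VM the load average will not show contention on the hypervisor itself, so treat it as a hint rather than a diagnosis.

```python
import os


def load_per_cpu(load_1min: float, cpu_count: int) -> float:
    """Normalize the 1-minute load average by the number of CPUs."""
    return load_1min / max(cpu_count, 1)


def looks_starved(load_1min: float, cpu_count: int, threshold: float = 1.0) -> bool:
    """True when the load per CPU exceeds the (assumed) threshold."""
    return load_per_cpu(load_1min, cpu_count) > threshold


def check_local_host(threshold: float = 1.0) -> bool:
    """Apply the check to the machine this script runs on (Unix only)."""
    load_1min, _, _ = os.getloadavg()
    cpus = os.cpu_count() or 1
    return looks_starved(load_1min, cpus, threshold)
```

Logging the result of `check_local_host()` every minute alongside the queue size would make it easier to see whether the gaps line up with load spikes on the server.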


                • Michael0
                  Member
                  • Jan 2013
                  • 70

                  #9
                  It looks like the Zabbix server, which runs on a VM, was running out of hardware resources.

                  Just for testing, I added an additional CPU and more RAM. The queue now looks much better than before, and since the hardware upgrade no Zabbix agent has been unreachable for more than 10 minutes without delivering data.
