Get value from agent failed. Error: ZBX_TCP_READ() failed [Interrupted system call]

  • dfavero
    Junior Member
    • Dec 2008
    • 2

    #1

    Get value from agent failed. Error: ZBX_TCP_READ() failed [Interrupted system call]

    Hello my friends,

    This message appears in my server log:

    Get value from agent failed. Error: ZBX_TCP_READ() failed [Interrupted system call]


    The agentd stops reporting status and closes, and the client machine is loaded with agentd processes.

    server and agentd are 1.4.2 / Debian Linux / Database postgresql-8.1

    zabbix_agentd.conf
    StartAgents=5
    DebugLevel=3

    Does anyone have this same problem?

    Thanks
  • Calimero
    Senior Member
    • Nov 2006
    • 481

    #2
    You can check this yourself.

    Just telnet from the server to the client, type the key you want to query, and press return (this sends the line with \n). The client should then answer.

    Code:
    zabbix2:~ # telnet 172.0.0.3 10050
    Trying 172.0.0.3...
    Connected to 172.0.0.3.
    Escape character is '^]'.
    agent.version
    ZBXD1.4.4Connection closed by foreign host.
    zabbix2:~ #
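The same manual check can be scripted. What follows is only a minimal sketch, not the official client: it assumes the agent accepts a plain key followed by a newline (as in the telnet session above) and that the reply is framed as `ZBXD`, one flag byte, and an 8-byte little-endian length, which is consistent with the `ZBXD` prefix visible in the telnet output. The names `query_agent` and `parse_agent_reply` are made up for illustration.

```python
import socket
import struct

ZBX_HEADER = b"ZBXD\x01"  # assumed framing: signature plus one flag byte


def parse_agent_reply(raw: bytes) -> str:
    """Decode an agent reply, with or without the assumed ZBXD header."""
    if raw.startswith(ZBX_HEADER):
        # The eight bytes after the header carry the payload length (little-endian).
        (length,) = struct.unpack("<Q", raw[5:13])
        return raw[13:13 + length].decode("utf-8", errors="replace")
    # Headerless reply: treat everything as the value.
    return raw.decode("utf-8", errors="replace").strip()


def query_agent(host: str, key: str, port: int = 10050, timeout: float = 5.0) -> str:
    """Send one item key plus a newline and return the decoded reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(key.encode("utf-8") + b"\n")
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the agent closes the connection after answering
                break
            chunks.append(chunk)
    return parse_agent_reply(b"".join(chunks))
```

Calling `query_agent("172.0.0.3", "agent.version")` from the server would mirror the telnet session; a `socket.timeout` here corresponds to the agent accepting the connection but never answering.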

    • martin.marcher
      Junior Member
      • Nov 2007
      • 22

      #3
      Any resolution on this?

      The ZBX_TCP_READ error has been reported as fixed several times in the forums as well as in Jira. However, just as often people still seem to have trouble with it.

      So is there any progress in getting this to work? I'm experiencing the same thing.

      I'm on OpenVZ and have this problem with just 2 hosts. The system is Debian etch; Zabbix is compiled from the (manually) backported version from sid (version 1.6.1).

      Code:
      # df -h
      Filesystem            Size  Used Avail Use% Mounted on
      tmpfs                 4.9G     0  4.9G   0% /lib/init/rw
      tmpfs                 4.9G  4.0K  4.9G   1% /dev/shm
      simfs                  20G  618M   20G   4% /
      Code:
      # free -m
                   total       used       free     shared    buffers     cached
      Mem:           906        365        541          0          0          0
      -/+ buffers/cache:        365        541
      Swap:            0          0          0
      Code:
      # zabbix_server -V
      ZABBIX Server (daemon) v1.6.1 (04 November 2008)
      Compilation time:  Dec 19 2008 13:09:27
      Code:
      # zabbix_agent -V
      ZABBIX Agent v1.6.1 (04 November 2008)
      Compilation time:  Dec 19 2008 13:08:24
      Code:
      # egrep -v '^$|^#' /etc/zabbix/zabbix_agentd.conf
      Server=its-zabmaster01.in.inqnet.at,localhost
      Hostname=its-zabmaster01
      StartAgents=5
      DisableActive=1
      DebugLevel=4
      PidFile=/var/run/zabbix-agent/zabbix_agentd.pid
      LogFile=/var/log/zabbix-agent/zabbix_agentd.log
      LogFileSize=5
      Timeout=10
      Code:
      # egrep -v '^$|^#' /etc/zabbix/zabbix_server.conf
      StartPollers=5
      StartPollersUnreachable=2
      StartTrappers=5
      StartPingers=5
      StartDiscoverers=1
      StartHTTPPollers=5
      HousekeepingFrequency=24
      SenderFrequency=30
      DebugLevel=3
      Timeout=5
      TrapperTimeout=5
      UnreachablePeriod=45
      UnavailableDelay=15
      PidFile=/var/run/zabbix-server/zabbix_server.pid
      LogFile=/var/log/zabbix-server/zabbix_server.log
      LogFileSize=5
      AlertScriptsPath=/etc/zabbix/alert.d/
      FpingLocation=/usr/sbin/fping
      PingerFrequency=60
      DBHost=localhost
      DBName=zabbix
      DBUser=zabbix
      DBPassword=supersecret
      This repeats for all hosts and items I'd like to collect info on:
      Code:
        2001:20081222:191229 Get value from agent failed. Error: ZBX_TCP_READ() failed [Interrupted system call]
        2002:20081222:191229 Timeout while answering request
        2002:20081222:191229 Get value from agent failed. Error: ZBX_TCP_READ() failed [Interrupted system call]
        2001:20081222:191229 Host [Its-zabmaster01]: first network error, wait for 15 seconds
        2002:20081222:191229 Host [Its-zabmaster01]: first network error, wait for 15 seconds
        2001:20081222:191229 Parameter [agent.ping] will be checked after 120 seconds on host [Its-zabmaster01]
        2002:20081222:191229 Parameter [agent.version] will be checked after 120 seconds on host [Its-zabmaster01]
      Code:
      # zabbix_get -s localhost -k agent.version
      zabbix_get [5399]: Timeout while executing operation.
      Code:
      telnet its-zabmaster01 10050
      Trying 10.10.141.14...
      Connected to its-zabmaster01.in.inqnet.at.
      Escape character is '^]'.
      get_me_something
      ^]
      telnet> Connection closed.
      Code:
      telnet its-zabmaster01 10051
      Trying 10.10.141.14...
      Connected to its-zabmaster01.in.inqnet.at.
      Escape character is '^]'.
      do_something
      Connection closed by foreign host.
      Code:
      # telnet localhost 10050
      Trying 127.0.0.1...
      Connected to localhost.localdomain.
      Escape character is '^]'.
      get_me_something
      ^]
      telnet> Connection closed.
      Code:
      # netstat -tulpen|grep -i zabb
      tcp        0      0 0.0.0.0:10050           0.0.0.0:*               LISTEN     107        27424072   15714/zabbix_agentd
      tcp        0      0 0.0.0.0:10051           0.0.0.0:*               LISTEN     107        28237445   1997/zabbix_server
      Code:
      # telnet localhost 10051
      Trying 127.0.0.1...
      Connected to localhost.localdomain.
      Escape character is '^]'.
      do_something
      Connection closed by foreign host.

      • elpollodiablo
        Junior Member
        • Dec 2008
        • 1

        #4
        This is not a problem.

        If you get this error, there is a problem with client communication and you run into the timeout for client connections.

        Try debugging communication with your clients, using the zabbix command line client/telnet for testing the retrieval of values from all of your clients; perhaps some tcpdump too.

        In our case, it was caused by a second host for active checking in the zabbix_agent.conf (iirc), and the server tripped over a hanging (open but non-responsive) connection to a client.

        The interrupting signal is the timer that was set with the timeout for client connections, so no worries there.

        Edit: It seems possible to me that you could get this error when the server or network is too slow to receive all client data before the timeout expires, but I cannot verify this assumption.
        Last edited by elpollodiablo; 03-02-2009, 17:27.
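The mechanism described in this post, a timer signal cutting short a blocking read, can be illustrated roughly as follows. This is a sketch of the general Unix pattern, not Zabbix's actual code. In C, a `read()` interrupted this way fails with `EINTR`, whose message text is "Interrupted system call". Python retries interrupted calls on its own unless the signal handler raises, so the handler here raises a hypothetical `ReadTimeout` exception. Unix only, and it must run in the main thread.

```python
import signal
import socket


class ReadTimeout(Exception):
    """Raised by the alarm handler to abort a blocking read."""


def _on_alarm(signum, frame):
    raise ReadTimeout("timer fired while waiting for data")


def read_with_alarm(sock: socket.socket, timeout: float) -> bytes:
    """Block in recv() but let a timer signal abort the wait."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, timeout)  # arm the timer
    try:
        return sock.recv(4096)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the timer
        signal.signal(signal.SIGALRM, previous)  # restore the old handler
```

If the peer holds the connection open but never sends anything, the timer fires and the read is abandoned, which matches the "hanging (open but non-responsive) connection" case above.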

        • tempus1984
          Junior Member
          • May 2009
          • 10

          #5
          UP!

          Hello, I have the same problem and it isn't resolved.
          Does anybody have an idea?

          Thanks

          • abix_adamj
            Junior Member
            • Jun 2008
            • 3

            #6
            I have a similar situation with agentd on Debian 5.0.1 on OpenVZ.

            The server is Zabbix on openSUSE 11.0, and it works OK with other agents on another openSUSE host.
            Only the one agentd on OpenVZ is the problem. On the server I have this log:
            Code:
            10803:20090713:163055 Item [VPS_Zabbix_Agentd:agent.ping] error: Get value from agent failed: Cannot connect to [91.x.y.z:10050] [Interrupted system call]
            Code:
            main:~ # fping 91.x.y.z
            91.x.y.z is alive

            I can connect from server to client with telnet:
            Code:
            main:~ # telnet vps 10050
            Trying 91.x.y.z...
            Connected to vps.
            Escape character is '^]'.
            agent.version
            ZBXD1.6.5Connection closed by foreign host.
            Any ideas? What more information should I provide?
            Adam

            • martin.marcher
              Junior Member
              • Nov 2007
              • 22

              #7
              I can't remember how we resolved the problem (but we did). Anyway:

              Because of several problems (UI, stability of server/client) and a general dislike, we completely dropped Zabbix and went back to Nagios (I hope to evaluate Icinga soon, as it will support PostgreSQL).

              I _think_ it was related to the number of zabbix_agentd processes running on the monitored box (use more), but I'm not quite sure anymore.

              • scronkey
                Junior Member
                • Aug 2008
                • 5

                #8
                I have the same problem. I started a thread yesterday (http://www.zabbix.com/forum/showthread.php?t=13202) loosely based on this issue, but more toward refining triggers to avoid false alerts.

                I have tried many of the suggestions from these threads; however, nothing seems to have resolved this issue. I have run my own scripts which perform constant checks alongside Zabbix (on the same box) and seen no communication issues, even though Zabbix will occasionally (several times a day) have the ZBX_TCP_READ issue.

                I'm quite stumped and copping quite a bit of flak from my colleagues about this, yet I'm very reluctant to stop using Zabs due to all the other strengths it has.

                This is why I've basically stopped trying to resolve the ZBX_TCP_READ problem and am now trying to refine my triggers so this issue is not so apparent.

                Having said all that, resolving this issue would be the correct and most suitable solution.

                • tetros
                  Junior Member
                  • Jun 2009
                  • 3

                  #9
                  ZBX_TCP_READ()

                  I have a lot of problems. Can someone help me, please? This is a copy of the errors in my zabbix_server.log. Thanks for your help.

                  3978:20090717:101902 Item [ZABBIX Server:sensor[temp2]] error: Not supported by ZABBIX agent
                  3978:20090717:101904 Item [ZABBIX Server:vfs.file.cksum[/etc/inetd.conf]] error: Not supported by ZABBIX agent
                  3978:20090717:101904 Item [ZABBIX Server:vfs.file.cksum[/vmlinuz]] error: Not supported by ZABBIX agent
                  3979:20090717:101915 Item [ZABBIX Server:kernel.maxproc] error: Not supported by ZABBIX agent
                  3974:20090717:101922 Format error or unsupported operator. Exp: [PROCESS APACHE EST OK]
                  3974:20090717:101922 Format error or unsupported operator. Exp: [PROCESS APACHE EST OK]
                  3974:20090717:101922 Expression [{12948}=0] for item [23661][ZABBIX Server:check.apache[]] cannot be evaluated: Format error or unsupported operator. Exp: [PROCESS APACHE EST OK]

                  • MrKen
                    Senior Member
                    • Oct 2008
                    • 652

                    #10
                    You should have started another thread because your problem appears to be different from the others above.

                    The first 4 'errors' simply mean that those items are not supported on your server. This is normal.

                    Type ./zabbix_agentd -p (probably from /usr/local/sbin ??)

                    You will see a list of which parameters are supported on your server.

                    The other 'errors', sorry I don't know.

                    MrKen
                    Disclaimer: All of the above is pure speculation.

                    • camel1
                      Junior Member
                      • Nov 2009
                      • 1

                      #11
                      I had a similar issue:
                      From zabbix server.log
                      2058:20091217:075509 Item [sw.here.c:net.tcp.service[ssh]] error: Get value from agent failed: ZBX_TCP_READ() failed [Interrupted system call]

                      It appears that it is not a communication error between the zabbix server and client but an error with the zabbix agent on the client itself:
                      /usr/local/zabbix-agent/sbin/zabbix_agentd -p | grep 127
                      net.tcp.dns[127.0.0.1,localhost] [u|1]
                      net.tcp.service[ssh,127.0.0.1,22] [u|0]
                      net.tcp.service.perf[ssh,127.0.0.1,22] [d|0.000000]

                      After editing /etc/hosts.allow for 127.0.0.1:sshd:
                      /usr/local/zabbix-agent/sbin/zabbix_agentd -p | grep 127
                      net.tcp.dns[127.0.0.1,localhost] [u|1]
                      net.tcp.service[ssh,127.0.0.1,22] [u|1]
                      net.tcp.service.perf[ssh,127.0.0.1,22] [d|0.022210]

                      The status has changed from 0 to 1 and the error has gone away.

                      This seems weird because other hosts (although they all have 127.0.0.1 open on their host firewall) do not have a TCP wrapper entry in hosts.allow and would reply with ssh_exchangeidwhatever_error or just time out, yet they do not cause an event in the Zabbix GUI.

                      This leads me to believe it's either 1) a timeout issue or 2) a bug. When running the agentd -p command on the clients they do report ssh_server as 0, but these events are not flagged by the Zabbix server.

                      Hopefully this helps somewhat...

                      • init0
                        Junior Member
                        • Feb 2010
                        • 5

                        #12
                        We just experienced the same issue. Suddenly the problem was there without changing anything on the Zabbix server or on the clients (using the agent). There was no maintenance work/upgrade done at all.

                        While digging into this problem we noticed that resolving a hostname was quite slow. Checking the nameservers one by one showed that the first one was not responding, thus slowing the whole communication down. By the time the first nameserver timed out and the query was sent to the second, the Zabbix timeout had already kicked in.

                        After we moved the non-responding nameserver to the end of /etc/resolv.conf, the situation immediately recovered. Make sure you have at least two working nameservers configured.

                        Conclusion:
                        - Use (if possible) an IP address instead of a DNS name (for agent and the config in the Web UI)
                        - You may increase the timeout settings for Zabbix (but we prefer having a working DNS infrastructure)
                        - Make sure your DNS infrastructure is working fine
                        - Consider installing DNS caching software
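To see whether name resolution alone is eating into the timeout, the lookup can be timed separately from the agent query. The sketch below is an illustration only; `time_lookup` and `lookup_fits_timeout` are hypothetical helper names, and the resolver is passed in as a parameter so the timing logic can be exercised without a real DNS server.

```python
import socket
import time
from typing import Callable


def time_lookup(host: str, resolver: Callable = socket.getaddrinfo) -> float:
    """Return the seconds spent resolving host; failed lookups count too."""
    start = time.monotonic()
    try:
        resolver(host, None)
    except OSError:
        pass  # a failed lookup still costs time, which is what we measure
    return time.monotonic() - start


def lookup_fits_timeout(host: str, timeout: float,
                        resolver: Callable = socket.getaddrinfo) -> bool:
    """True when name resolution alone finishes within the timeout."""
    return time_lookup(host, resolver) < timeout
```

If `lookup_fits_timeout(hostname, 5.0)` comes back false for a monitored host (5 being the server `Timeout` value shown earlier in this thread), the agent check has already lost before any data is exchanged, which is the situation described above.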

                        • anrstone
                          Member
                          • Oct 2009
                          • 61

                          #13
                          I've been experiencing this problem a lot since upgrading to 1.8.2. The problem has manifested itself in a number of ways, but most commonly a number of items failed to report properly with the errors mentioned throughout this thread. This was particularly an issue on Windows boxes with multiple IP addresses (i.e. all our web servers!).

                          The solution was to use the newer features in the 1.8.2 config to fix the listening and source response IPs in the agent config to be the same as the one entered in the Zabbix server. For those of you using win2k8 there is an additional problem with DNS: by default it registers all IPs for its given name with the DNS server, not just the primary IP. This can cause huge problems if you are trying to resolve by DNS, as IP->DNS and DNS->IP will not always resolve to match.
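The DNS mismatch described here (a name that resolves to several IPs, or an IP that does not map back to the same name) can be checked with a short script. This is a rough sketch under my own assumptions: `dns_round_trip_ok` and its helpers are hypothetical names, only IPv4 is considered, and the resolver functions are injectable so the logic can be tried without touching real DNS.

```python
import socket
from typing import Callable, List


def forward_ips(name: str, lookup: Callable = socket.gethostbyname_ex) -> List[str]:
    """All IPv4 addresses the name resolves to."""
    _, _, addresses = lookup(name)
    return list(addresses)


def reverse_name(ip: str, lookup: Callable = socket.gethostbyaddr) -> str:
    """Primary host name registered for the address."""
    name, _, _ = lookup(ip)
    return name


def dns_round_trip_ok(name: str, expected_ip: str,
                      forward: Callable = socket.gethostbyname_ex,
                      reverse: Callable = socket.gethostbyaddr) -> bool:
    """True only if the name maps to exactly the expected IP and back again."""
    if forward_ips(name, forward) != [expected_ip]:
        return False  # extra or different addresses registered for the name
    resolved = reverse_name(expected_ip, reverse)
    return resolved.lower().rstrip(".") == name.lower().rstrip(".")
```

A false result for a host that is configured by DNS name on the server suggests switching that host to an IP address, or cleaning up the extra DNS registrations.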

                          • yas
                            Junior Member
                            • Apr 2010
                            • 25

                            #14
                            Get value from agent failed: Cannot connect to xxx.xx.xx.x:10050] [Interrupted syste

                            Hi there, I am so tired; please help me if you can. I get this message when I add a new host on my Zabbix server. I installed Zabbix version 1.8.2 and I don't know what I can do to fix this problem. Please help me; I have had this issue for more than two weeks.

                            • anrstone
                              Member
                              • Oct 2009
                              • 61

                              #15
                              Have you tried telnetting from the Zabbix server to the agent on the IP you've set-up?

                              Secondly have you checked that it's listening on the IP the server is trying to connect to?

                              Is the hostname in the agent config the same as that on the Zabbix server? It needs to be exactly the same.

                              I'm sure you've done all this already so sorry if I'm not being helpful - do you want to post up some more details?
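The hostname comparison in particular is easy to get wrong by eye. A minimal sketch of that one check follows, assuming the agent config is a plain `key=value` file like the ones posted earlier in this thread. The function names are hypothetical, and treating the comparison as case-sensitive is my reading of "exactly the same", not something confirmed here.

```python
def parse_agent_conf(text: str) -> dict:
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an equals sign
            settings[key.strip()] = value.strip()
    return settings


def hostname_matches(conf_text: str, server_side_name: str) -> bool:
    """Exact comparison (case-sensitive by assumption) of Hostname with the server's host name."""
    return parse_agent_conf(conf_text).get("Hostname") == server_side_name
```

Feeding it the contents of zabbix_agentd.conf and the host name as entered on the server gives a quick yes/no before digging into network-level checks.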
