failed: first network error

  • bouivgre
    Junior Member
    • Dec 2011
    • 7

    #1

    failed: first network error

    Hello,

    First of all, here is my configuration: Zabbix 1.9.8 on Debian Squeeze with MySQL.
    I have deployed a Windows agent on a Windows 2008 Server and I am monitoring several parameters with it (CPU, memory, network, services ...).
    I recently added custom checks with the UserParameter option in zabbix_agentd.conf (about 20 new checks).
    These checks only use the "type" command to read some text files, for example:
    "UserParameter=kpi-histo,type c:\zabbix\scripts\KPI-histo.txt"

    My problem is that when I add all my items, at least 3 or 4 of them never get checked. I see this kind of error on the server side:
    16212:20120119:082845.734 Zabbix agent item [kpi-histo] on host [******] failed: first network error, wait for 15 seconds
    I tried deleting an item and recreating it with exactly the same properties, and it worked, but then another item "crashes" and becomes unavailable. I have set the Timeout option (agent side) to 30 and StartAgents to 10 to make sure there are enough agent processes to handle the new checks.

    But I still have the problem. I set a flexible interval to stop checking from 00:00 to 03:00, and some checks go into error after 03:00.

    I am probably missing something but I don't know what. Maybe someone has had the same problem?

    Thanks for your help.
    Last edited by bouivgre; 19-01-2012, 11:16.
  • mbrijun
    Member
    • Mar 2006
    • 63

    #2
    What does your queue look like under "Administration > Queue"? Are there many items that have not been updated for 5 or 10 minutes?


    • bouivgre
      Junior Member
      • Dec 2011
      • 7

      #3
      I have 7 items in my queue between 5 and 10 min.


      • mbrijun
        Member
        • Mar 2006
        • 63

        #4
        Perhaps you could work around the problem by rolling all these checks into one bigger check and writing a custom script to handle it. Then you could simply call this custom script from the Zabbix server and get the result. As long as the result comes back within the Timeout period set in zabbix_server.conf, you should be OK. If it takes longer than Timeout, you can either raise the timeout or use zabbix_sender to submit the results back to the server.
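
For readers who want to try the zabbix_sender route, here is a minimal sketch of a script that reads several text files and writes one line per item in the `<hostname> <key> <value>` layout that `zabbix_sender -i` reads. The host name, the key names and the file paths are placeholders, and the sketch assumes each file holds a single-line value and that the host name contains no spaces.

```python
from pathlib import Path


def quote_value(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def sender_lines(host: str, files: dict[str, str]) -> list[str]:
    """Build one '<host> <key> <value>' line per readable file.

    `files` maps an item key to the path of the text file holding its value.
    Only the first line of each file is used (assumption: single-line values).
    """
    lines = []
    for key, path in files.items():
        try:
            text = Path(path).read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # skip unreadable files instead of failing the whole batch
        rows = text.splitlines()
        value = rows[0].strip() if rows else ""
        lines.append(f"{host} {key} {quote_value(value)}")
    return lines


if __name__ == "__main__":
    # Hypothetical host name and item keys; adjust to your own setup.
    checks = {"kpi-histo": r"c:\zabbix\scripts\KPI-histo.txt"}
    Path("kpi-values.txt").write_text(
        "\n".join(sender_lines("my-windows-host", checks)) + "\n",
        encoding="utf-8",
    )
```

The resulting file could then be submitted with something like `zabbix_sender -z <server> -i kpi-values.txt`. The items on the server side would need to be of type "Zabbix trapper" rather than "Zabbix agent" for the values to be accepted.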


        • bouivgre
          Junior Member
          • Dec 2011
          • 7

          #5
          I don't think I can handle it with one big script, because each item gives me information about something different, so I need each item separately.
          I am not sure why it works some days and then the next day it just stops working. I think there is a parameter somewhere that defines how many item checks the agent can handle. I have read that each poller can handle 350 requests; I am nowhere near that number.
          My checks run every minute. I have tried staggering them (55 sec, 65 sec ...) but the issue is the same.


          • mbrijun
            Member
            • Mar 2006
            • 63

            #6
            Maybe you are polling too frequently? Does it really have to be every minute or so? Perhaps 3-5 mins will be enough?


            • bouivgre
              Junior Member
              • Dec 2011
              • 7

              #7
              I tried a 5 min interval but I have the same issue. For example, I had a check that was polled every 10 sec yesterday (a quick test check to see whether it would work as well), and today, since 3 am, it has not been checked at all.
              I don't think it's really a problem of too many checks.

              Since this morning I only get the message below. It was working fine yesterday.

              Zabbix agent item [kpi-histo] on host [******] failed: another network error, wait for 15 seconds


              • mbrijun
                Member
                • Mar 2006
                • 63

                #8
                You can manually test a key. Telnet to the agent from the server and type in the key:

                Code:
                telnet ip_of_agent 10050
                
                kpi-histo
                Last edited by mbrijun; 19-01-2012, 16:30.
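
The same manual test can be scripted, which makes it easier to repeat it many times and catch intermittent failures. The sketch below sends the key as plain text, as the telnet session does, and decodes the reply. It assumes the reply is either bare text or framed with a `ZBXD` signature, one flags byte and a little-endian length field, which is how I understand the passive check protocol; newer agents may expect the request to be framed as well, and compressed replies are not handled. The agent also has to accept connections from the machine running the script (its `Server=` setting).

```python
import socket
import struct

HEADER = b"ZBXD"


def parse_agent_response(raw: bytes) -> str:
    """Decode an agent reply; accepts a framed ('ZBXD') or a bare text reply."""
    if raw.startswith(HEADER):
        # 4-byte signature, 1 flags byte, 4-byte little-endian data length,
        # then 4 more bytes that are skipped here (assumed zero or unused).
        (length,) = struct.unpack("<I", raw[5:9])
        raw = raw[13:13 + length]
    return raw.decode("utf-8", errors="replace").rstrip("\n")


def query_agent(host: str, key: str, port: int = 10050, timeout: float = 5.0) -> str:
    """Send a plain-text item key to a passive agent and return its reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(key.encode("utf-8") + b"\n")
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the agent closes the connection after replying
                break
            chunks.append(chunk)
    return parse_agent_response(b"".join(chunks))
```

Calling `query_agent("ip_of_agent", "kpi-histo")` in a loop with a short sleep, and logging any `socket.timeout` or `OSError`, would show whether the agent itself ever fails to answer or whether the failures only appear on the server side.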


                • bouivgre
                  Junior Member
                  • Dec 2011
                  • 7

                  #9
                  The key is working properly. But the item fails to reach the key one day and succeeds the next.

                  Here is an example of my latest data for an item that is checked every 60 seconds.

                  [2012.Jan.20 09:25:19] *****
                  [2012.Jan.20 08:14:19] *****
                  [2012.Jan.20 03:27:24] *****
                  Last edited by bouivgre; 20-01-2012, 10:56.
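
To quantify how much data is missing, the timestamps from "Latest data" can be compared with the configured interval. This is a small sketch that assumes the `[YYYY.Mon.DD HH:MM:SS]` layout shown above with English month abbreviations; the `slack` factor is an arbitrary choice to tolerate small scheduling jitter.

```python
import re
from datetime import datetime

MONTHS = {
    name: number
    for number, name in enumerate(
        ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
        start=1,
    )
}
STAMP = re.compile(r"\[(\d{4})\.([A-Za-z]{3})\.(\d{2}) (\d{2}):(\d{2}):(\d{2})\]")


def parse_stamps(text: str) -> list[datetime]:
    """Extract all timestamps from the text and return them oldest first."""
    stamps = []
    for year, mon, day, hour, minute, second in STAMP.findall(text):
        stamps.append(
            datetime(int(year), MONTHS[mon], int(day),
                     int(hour), int(minute), int(second))
        )
    return sorted(stamps)


def find_gaps(stamps: list[datetime], interval_s: int = 60, slack: float = 2.0):
    """Return (start, end, seconds) for every gap longer than slack * interval."""
    gaps = []
    for prev, cur in zip(stamps, stamps[1:]):
        delta = (cur - prev).total_seconds()
        if delta > interval_s * slack:
            gaps.append((prev, cur, int(delta)))
    return gaps
```

For the three values listed above, this reports two gaps of 17215 and 4260 seconds for an item that should have been updated every 60 seconds.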


                  • mortenkallesoee
                    Junior Member
                    • Jul 2012
                    • 2

                    #10
                    Did you find a solution for this?
                    I am seeing the same thing. zabbix_get is working fine (it can return the hostname, and it is the hostname I expected to see).
                    Ping is OK.
                    Telnet to the agent on 10050 is OK, and from the agent to the server on 10051 is OK.

                    TCP traffic looks fine (I have not debugged this, since there is no indication of a connectivity problem).


                    • mortenkallesoee
                      Junior Member
                      • Jul 2012
                      • 2

                      #11
                      A-ha,
                      I solved my problem.

                      I restarted the Zabbix server.
                      My guess is that it had a cached hostname for my host, and maybe it had gotten it from NSCD.
                      (Yes, I had recreated the server in AWS.)


                      • churari
                        Junior Member
                        • Feb 2012
                        • 7

                        #12
                        Fix

                        In case this helps anyone. I fixed this as well. I was first getting the ZBX_UNSUPPORTED error due to timeout. I updated the timeout on my agent config to 25 seconds (I am using modems with very slow connection speeds). That *almost* fixed it. The item was then reporting as green, but no data was showing up. I then updated my /etc/zabbix/zabbix_server.conf file timeout parameter to match the agent and restarted the server. This fixed my issue.

                        Hope that helps someone.
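
A quick way to check for the mismatch described in this post is to read the `Timeout` value from both configuration files and compare them. The sketch below assumes the usual `Key=Value` layout with `#` comments and falls back to 3 seconds when `Timeout` is not set, which is the default as far as I know. The file paths in the usage note are only examples.

```python
DEFAULT_TIMEOUT = 3  # assumed default (seconds) when Timeout is not set


def parse_conf(text: str) -> dict[str, str]:
    """Parse 'Key=Value' lines, ignoring blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def timeout_of(text: str) -> int:
    """Return the Timeout setting from a config file's contents."""
    return int(parse_conf(text).get("Timeout", DEFAULT_TIMEOUT))


def timeouts_match(agent_conf: str, server_conf: str) -> bool:
    """True when the agent and the server use the same Timeout value."""
    return timeout_of(agent_conf) == timeout_of(server_conf)
```

Usage would look like `timeouts_match(open("/etc/zabbix/zabbix_agentd.conf").read(), open("/etc/zabbix/zabbix_server.conf").read())`, run wherever copies of both files are available. If a proxy polls the host, its own `Timeout` setting is probably the one to compare instead of the server's.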


                        • alp
                          Member
                          • Nov 2009
                          • 90

                          #13
                          Originally posted by mortenkallesoee
                          Did you find a solution for this?
                          i am seeing the same thing, zabbix_get is working fine, (it can return hostname, and its the hostname i expected to see)
                          Ping is ok.
                          telnet to agent on 10050 is ok, from agent to server on 10051, is ok

                          tcp traffic looks fair (have not debugged this, since no indication on a connectivity problem)
                          I have the same issue =( Everything is OK, zabbix_get works fine and telnet too, but the proxy keeps writing this to its log:
                          Code:
                            2833:20160707:042411.593 Zabbix agent item "mysql.status[Com_insert]" on host "Game-DB" failed: another network error, wait for 15 seconds
                            2833:20160707:042426.597 resuming Zabbix agent checks on host "Game-DB": connection restored
                            2833:20160707:042426.668 Zabbix agent item "mysql.ping" on host "Game-DB" failed: first network error, wait for 15 seconds
                            2833:20160707:042441.691 resuming Zabbix agent checks on host "Game-DB": connection restored
                            2833:20160707:042441.754 Zabbix agent item "mysql.ping" on host "Game-DB" failed: first network error, wait for 15 seconds
                            2833:20160707:042456.764 resuming Zabbix agent checks on host "Game-DB": connection restored
                            2833:20160707:042456.811 Zabbix agent item "mysql.ping" on host "Game-DB" failed: first network error, wait for 15 seconds
                            2833:20160707:042511.884 Zabbix agent item "mysql.status[Com_rollback]" on host "Game-DB" failed: another network error, wait for 15 seconds
                            2833:20160707:042526.888 resuming Zabbix agent checks on host "Game-DB": connection restored
                            2833:20160707:042526.960 Zabbix agent item "mysql.status[Com_insert]" on host "Game-DB" failed: first network error, wait for 15 seconds
                            2833:20160707:042541.967 resuming Zabbix agent checks on host "Game-DB": connection restored
                            2833:20160707:042542.047 Zabbix agent item "mysql.status[Bytes_received]" on host "Game-DB" failed: first network error, wait for 15 seconds
                            2833:20160707:042557.057 resuming Zabbix agent checks on host "Game-DB": connection restored
                            2833:20160707:042557.142 Zabbix agent item "mysql.status[Bytes_received]" on host "Game-DB" failed: first network error, wait for 15 seconds
                            2833:20160707:042612.144 resuming Zabbix agent checks on host "Game-DB": connection restored
                            2833:20160707:042613.223 Zabbix agent item "mysql.ping" on host "Game-DB" failed: first network error, wait for 15 seconds
                            2833:20160707:042628.226 resuming Zabbix agent checks on host "Game-DB": connection restored
                            2833:20160707:042628.245 Zabbix agent item "mysql.status[Bytes_received]" on host "Game-DB" failed: first network error, wait for 15 seconds
                          How can I fix it? I have increased Timeout to 30 and pollers to 100, but I am still getting an error every 15 seconds...
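
When the log fills up like this, it can help to count which items and hosts fail most often before changing more settings. The sketch below is only an illustration: it matches both log layouts quoted in this thread (square brackets in the 2012 log, double quotes in the 2016 log) and counts "first" and "another" network errors per host and item. The log path in the usage note is an assumption.

```python
import re
from collections import Counter

# Matches: Zabbix agent item [key] on host [name] failed: first network error
# and:     Zabbix agent item "key" on host "name" failed: another network error
FAILURE = re.compile(
    r'Zabbix agent item [\["](?P<item>.+?)[\]"] on host [\["](?P<host>.+?)[\]"] '
    r'failed: (?P<kind>first|another) network error'
)


def count_failures(log_text: str) -> Counter:
    """Count network-error lines per (host, item) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILURE.search(line)
        if match:
            counts[(match.group("host"), match.group("item"))] += 1
    return counts


def report(counts: Counter) -> list[str]:
    """Format the counts, most frequently failing items first."""
    return [f"{n:5d}  {host}  {item}" for (host, item), n in counts.most_common()]
```

For example, `print("\n".join(report(count_failures(open("/var/log/zabbix/zabbix_proxy.log").read()))))`. If the failures are spread evenly over every item on one host, the cause is more likely on the connection or host level than in any single item.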
