Troubleshooting Agent item on host failed: first network error

  • Sudain
    Junior Member
    • Aug 2016
    • 3

    #1

    Hello!

    I am running into an issue: I've created a new check (it leverages a sipp script), but I can't get Zabbix to run it automatically. Running zabbix_agentd -t sipp.call 17120 17120 6foXUKxe invokes the script properly at the agent level, so I think the problem is at the server level, in how the server invokes the agent. This particular agent and server are on the same box.

    I have several other user scripts on the same box, configured with similar parameters, that work just fine. From /var/etc/zabbix/zabbix_agentd.conf:
    UserParameter=snmp.test[*],/etc/zabbix/snmptest.bsh $1 $2 $3 $4 $5
    UserParameter=sipp.call[*],/etc/zabbix/sippPhoneCall.bsh $1 $2 $3
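For readers following along, here is a minimal sketch of how a flexible UserParameter (the `[*]` form above) turns an item key into a command line: the positional references $1 to $9 are replaced by the key's parameters. The function name `expand_user_parameter` is hypothetical, and the comma splitting is a simplification that does not handle quoted parameters; it illustrates the idea and is not the agent's actual parser.

```python
import re


def expand_user_parameter(command_template: str, item_key: str) -> str:
    """Substitute $1..$9 in a flexible UserParameter command with key parameters.

    Simplified sketch: parameters are split on commas, so quoted parameters
    that themselves contain commas are not handled.
    """
    match = re.fullmatch(r"[^\[\]]+\[(.*)\]", item_key)
    params = match.group(1).split(",") if match else []

    def substitute(ref: re.Match) -> str:
        index = int(ref.group(1)) - 1
        # Assumption: a reference with no matching parameter becomes empty.
        return params[index].strip() if index < len(params) else ""

    return re.sub(r"\$([1-9])", substitute, command_template)


print(expand_user_parameter(
    "/etc/zabbix/sippPhoneCall.bsh $1 $2 $3",
    "sipp.call[17120,17120,6foXUKxe]",
))
```

The bracketed key form used here, sipp.call[17120,17120,6foXUKxe], is the same form that appears in the server log, and it is the form that zabbix_agentd -t and zabbix_get -k expect.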

    In /var/log/zabbix-server/zabbix_server.log I see the following:
    11084:20160816:154147.306 Zabbix agent item [sipp.call[17120,17120,6foXUKxe]] on host [Zabbix server] failed: first network error, wait for 15 seconds
    11099:20160816:154205.065 Zabbix agent item [sipp.call[17120,17120,6foXUKxe]] on host [Zabbix server] failed: another network error, wait for 15 seconds
    11101:20160816:154220.069 resuming Zabbix agent checks on host [Zabbix server]: connection restored
    11082:20160816:154347.292 Zabbix agent item [sipp.call[17120,17120,6foXUKxe]] on host [Zabbix server] failed: first network error, wait for 15 seconds
    11099:20160816:154405.083 Zabbix agent item [sipp.call[17120,17120,6foXUKxe]] on host [Zabbix server] failed: another network error, wait for 15 seconds
    11101:20160816:154420.159 resuming Zabbix agent checks on host [Zabbix server]: connection restored

    Zabbix Version:
    Zabbix 2.0.1 Copyright 2001-2012 by Zabbix SIA

    Any help is appreciated; I'm not sure where to look for further diagnostics. Thank you in advance,
    Robert
  • registration_is_lame
    Senior Member
    • Nov 2007
    • 148

    #2
    The "network error" failures can be reduced with the settings below... at least that worked for me. I now see very few of them, and only rarely.

    In zabbix_agentd.conf, add these:
    StartAgents=10
    BufferSend=10
    BufferSize=150
    MaxLinesPerSecond=100
    Timeout=20

    Then restart the agent.
    I'm not sure about the other issues. I'd suggest setting the debug level to 4 and pasting the last 20 lines of the log here; maybe I can help. This community is weak: people who post questions just refresh the page every few minutes, counting views, and go without replies for days, even when details are provided for an interesting issue. Nothing from IRC either. Most people get their issues resolved and then stay out of the forums without contributing much. Besides that, there are lots of newbie questions on the forums.

    In zabbix_agentd.conf

    DebugLevel=4

    Then restart the agent.
    Last edited by registration_is_lame; 18-08-2016, 06:02.


    • Sudain
      Junior Member
      • Aug 2016
      • 3

      #3
      Thank you for the help, it's appreciated.

      I might have to increase those settings as I'm getting more network errors from all the scripts on the box, not just this one. o.O I'll keep tinkering with this idea.

      Here is the section regarding the script I'm trying to debug. The rest of the log is from all sorts of other agents trying to call home.

      28524:20160818:152005.643 In zbx_waitpid()
      28524:20160818:152005.643 zbx_waitpid() exited, status:0
      28524:20160818:152005.643 End of zbx_waitpid():29135
      28524:20160818:152005.643 Run remote command [/home/zabbix/sippPhoneCall.bsh 17120 17120 6foXUKxe] Result [183] [sipp -sf /home/zabbi]...
      28524:20160818:152005.643 Sending back [sipp -sf /home/zabbix/uac.xml 129.82.254.250 -m 1 -au 17120 -ap 6foXUKxe -s 17120 -i 129.82.254.254 -timeout 10s -timeout_error -d 2000 -bg -trace_screen
      Background mode - PID=[29145]]
      28524:20160818:152005.643 Got signal [signal:13(SIGPIPE),sender_pid:28524]. Ignoring ...
      28524:20160818:152005.643 Process listener error: ZBX_TCP_WRITE() failed: [32] Broken pipe
      28531:20160818:152006.372 In send_buffer() host:'129.82.254.254' port:10051 values:0/150
      28531:20160818:152006.372 End of send_buffer():SUCCEED
      28531:20160818:152006.372 Sleeping for 1 second(s)
      28532:20160818:152006.372 In send_buffer() host:'127.0.0.1' port:10051 values:0/150
      28532:20160818:152006.372 End of send_buffer():SUCCEED
      28532:20160818:152006.372 Sleeping for 1 second(s)
      28520:20160818:152006.376 In update_cpustats()
      28520:20160818:152006.376 End of update_cpustats()
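The SIGPIPE and "Broken pipe" lines above are what one would expect if the other side had already closed the connection by the time the agent tried to send the result, for example because the script ran longer than the configured Timeout. One rough way to check is to time the script against that limit. The sketch below assumes Python 3 is available on the box; `run_timed` is a hypothetical helper, and the sleeping child process is only a stand-in for the real script.

```python
import subprocess
import sys
import time


def run_timed(argv, timeout_seconds):
    """Run argv and return (elapsed_seconds, finished_in_time)."""
    start = time.monotonic()
    try:
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
        finished = True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        finished = False
    return time.monotonic() - start, finished


# Stand-in for the real script: a child process that sleeps for 1 second.
slow_command = [sys.executable, "-c", "import time; time.sleep(1)"]
elapsed, finished = run_timed(slow_command, timeout_seconds=0.3)
print(f"elapsed: {elapsed:.2f}s, finished in time: {finished}")
```

To try it on the real check, replace `slow_command` with the script's own command line and set `timeout_seconds` to the Timeout value in use.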

      ---- Agents trying to call home ----
      28528:20160818:152339.475 Run remote command [/etc/zabbix/snmptest.bsh .1.3.6.1.4.1.5003.11.1.1.1.1.6 .1.3.6.1.4.1.5003.11.1.1.1.1.7 trunk 0 10.174.0.13] Result [1] [4]...
      28528:20160818:152339.475 Sending back [4]
      28532:20160818:152340.411 In send_buffer() host:'127.0.0.1' port:10051 values:0/150
      28532:20160818:152340.411 End of send_buffer():SUCCEED
      28532:20160818:152340.411 Sleeping for 1 second(s)
      28531:20160818:152340.411 In send_buffer() host:'x.y.254.254' port:10051 values:0/150
      28531:20160818:152340.411 End of send_buffer():SUCCEED
      28531:20160818:152340.411 Sleeping for 1 second(s)
      28520:20160818:152340.427 In update_cpustats()
      28520:20160818:152340.427 End of update_cpustats()
      28528:20160818:152340.671 Processing request.
      28528:20160818:152340.671 Requested [snmp.test[.1.3.6.1.4.1.5003.11.1.1.1.1.6, .1.3.6.1.4.1.5003.11.1.1.1.1.7, trunk, 1, 10.174.0.13]]
      28528:20160818:152340.672 In zbx_popen() command:'/etc/zabbix/snmptest.bsh .1.3.6.1.4.1.5003.11.1.1.1.1.6 .1.3.6.1.4.1.5003.11.1.1.1.1.7 scrubbed scrubbed x.y.0.13'
      28528:20160818:152340.672 End of zbx_popen():7
      30000:20160818:152340.672 zbx_popen(): executing script
      28528:20160818:152340.750 In zbx_waitpid()
      28528:20160818:152340.750 zbx_waitpid() exited, status:0
      28528:20160818:152340.750 End of zbx_waitpid():30000
      28528:20160818:152340.750 Run remote command [/etc/zabbix/snmptest.bsh .1.3.6.1.4.1.5003.11.1.1.1.1.6 .1.3.6.1.4.1.5003.11.1.1.1.1.7 scrubbed scrubbed x.y.0.13] Result [1] [4]...
      28528:20160818:152340.750 Sending back [4]
      28532:20160818:152341.411 In send_buffer() host:'127.0.0.1' port:10051 values:0/150
      28532:20160818:152341.411 End of send_buffer():SUCCEED
      28532:20160818:152341.411 Sleeping for 1 second(s)
      28531:20160818:152341.411 In send_buffer() host:'x.y.254.254' port:10051 values:0/150
      28531:20160818:152341.411 End of send_buffer():SUCCEED
      28531:20160818:152341.411 Sleeping for 1 second(s)
      28520:20160818:152341.427 In update_cpustats()
      28520:20160818:152341.427 End of update_cpustats()


      • Sudain
        Junior Member
        • Aug 2016
        • 3

        #4

        Per the above thread I have attached the 3 images they recommend. I hope this helps explain why I'm getting the network timed out errors.

        I found a way to keep my sipp script from timing out (the script itself took too long), but the host still has trouble connecting to other agents.


        11081:20160822:144553.729 Zabbix agent item [system.cpu.util[,iowait]] on host [SipX Sterling] failed: first network error, wait for 15 seconds
        11091:20160822:144553.731 Zabbix agent item [proc.num[,,run]] on host [SipX Denver 1] failed: first network error, wait for 15 seconds
        11077:20160822:144553.817 Zabbix agent item [vfs.fs.size[/home,used]] on host [Office 365 Virtual IP] failed: first network error, wait for 15 seconds
        11086:20160822:144553.842 Zabbix agent item [system.cpu.util[,idle]] on host [Sipx6 - Canary] failed: first network error, wait for 15 seconds
        11083:20160822:144553.991 Zabbix agent item [net.if.out[eth0]] on host [Sipx2 - Canary] failed: first network error, wait for 15 seconds
        11089:20160822:144553.993 Zabbix agent item [vfs.fs.size[/var,used]] on host [Sipx5 - Canary] failed: first network error, wait for 15 seconds
        11082:20160822:144554.992 Zabbix agent item [system.localtime] on host [Sipx3 - Canary] failed: first network error, wait for 15 seconds
        11092:20160822:144554.994 Zabbix agent item [vfs.fs.inode[/,pfree]] on host [Sipx2 - Canary] failed: another network error, wait for 15 seconds
        11076:20160822:144554.994 Zabbix agent item [net.if.in[eth0]] on host [Sipx6 - Canary] failed: another network error, wait for 15 seconds
        11093:20160822:144554.995 Zabbix agent item [proc.num[]] on host [SipX Denver 1] failed: another network error, wait for 15 seconds
        11080:20160822:144554.995 Zabbix agent item [vm.memory.size[available]] on host [Sipx2 - Canary] failed: another network error, wait for 15 seconds
        11084:20160822:144554.999 Zabbix agent item [system.localtime] on host [SipX Denver 3] failed: first network error, wait for 15 seconds
        11085:20160822:144555.002 Zabbix agent item [vfs.fs.size[/,used]] on host [SipX Main 4] failed: first network error, wait for 15 seconds
        11090:20160822:144555.752 Zabbix agent item [system.boottime] on host [SipX Denver 1] failed: another network error, wait for 15 seconds
        11081:20160822:144556.731 Zabbix agent item [vfs.fs.inode[/boot,pfree]] on host [Sipx2 - Canary] failed: another network error, wait for 15 seconds
        11086:20160822:144556.845 Zabbix agent item [system.swap.size[,free]] on host [Sipx3 - Canary] failed: another network error, wait for 15 seconds
        11083:20160822:144556.999 Zabbix agent item [vfs.fs.size[/boot,used]] on host [SipX Main 4] failed: another network error, wait for 15 seconds
        11089:20160822:144557.005 Zabbix agent item [vfs.fs.size[/,used]] on host [Oncall1] failed: first network error, wait for 15 seconds
        11095:20160822:144608.036 resuming Zabbix agent checks on host [Office 365 Virtual IP]: connection restored
        11097:20160822:144608.042 resuming Zabbix agent checks on host [Sipx5 - Canary]: connection restored
        11104:20160822:144608.216 resuming Zabbix agent checks on host [SipX Sterling]: connection restored
        11103:20160822:144609.038 resuming Zabbix agent checks on host [Sipx6 - Canary]: connection restored
        11099:20160822:144609.040 resuming Zabbix agent checks on host [SipX Denver 3]: connection restored
        11102:20160822:144610.041 resuming Zabbix agent checks on host [SipX Denver 1]: connection restored
        11103:20160822:144611.044 resuming Zabbix agent checks on host [Sipx2 - Canary]: connection restored
        11095:20160822:144611.046 resuming Zabbix agent checks on host [SipX Main 4]: connection restored
        11102:20160822:144611.047 resuming Zabbix agent checks on host [Sipx3 - Canary]: connection restored
        11099:20160822:144612.047 resuming Zabbix agent checks on host [Oncall1]: connection restored
        11087:20160822:144643.116 Zabbix agent item [proc.num[,,run]] on host [SipX Sterling] failed: first network error, wait for 15 seconds
        11097:20160822:144659.218 resuming Zabbix agent checks on host [SipX Sterling]: connection restored
        11078:20160822:144702.823 Zabbix agent item [system.swap.size[,pfree]] on host [SipX Sterling] failed: first network error, wait for 15 seconds
        11099:20160822:144717.169 resuming Zabbix agent checks on host [SipX Sterling]: connection restored
        11084:20160822:145638.411 Zabbix agent item [agent.ping] on host [SipX Sterling] failed: first network error, wait for 15 seconds
        11103:20160822:145654.161 resuming Zabbix agent checks on host [SipX Sterling]: connection restored
        11091:20160822:145658.416 Zabbix agent item [sipx.proc[sipxpark]] on host [SipX Sterling] failed: first network error, wait for 15 seconds
        11084:20160822:145658.922 Zabbix agent item [system.cpu.util[,user]] on host [SipX Sterling] failed: another network error, wait for 15 seconds
        11088:20160822:145659.954 Zabbix agent item [sipx.proc[sipxprovision]] on host [SipX Sterling] failed: another network error, wait for 15 seconds
        11083:20160822:145700.216 Zabbix agent item [system.localtime] on host [SipX Sterling] failed: another network error, wait for 15 seconds
        11080:20160822:145700.971 Zabbix agent item [sipx.proc[sipxproxy]] on host [SipX Sterling] failed: another network error, wait for 15 seconds
        11092:20160822:145701.002 Zabbix agent item [sipx.proc[sipxpublisher]] on host [SipX Sterling] failed: another network error, wait for 15 seconds
        11081:20160822:145701.016 Zabbix agent item [system.swap.size[,free]] on host [SipX Sterling] failed: another network error, wait for 15 seconds
        11095:20160822:145716.227 resuming Zabbix agent checks on host [SipX Sterling]: connection restored
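With this many lines it can help to tally the failures per host, to see whether one host dominates or the errors are spread evenly. The sketch below assumes the server log keeps the line format shown above; the function name `count_network_errors` and the regular expression are illustrative, not part of Zabbix.

```python
import re
from collections import Counter

# Matches the "first network error" / "another network error" lines only.
ERROR_LINE = re.compile(
    r"Zabbix agent item \[(?P<item>.+)\] on host \[(?P<host>[^\]]+)\] "
    r"failed: (?:first|another) network error"
)


def count_network_errors(lines):
    """Count network-error log lines per host name."""
    counts = Counter()
    for line in lines:
        found = ERROR_LINE.search(line)
        if found:
            counts[found.group("host")] += 1
    return counts


# A few lines copied from the log above; "connection restored" lines are ignored.
sample = [
    "11081:20160822:144553.729 Zabbix agent item [system.cpu.util[,iowait]] on host [SipX Sterling] failed: first network error, wait for 15 seconds",
    "11091:20160822:144553.731 Zabbix agent item [proc.num[,,run]] on host [SipX Denver 1] failed: first network error, wait for 15 seconds",
    "11093:20160822:144554.995 Zabbix agent item [proc.num[]] on host [SipX Denver 1] failed: another network error, wait for 15 seconds",
    "11104:20160822:144608.216 resuming Zabbix agent checks on host [SipX Sterling]: connection restored",
]

for host, total in count_network_errors(sample).most_common():
    print(f"{host}: {total}")
```

To run it against the real log, pass an open file instead of `sample`, for example `open("/var/log/zabbix-server/zabbix_server.log")`.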