JMX failed: first network error, wait for 15 seconds

  • bpc-ruslan
    Member
    • Jul 2014
    • 32

    #1

    JMX failed: first network error, wait for 15 seconds

    Good afternoon.

    Once or twice a day, JMX monitoring drops out on half of the hosts, or even on all of them. Messages like these appear in the logs:

    32741:20140910:044212.678 JMX agent item "jmx["com.mchange.v2.c3p0:type=PooledDataSource[z8kflu931wazdbw19rw4xg|2d977ced]",threadPoolNumActiveThreads]" on host "xxx" failed: first network error, wait for 15 seconds
    32729:20140910:044221.838 JMX agent item "jmx["com.mchange.v2.c3p0:type=PooledDataSource[2rxg6v94jh73691j0xnig|388b401d]",numConnections]" on host "xxx failed: first network error, wait for 15 seconds
    32731:20140910:044222.929 JMX agent item "jmx["com.mchange.v2.c3p0:type=PooledDataSource[2rxg6w94jewwlb83ledp|78c1a023]",numIdleConnections]" on host "xxx" failed: first network error, wait for 15 seconds
    32741:20140910:044233.682 JMX agent item "jmx["java.lang:type=MemoryPool,name=Perm Gen",Usage.committed]" on host "xxx" failed: first network error, wait for 15 seconds
    I have 7 hosts under JMX monitoring.

    The settings are as follows:
    /etc/zabbix/zabbix_server.conf
    StartJavaPollers=6
    Timeout=15
    /etc/zabbix/zabbix_java_gateway.conf
    START_POLLERS=12
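To see whether the drop-outs cluster on particular hosts or at particular times, the server log lines quoted above can be tallied. The following is a minimal sketch, not part of Zabbix: the function name `count_first_errors` and the per-hour bucketing are illustrative choices, and the pattern is only inferred from the four sample lines in this post.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines in this thread:
#   PID:YYYYMMDD:HHMMSS.mmm JMX agent item "<key>" on host "<host>" failed: first network error, ...
# The closing quote after the host is optional because one pasted sample lacks it.
LINE_RE = re.compile(
    r'^\s*(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6})\.\d{3} '
    r'JMX agent item "(?P<key>.*)" on host "(?P<host>[^"]*)"? failed: '
    r'first network error'
)

def count_first_errors(lines):
    """Count 'first network error' events per (host, hour) bucket."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            hour = f"{m['date']} {m['time'][:2]}:00"
            counts[(m['host'], hour)] += 1
    return counts
```

Feeding it the lines of the server log (for example via `open(path)`) gives a counter keyed by host and hour. If every host fails within the same minute, the cause is more likely on the network or gateway side than in any single JVM.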
  • enzorik
    Member
    • Feb 2014
    • 37

    #2
    Good afternoon.
    First, check zabbix_java.log for errors.
    Also look at the "Zabbix busy java poller processes %" graph for the Zabbix server to see whether there are enough pollers to monitor the JMX items.


    • bpc-ruslan
      Member
      • Jul 2014
      • 32

      #3
      One of the hosts, xxx.xxx.xxx.17, shows up with an error. The other three that dropped out at the same time are, for some reason, not in the log. The Java pollers were not heavily loaded: 5-10%.

      2014-09-10 04:43:00.895 [pool-1-thread-5] WARN com.zabbix.gateway.SocketProcessor - error processing request
      com.zabbix.gateway.ZabbixException: java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: xxx.xxx.xxx.17; nested exception is:
      java.net.ConnectException: Connection timed out]
      at com.zabbix.gateway.JMXItemChecker.getValues(JMXItemChecker.java:98) ~[zabbix-java-gateway-2.2.1.jar:na]
      at com.zabbix.gateway.SocketProcessor.run(SocketProcessor.java:63) ~[zabbix-java-gateway-2.2.1.jar:na]
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) ~[na:1.6.0_32]
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.6.0_32]
      at java.lang.Thread.run(Thread.java:701) ~[na:1.6.0_32]
      Caused by: java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: xxx.xxx.xxx.17; nested exception is:
      java.net.ConnectException: Connection timed out]
      at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:355) ~[na:1.6.0_32]
      at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:268) ~[na:1.6.0_32]
      at com.zabbix.gateway.JMXItemChecker.getValues(JMXItemChecker.java:90) ~[zabbix-java-gateway-2.2.1.jar:na]
      ... 4 common frames omitted
      Caused by: javax.naming.ServiceUnavailableException: null
      at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:118) ~[na:1.6.0_32]
      at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:203) ~[na:1.6.0_32]
      at javax.naming.InitialContext.lookup(InitialContext.java:409) ~[na:1.6.0_32]
      at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1915) ~[na:1.6.0_32]
      at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1884) ~[na:1.6.0_32]
      at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:289) ~[na:1.6.0_32]
      ... 6 common frames omitted
      Caused by: java.rmi.ConnectException: Connection refused to host: xxx.xxx.xxx.17; nested exception is:
      java.net.ConnectException: Connection timed out
      at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619) ~[na:1.6.0_32]
      at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216) ~[na:1.6.0_32]
      at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202) ~[na:1.6.0_32]
      at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:340) ~[na:1.6.0_32]
      at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source) ~[na:1.6.0_32]
      at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:114) ~[na:1.6.0_32]
      ... 11 common frames omitted
      Caused by: java.net.ConnectException: Connection timed out
      at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.6.0_32]
      at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) ~[na:1.6.0_32]
      at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) ~[na:1.6.0_32]
      at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) ~[na:1.6.0_32]
      at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:385) ~[na:1.6.0_32]
      at java.net.Socket.connect(Socket.java:546) ~[na:1.6.0_32]
      at java.net.Socket.connect(Socket.java:495) ~[na:1.6.0_32]
      at java.net.Socket.<init>(Socket.java:392) ~[na:1.6.0_32]
      at java.net.Socket.<init>(Socket.java:206) ~[na:1.6.0_32]
      at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40) ~[na:1.6.0_32]
      at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:146) ~[na:1.6.0_32]
      at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613) ~[na:1.6.0_32]
      ... 16 common frames omitted
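The innermost cause in the trace is a plain socket-level "Connection timed out" while contacting the RMI registry, so one way to narrow things down is to probe the JMX port directly from the gateway machine at the times the errors occur. Below is a minimal sketch using only the Python standard library; `probe_tcp` is an illustrative helper, and the JMX port is not given anywhere in this thread, so it has to be supplied by whoever runs it.

```python
import socket

def probe_tcp(host, port, timeout=5.0):
    """Try a plain TCP connect and classify the outcome.

    Returns "open", "timeout", "refused", or "error: <details>".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        # No answer at all within the timeout (packets dropped or host too busy).
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        return f"error: {exc}"
```

A "timeout" result generally points at dropped packets or an unresponsive host rather than a stopped JVM, which would normally produce "refused". Note that JMX over RMI also uses a second, separately assigned RMI server port, so an open registry port alone does not prove the whole connection path is healthy.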


      • enzorik
        Member
        • Feb 2014
        • 37

        #4
        Try increasing the timeout: Timeout=30.
        Alternatively, you can monitor zabbix_java.log for errors and, if the timeout occurs again, restart the Zabbix Java Gateway.
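The log-watching idea could be sketched roughly as follows. This is only an illustration under assumptions: the record format is inferred from the single excerpt posted in #3, the threshold of 3 is arbitrary, and the restart command (`service zabbix-java-gateway restart`) is a guess at the service name that must be adapted to the actual installation.

```python
import re
import subprocess

# A new log record starts with a timestamp such as "2014-09-10 04:43:00.895 [".
STAMP_RE = re.compile(r"^\s*\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} \[")
TIMEOUT_RE = re.compile(r"java\.net\.ConnectException: Connection timed out")

def count_timeout_events(lines):
    """Count log records (not lines) that mention a connect timeout.

    One stack trace repeats the message several times, so each record
    is counted at most once.
    """
    events = 0
    counted = False
    for line in lines:
        if STAMP_RE.match(line):
            counted = False  # a new record begins here
        if not counted and TIMEOUT_RE.search(line):
            events += 1
            counted = True
    return events

def should_restart(lines, threshold=3):
    """Decide whether enough timeout records were seen (threshold is arbitrary)."""
    return count_timeout_events(lines) >= threshold

# Assumed service name; adjust to the init system and package in use.
RESTART_CMD = ["service", "zabbix-java-gateway", "restart"]

def restart_gateway():
    """Run the (assumed) restart command and raise if it fails."""
    subprocess.run(RESTART_CMD, check=True)
```

In practice this would be run periodically over only the lines appended since the previous check, so that old errors do not trigger repeated restarts.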
