Hi!
I have an issue: from time to time, Java Gateway fails to get data from Cassandra nodes. (It _may be_ related to CPU load on the monitored nodes, but I can't find any correlation.)
For debugging, I started Java Gateway as follows, in a screen session on an EC2 instance:
ZABBIX_VERSION="4.0.3"
JMX_PORT=10999
cd /usr/share/zabbix-java-gateway
java \
-server \
-Xms1024m \
-Xmx2048m \
-Dlogback.configurationFile=/etc/zabbix/zabbix_java_gateway_logback.xml \
-classpath lib:lib/android-json-4.3_r3.1.jar:lib/logback-classic-0.9.27.jar:lib/logback-core-0.9.27.jar:lib/slf4j-api-1.6.1.jar:bin/zabbix-java-gateway-${ZABBIX_VERSION}.jar \
-Dzabbix.pidFile=/var/run/zabbix/zabbix_java_gateway.pid \
-Dzabbix.listenIP=0.0.0.0 \
-Dzabbix.listenPort=10052 \
-Dzabbix.startPollers=10 \
-Dzabbix.timeout=30 \
-Djava.rmi.server.hostname="$(curl http://169.254.169.254/latest/meta-data/public-hostname 2>/dev/null)" \
-Dcom.sun.management.jmxremote.rmi.port=${JMX_PORT} \
-Dcom.sun.management.jmxremote.port=${JMX_PORT} \
-Dcom.sun.management.jmxremote.authenticate="false" \
-Dcom.sun.management.jmxremote="true" \
-Dcom.sun.management.jmxremote.ssl="false" \
-Djava.net.preferIPv4Stack=true \
-Dcom.sun.management.jmxremote.local.only="false" \
-Dsun.rmi.transport.tcp.responseTimeout=100000 \
com.zabbix.gateway.JavaGateway
The Java Gateway itself is also JMX-enabled and monitored. I do not see any gaps in its own charts, but I do see gaps in the charts of the other JMX-monitored nodes.
In the logs I see:
2018-12-24 11:01:52.711 [pool-1-thread-2] WARN com.zabbix.gateway.SocketProcessor - error processing request: Broken pipe (Write failed)
2018-12-24 11:01:52.711 [pool-1-thread-2] WARN com.zabbix.gateway.SocketProcessor - error sending failure notification: Broken pipe (Write failed)
2018-12-24 11:01:53.542 [pool-1-thread-7] WARN com.zabbix.gateway.SocketProcessor - error processing request: Broken pipe (Write failed)
2018-12-24 11:01:53.542 [pool-1-thread-7] WARN com.zabbix.gateway.SocketProcessor - error sending failure notification: Broken pipe (Write failed)
2018-12-24 11:01:54.541 [pool-1-thread-5] WARN com.zabbix.gateway.SocketProcessor - error processing request: Broken pipe (Write failed)
2018-12-24 11:01:54.541 [pool-1-thread-5] WARN com.zabbix.gateway.SocketProcessor - error sending failure notification: Broken pipe (Write failed)
2018-12-24 11:02:25.239 [pool-1-thread-10] WARN com.zabbix.gateway.SocketProcessor - error processing request: Broken pipe (Write failed)
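To check whether these warnings cluster at particular times (and so whether they line up with load on the monitored nodes), one option is to count them per minute. This is only a minimal sketch: it assumes the timestamp layout shown in the lines above, the function name is made up for illustration, and the log path in the usage comment is hypothetical (the real one is whatever the logback configuration file sets).

```shell
# Count "Broken pipe" warnings per minute in Java Gateway log lines read from stdin.
# Assumes each line starts with "YYYY-MM-DD HH:MM:SS.mmm", as in the excerpt above.
count_broken_pipes_per_minute() {
    grep 'Broken pipe' \
        | awk '{ print $1, substr($2, 1, 5) }' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $3, $1 }'
}

# Usage (hypothetical log path, adjust to your logback configuration):
# count_broken_pipes_per_minute < /var/log/zabbix/zabbix_java_gateway.log
```

Each output line has the form "date HH:MM count", which can be compared against CPU or GC charts for the same minute.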
The JMX-monitored nodes run Cassandra, and I have configured a discovery rule:
Key: jmx.discovery[attributes,"org.apache.cassandra.metrics:type=Table,keyspace=*,scope=*,name=ReadLatency"]
It returns ~300 items because I have many tables.
So my questions are:
- What are the recommended JVM settings for Java Gateway if I need to monitor ~10,000 items or more?
- What are the recommended settings for
-Dzabbix.startPollers ?
-Dzabbix.timeout ?
- How many Java pollers are recommended in zabbix_server.conf?
I currently have
StartJavaPollers=100
Any suggestions would be appreciated!
Best Regards,
Max