How to check JVM availability without noData() function

  • mlange
    Member
    • Sep 2008
    • 78

    #1

    How to check JVM availability without noData() function

    We are monitoring a lot of Java Virtual Machines. All monitored servers have a Java agent deployed which requests local MBean data and sends it back to the Zabbix server. Until now we have checked availability via a nodata() trigger on one of the JMX items. This works well as long as the server does not queue items. When the queue grows, the item the trigger is based on is not polled, so the trigger becomes true and an alert is raised even though the JVM is up. Since the trigger is part of a template assigned to all JVMs, we get a lot of false alerts.

    Does anyone know a clever way to check availability without using the Zabbix agent? The trigger looks like this:
    <expression>{{HOSTNAME}:jmx[java.lang:type=Runtime][Uptime].nodata(600)}=1</expression>

    So if the item jmx[java.lang:type=Runtime][Uptime] is not polled within 600s (because items are stuck in the Zabbix queue), the alert fires. Using the count(600,1) function does not help either: same problem.
  • untergeek
    Senior Member
    Zabbix Certified Specialist
    • Jun 2009
    • 512

    #2
    It sounds like the problem is queuing in general. Any other solution using Zabbix is going to queue, isn't it?

    We use BEA WebLogic 9.2. We looked into Zapcat but our dev team did not want to incorporate that into our custom EAR files which would grant us access beyond the barest JVM information. One of my intrepid co-workers wrote a WAR file that ran in the JVM and queried the JMX info and spit it out into a simple web page. At my behest she made a plain version. I pull that information down via wget and parse it with simple shell scripting tools and return individual values like so: echo $(grep ^$2\= $TMPFILE | awk -F\= '{print $2}')

    This effectively splits lines like HeapFreeCurrent=603693056 into just the number. We do the same thing for status JMX items that return health codes like HEALTH_OK (though we are mapping those to digits for storage concerns). Because it's a web-based query it can actively check to verify that the JVM is up and running.

    And, yes, Uptime=210668 is one of the JMX queries you can make with this solution (sorry I can't share!).
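
For anyone who wants to try the same idea without our WAR file, here is a minimal sketch of just the parsing step, written in Python instead of grep/awk. It assumes the page is plain text with one key=value pair per line, as in the examples above. The key name HealthState, the sample text, and the digit mapping are made up for illustration; they are not taken from our actual setup.

```python
# Hypothetical mapping of health strings to digits (illustrative values only).
HEALTH_CODES = {"HEALTH_OK": 0, "HEALTH_WARN": 1}

# Made-up sample of what the plain status page might return.
SAMPLE = """HeapFreeCurrent=603693056
Uptime=210668
HealthState=HEALTH_OK
"""


def parse_key_values(text):
    """Turn 'key=value' lines into a dict, skipping lines without '='."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key] = value
    return values


def get_item(text, key):
    """Return one value: a mapped digit for health codes, an int for
    numeric values, otherwise the raw string. None if the key is missing."""
    value = parse_key_values(text).get(key)
    if value is None:
        return None
    if value in HEALTH_CODES:
        return HEALTH_CODES[value]
    try:
        return int(value)
    except ValueError:
        return value


if __name__ == "__main__":
    print(get_item(SAMPLE, "Uptime"))
```

A missing key returns None rather than an empty string, which makes it easier to tell "page fetched but key absent" apart from a real value.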


    • mlange
      Member
      • Sep 2008
      • 78

      #3
      The web solution existed before we started with JMX, and we did not want to keep it (it was also used from Nagios). Parsing the output via wget and Unix scripts does not sound very promising as a long-term solution. But to answer my own question: we created a new template which checks the HTTP port of the servers:

      net.tcp.service.perf[http,{$HOST},{$HTTP_PORT}]

      We set up a trigger which becomes true if the last five checks return 0. This works quite well for now. One disadvantage is that the trigger is no longer part of the JVM template.
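
To make the rule concrete, here is a rough sketch of the same logic outside Zabbix, in Python. It is not how Zabbix implements the item or the trigger. It only mimics the behaviour described above: a TCP check that returns 0 when the port is unreachable and the connect time in seconds otherwise, plus a "down" flag that is raised only once the last five results are all 0. The names tcp_check and DownDetector are invented for this example.

```python
import socket
import time
from collections import deque


def tcp_check(host, port, timeout=3.0):
    """Return the connect time in seconds, or 0.0 if the port is unreachable."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.monotonic() - start
            # Avoid reporting exactly 0.0 for a very fast successful connect.
            return max(elapsed, 1e-6)
    except OSError:
        return 0.0


class DownDetector:
    """Flags a host as down only when the last `window` results are all 0."""

    def __init__(self, window=5):
        self.results = deque(maxlen=window)

    def record(self, value):
        """Store one check result and return the current down state."""
        self.results.append(value)
        return self.is_down()

    def is_down(self):
        # Not enough samples yet means we do not alert.
        if len(self.results) < self.results.maxlen:
            return False
        return all(v == 0 for v in self.results)
```

Usage would be something like calling detector.record(tcp_check(host, port)) once per check interval. A single failed check does not raise the alert, which is the point of looking at five results instead of one.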


      • untergeek
        Senior Member
        Zabbix Certified Specialist
        • Jun 2009
        • 512

        #4
        Yeah, the parsed solution isn't the best, but it's as fast as zapcat since the webpage is queried locally and the results are all in the JMX tables. It's seriously instantaneous.

        It's not the solution for everyone, but it is working well for us and allows us a lot of flexibility and latitude in what we monitor and how.
