Monitoring multiple application instances per server

    I've seen questions here and there about how to monitor multiple instances of the same service running on the same physical machine, but nothing elegant or scalable has been suggested.

    -------------------------------------------------------------

    I see two solutions for this; both require extending Zabbix. I'd like feedback on them before deciding what to do next.

    1) Parametrized templates

    Implementation:
    • Introduce a concept of "abstract template" or "compile-time macros".
      Upon linking with such a template, a new template is generated with these macros replaced by the given parameters. The generated template's attributes cannot be edited beyond what is editable for a host attribute linked to a template. When the base template is altered, the generated ones are regenerated.
    • Since such templates can be linked to others, the ability to link to a specific nested template multiple times has to be added as well (this implies reference counting for correct unlinking and the like).
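A minimal sketch of what "linking with parameters" could look like, to make the proposal concrete. Nothing here is Zabbix code: the `{%NAME}` placeholder syntax, the `AbstractTemplate` class and the `app.heap[...]` item keys are all made up for illustration. It only shows that the same base template can be linked several times with different parameters, and that changing the base regenerates every linked copy.

```python
import re
from dataclasses import dataclass, field

# Hypothetical "compile-time macro" syntax {%NAME}; not a real Zabbix macro form.
PLACEHOLDER = re.compile(r"\{%([A-Z_]+)\}")


@dataclass
class AbstractTemplate:
    name: str
    item_keys: list
    # One entry per link: sorted (param, value) pairs -> generated item keys.
    links: dict = field(default_factory=dict)

    def _expand(self, params):
        # A KeyError here means the link did not supply a required parameter.
        return [PLACEHOLDER.sub(lambda m: params[m.group(1)], key)
                for key in self.item_keys]

    def link(self, **params):
        """Link the template once more, with its own set of parameters."""
        generated = self._expand(params)
        self.links[tuple(sorted(params.items()))] = generated
        return generated

    def unlink(self, **params):
        """Drop exactly one link; the other links stay untouched."""
        self.links.pop(tuple(sorted(params.items())), None)

    def update(self, item_keys):
        """Alter the base template and regenerate every linked copy."""
        self.item_keys = item_keys
        for link_id in self.links:
            self.links[link_id] = self._expand(dict(link_id))
```

Linking the same template with `PORT="4500"` and again with `PORT="4600"` gives two independent sets of item keys, which is the "link multiple times" requirement from the second bullet.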


    Pros:
    • "Natural" way of representation - everything regarding a server is within its host entry. In the hosts list, one can see all instances of all apps assigned to the machine (in "linked templates")
    • The "abstract templates" make a basis for distributable libraries of templates to monitor all sorts of apps
    • The service templates can make use of any other information available about the host: say, make triggers depend on "Ping failed"


    Cons:
    • Very complex design - a few new concepts and a whole new level of abstraction. A lot of work to implement this
    • Complicates data view greatly - a server entry becomes very large and inconvenient to view. Entries of different instances must be labelled somehow to distinguish between them
    • Any inter-host logic has to be duplicated at intra-host level


    2) Separate host entry for each service instance
    Implementation:
    • The Zabbix agent (active) must be made to support gathering data for several host entries at the same time. Ways to specify them: a comma-separated list, or any entries with the IP the agent is running on (it's better not to implement wildcards: host entry names should not be compound).
    • Currently, this can be achieved by running several Zabbix agents with different config files. But this requires maintaining a multitude of files that are almost identical, and the client side has to be reconfigured whenever the list changes on the server side.
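The current workaround (one agent per host entry, each with its own config file) can at least be scripted so the near-identical files are generated rather than maintained by hand. The following is a rough sketch under assumptions: the file naming scheme, the `/tmp` paths and the "one port per instance, counting up from 10050" rule are illustrative choices, not anything prescribed by Zabbix. The parameter names (`Server`, `ServerActive`, `Hostname`, `ListenPort`, `PidFile`, `LogFile`) are regular agent config parameters.

```python
# Shared skeleton; only Hostname, ListenPort and the per-agent file paths differ.
BASE_CONFIG = """\
Server={server}
ServerActive={server}
Hostname={hostname}
ListenPort={listen_port}
PidFile=/tmp/zabbix_agentd_{hostname}.pid
LogFile=/tmp/zabbix_agentd_{hostname}.log
"""


def render_agent_configs(server, hostnames, first_port=10050):
    """Return {file name: config text}, one entry per Zabbix host entry."""
    configs = {}
    for offset, hostname in enumerate(hostnames):
        file_name = f"zabbix_agentd_{hostname}.conf"
        configs[file_name] = BASE_CONFIG.format(
            server=server,
            hostname=hostname,
            listen_port=first_port + offset,  # each agent needs its own port
        )
    return configs
```

Writing the files out and starting one agent per file is left out. The sketch does not remove the second drawback from the bullet above: the host list still has to be kept in sync with the server by whoever runs the script.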


    Pros:
    • Very simple design, few changes needed
    • Distributable templates can still be compiled


    Cons:
    • The link between the "base" host entry and its service entries is lost. To add the aforementioned trigger dependency, I have to do it manually or link every service entry to a "ping" template (and receive multiple notifications if it fires unless some trickery is done)


    -------------------------------------------------------------

    So, which way shall we go?

    #2
    Instead of auxiliary "generated macros", substituting "compile-time templates" can be introduced into the process of replicating templates' entities into hosts. This simplifies updates (the items needing change can still be tracked by templateid).
    Last edited by __Vano; 24-10-2011, 20:10.



      #3
      Originally posted by __Vano View Post
      Instead of auxiliary "generated macros", substituting "compile-time templates" can be introduced into the process of replicating templates' entities into hosts. This simplifies updates (the items needing change can still be tracked by templateid).
      I may have failed to emphasize this. The quoted suggestion eliminates the aforementioned "whole new level of abstraction" and "complex design". The 1st option becomes but an addition to the existing template linking and tracking algorithms.



        #4
        I would love to see something like this implemented. I have this problem too, and Zabbix doesn't offer an easy solution for it.

        I ended up trying a crude version of
        1) Parametrized templates
        by doing a search and replace on a placeholder name to produce a separate template per service. For instance, I had a template containing the name:

        Code:
        <xml>
        (...)
        $SERVICE - blabla blabla
        (...)
        </xml>
        And then I do something like

        Code:
        sed 's/\$SERVICE/service1/g' template.xml > service1.xml
        I then imported the result into Zabbix manually. This was neither pretty nor practical: whenever the template changed, I had to repeat the whole thing by hand (replacing, i.e. "compiling", the template and importing it again), so I don't recommend it. We aren't using this anymore.
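For reference, the same search-and-replace step can be done for a whole list of services in one go. This is only a sketch of the workaround described above, with the same drawback (every change still means re-importing by hand); the function name and the sample XML are illustrative. It relies on Python's `string.Template`, whose `$NAME` placeholder form happens to match the `$SERVICE` convention used in the post.

```python
from string import Template


def compile_templates(xml_text, services):
    """Return {service: XML text with $SERVICE replaced by that service name}."""
    base = Template(xml_text)
    # safe_substitute leaves any other "$..." sequences in the XML untouched
    # instead of raising, which matters if item names contain things like $1.
    return {service: base.safe_substitute(SERVICE=service) for service in services}


if __name__ == "__main__":
    sample = "<name>$SERVICE - blabla blabla</name>"
    for service, xml in compile_templates(sample, ["service1", "service2"]).items():
        print(service, "->", xml)
```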

        How do other monitoring solutions solve the "multiple instances of the same application on the same host" problem?



          #5
          Why isn't LLD suitable?

          I have multiple instances of the same application running on a box.
          I have created a simple JSON file that contains one "Record" for each instance.

          Would that be suitable?

          Obviously you can replace the flat JSON file with a Python/Perl/Bash script that autodiscovers the services you are after.
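A minimal sketch of such a script, assuming the list of instances is already known (here a plain dict; a real script would find them by inspecting processes, ports or config files). The macro names `{#INSTANCE}` and `{#PORT}` are made up for this example. The output uses the `{"data": [...]}` layout with one record per instance.

```python
import json


def build_lld_payload(instances):
    """Turn {instance name: port} into an LLD-style JSON document."""
    records = [
        {"{#INSTANCE}": name, "{#PORT}": str(port)}
        for name, port in sorted(instances.items())
    ]
    return json.dumps({"data": records}, indent=2)


if __name__ == "__main__":
    # Hypothetical instances; replace with whatever your discovery finds.
    print(build_lld_payload({"instance1": 8080, "instance2": 8081}))
```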



            #6
            Oh, sorry, I didn't realise this is an old post.



              #7
              Whether LLD works depends on which interface your items are monitored through. For instance, it doesn't work well with the standard Zabbix JMX interface, which is a pain point for us at my company. Say we're running multiple Tomcat instances on the same host: even with low-level discovery finding both instances, all I get is dynamic items/triggers, whereas what we really need is a separate JMX interface created for each instance's port.

              Today, our approach is to treat each Tomcat instance as its own host, but this isn't ideal either. For instance we have:

              MyHost <--- Linux server, apply Linux template
              MyHost-Tomcat4500 <---- First Tomcat instance running on MyHost on port 4500 - apply Tomcat template
              MyHost-Tomcat4600 <---- Second Tomcat instance running on MyHost on port 4600 - apply Tomcat template

              So unfortunately the hosts are linked only through a naming convention. This does look better on the Overview screen, though, as I can view the same items side by side for comparison, e.g. compare heap usage across all Tomcat servers.
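Since the naming convention is the only link, any tooling that wants to treat the instance hosts as children of the base host has to parse the names. A small sketch of that, assuming the `<base>-<App><port>` pattern from the example above holds for every instance host; the regular expression and function name are illustrative, not part of Zabbix.

```python
import re

# Matches names like "MyHost-Tomcat4500": base host, app name, port.
INSTANCE_NAME = re.compile(r"^(?P<base>.+)-(?P<app>[A-Za-z]+)(?P<port>\d+)$")


def group_by_base_host(host_names):
    """Return {base host: [(app, port), ...]} recovered from host names."""
    groups = {}
    for name in host_names:
        match = INSTANCE_NAME.match(name)
        if match:
            instances = groups.setdefault(match["base"], [])
            instances.append((match["app"], int(match["port"])))
        else:
            # Names that don't follow the convention are treated as base hosts.
            groups.setdefault(name, [])
    return groups
```

The weakness is the one already stated: a host that is renamed or named slightly differently silently drops out of its group.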



                #8
                Hi, I'm happy to see we're not the only ones facing this issue.
                Has there been any recent development in regard to this feature?
                Thanks



                  #9
                  Originally posted by rrupp View Post
                  Whether LLD works depends on which interface your items are monitored through. For instance, it doesn't work well with the standard Zabbix JMX interface, which is a pain point for us at my company. Say we're running multiple Tomcat instances on the same host: even with low-level discovery finding both instances, all I get is dynamic items/triggers, whereas what we really need is a separate JMX interface created for each instance's port.

                  Today, our approach is to treat each Tomcat instance as its own host, but this isn't ideal either. For instance we have:

                  MyHost <--- Linux server, apply Linux template
                  MyHost-Tomcat4500 <---- First Tomcat instance running on MyHost on port 4500 - apply Tomcat template
                  MyHost-Tomcat4600 <---- Second Tomcat instance running on MyHost on port 4600 - apply Tomcat template

                  So unfortunately the hosts are linked only through a naming convention. This does look better on the Overview screen, though, as I can view the same items side by side for comparison, e.g. compare heap usage across all Tomcat servers.
                  I've recently created a discovery rule to monitor multiple instances of a JBoss server. The discovery script uses a tool called twiddle to gather information about ports. This could also be done by examining the .xml config files, but I settled on twiddle.

                  After using twiddle to gather the JBoss JNP ports, the script builds a JSON object that returns those ports as data. The discovery rule then fills my macros with those values, and I can create an item prototype.

                  This is my agent-side discovery script, called zabbix_lld_jnp_ports.sh:


                  Code:
                  #!/bin/bash

                  # Nothing to discover without a server IP.
                  if [[ -z "$1" ]]; then
                      exit
                  fi

                  SERVER_IP=$1
                  # partition name has been removed, use your own
                  PARTITION="<partition-name>"
                  TWIDDLE=/opt/twiddle/twiddle.sh

                  # user and password have been removed, use your own
                  TWIDDLE_USER="<user>"
                  TWIDDLE_PASSWORD="<password>"

                  # Ask the HA partition for its current view and keep the JNP ports bound to this IP.
                  JNP_PORTS=$("${TWIDDLE}" -s "${SERVER_IP}" -u "${TWIDDLE_USER}" -p "${TWIDDLE_PASSWORD}" get "jboss:service=HAPartition,partition=${PARTITION}" CurrentView | \
                             grep -o "${SERVER_IP}:[0-9]*" | \
                             cut -d':' -f 2)

                  JSON=""
                  for PORT in ${JNP_PORTS};
                  do
                      SERVER_NAME=$("${TWIDDLE}" -s "${SERVER_IP}:${PORT}" -u "${TWIDDLE_USER}" -p "${TWIDDLE_PASSWORD}" get --noprefix "jboss.system:type=ServerConfig" ServerName)
                      # One LLD record per instance; the AJP port is the JNP port plus a fixed offset.
                      JSON="${JSON:+$JSON,\n}\t\t{ \"{#JBOSS_INSTANCE}\":\"${SERVER_NAME}\", \"{#JNP_PORT}\":\"${PORT}\", \"{#AJP_PORT}\":\"$((PORT+6910))\"}"
                  done

                  JSON="{\n\t\"data\": [\n${JSON}\n\t]\n}\n"

                  # -e expands the \n and \t sequences embedded above.
                  echo -e "${JSON}"
                  That produces items for each record returned in the JSON. For example, say the script returns JSON like this:

                  Code:
                  {
                  	"data": [
                  		{ "{#JBOSS_INSTANCE}":"instance1", "{#JNP_PORT}":"1399", "{#AJP_PORT}":"8309"},
                  		{ "{#JBOSS_INSTANCE}":"instance2", "{#JNP_PORT}":"1299", "{#AJP_PORT}":"8209"}
                  	]
                  }
                  And this is my UserParameter config in zabbix_agentd.conf file:

                  Code:
                  UserParameter=twiddle[*],/opt/twiddle/twiddle.sh -s <ip>:$4 -u <user> -p <password> $1 $2 $3|cut -d"=" -f2
                  UserParameter=jnp.ports.discovery[*],/opt/zabbix_lld/zabbix_lld_jnp_ports.sh $1
                  Then I added a discovery rule with key:

                  Code:
                  jnp.ports.discovery[{HOST.IP}]
                  The item prototypes are then added, for example, with key:
                  Code:
                  twiddle[get,"jboss.web:type=ThreadPool,name=ajp-{HOST.IP}-{#AJP_PORT}",currentThreadCount,{#JNP_PORT}]
                  It will generate 2 items with these keys:

                  Code:
                  twiddle[get,"jboss.web:type=ThreadPool,name=ajp-{HOST.IP}-8309",currentThreadCount,1399]
                  twiddle[get,"jboss.web:type=ThreadPool,name=ajp-{HOST.IP}-8209",currentThreadCount,1299]
                  Now, this is fine for monitoring one specific thing for a particular instance. If you have more than one service to monitor for each instance, it becomes tedious.

                  The main problem is how to generate JSON data that doesn't repeat itself and that can be filtered in a rule.
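To check a discovery script's output before wiring it into Zabbix, the expansion of an item prototype can be mimicked offline. This is only an illustration of the idea (plain string replacement of each `{#MACRO}` with its discovered value), not how the Zabbix server implements it; the function name is made up, and host macros such as `{HOST.IP}` are deliberately left alone because they are resolved elsewhere.

```python
import json


def expand_prototype(prototype_key, lld_json):
    """Return one item key per discovered record, with LLD macros filled in."""
    keys = []
    for record in json.loads(lld_json)["data"]:
        key = prototype_key
        for macro, value in record.items():
            key = key.replace(macro, value)
        keys.append(key)
    return keys


if __name__ == "__main__":
    discovered = json.dumps({"data": [
        {"{#JBOSS_INSTANCE}": "instance1", "{#JNP_PORT}": "1399", "{#AJP_PORT}": "8309"},
        {"{#JBOSS_INSTANCE}": "instance2", "{#JNP_PORT}": "1299", "{#AJP_PORT}": "8209"},
    ]})
    prototype = ('twiddle[get,"jboss.web:type=ThreadPool,'
                 'name=ajp-{HOST.IP}-{#AJP_PORT}",currentThreadCount,{#JNP_PORT}]')
    for key in expand_prototype(prototype, discovered):
        print(key)
```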
                  Last edited by gentunian; 11-04-2014, 22:50.



                    #10
                    Originally posted by __Vano View Post
                    I've seen questions here and there how to monitor multiple instances of the same service running on the same physical machine - with nothing elegant or scalable suggested.
                    [..]
                    So, which way shall we go?
                    One thing. At the Zabbix database layer, all the items that monitor multiple instances of the same object/application are just normal items assigned to templates, each with a different itemid and so on.
                    If you look at it from that end, you quickly reach the conclusion that all you need is a set of Zabbix modifications at the presentation layer only, which in this case means the web interface.
                    Only this and nothing more ..

                    At the moment, monitoring multiple instances of the same resource on a monitored host can be done with LLD, which shows that the problem is almost solved .. from the data (in database) representation point of view. Using LLD when you know exactly how many instances of object type A exist on box X only makes things look dynamic while the monitored object is in fact fully static, which makes the whole approach a little artificial.

                    Again: from the Zabbix data representation point of view, monitoring multiple instances of some application is doable even now.
                    All that is needed is a presentation-layer solution.
                    http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
                    https://kloczek.wordpress.com/
                    zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
                    My zabbix templates https://github.com/kloczek/zabbix-templates



                      #11
                      Originally posted by kloczek View Post
                      Using LLD when you know exactly how many instances of object type A exist on box X only makes things look dynamic while the monitored object is in fact fully static, which makes the whole approach a little artificial.
                      You may know the exact number of instances, but with LLD you can still scale. The filesystem example in the manual illustrates that, but it's too simplistic. Imagine you add new server instances to a cluster: without LLD you would need to modify the template. Using LLD to discover your resources solves that for you, and if you have done things right, you don't need to worry about anything.

                      I think LLD is a really useful feature that should be promoted as the main extension mechanism Zabbix has as far as templating is concerned.

                      Templates are great in combination with LLD. The main disadvantage I've found is with prototypes. Disabling a single item created by an item prototype is simple, but disabling ALL of them at once is not doable.



                        #12
                        Originally posted by gentunian View Post
                        You may know the exact number of instances, but with LLD you can still scale. The filesystem example in the manual illustrates that, but it's too simplistic. Imagine you add new server instances to a cluster: without LLD you would need to modify the template. Using LLD to discover your resources solves that for you, and if you have done things right, you don't need to worry about anything.

                        I think LLD is a really useful feature that should be promoted as the main extension mechanism Zabbix has as far as templating is concerned.

                        Templates are great in combination with LLD. The main disadvantage I've found is with prototypes. Disabling a single item created by an item prototype is simple, but disabling ALL of them at once is not doable.
                        Monitoring a clustered environment is a different kind of beast.
                        In a typical cluster configuration you operate on resources and resource groups with dependencies (between resources and/or between groups).
                        For example, take an active-standby cluster with a web service resource group that has its own IP, its own storage (moved between the active and standby nodes) and an application like Apache. On migration to another node, these resources must be shut down in an order defined by the dependencies. First Apache goes down, then the IP is removed from the network interface(s), and finally the volume is unmounted and the SCSI LUN can be unlocked at the SCSI reservation layer (because this resource is part of the voting mechanism). Once everything is down, the cluster can start everything on the second node in reverse order: first it checks the votes and reserves the SCSI LUN, locking it. Then the volume is mounted, after that the IP is configured, and finally the application (Apache) is started.
                        This is a simple and quite typical scenario.

                        Adding monitoring to this picture is a piece of cake: a Zabbix agent with its own configuration is started as yet another resource in the resource group, and this agent is bound to the cluster node IP.

                        Sharing the configuration between the cluster node's Zabbix agent and the clustered service's Zabbix agent would be risky, and in a critical situation it could freeze the whole active-standby migration in the Zabbix agent.

                        At the moment you cannot reload the Zabbix agent configuration to make it start or stop listening on some IP in the passive monitoring case (on Solaris, for example, you have exclusive IPs, so applications listening on 0.0.0.0 by definition do not listen on those IPs). The problem is even bigger with active monitoring: you cannot tell the Zabbix agent to talk to the Zabbix server or proxy over multiple IPs, so running a separate Zabbix agent inside the cluster group is the only solution for now. You cannot have multiple Hostname entries either.
                        In such a configuration the Zabbix agent is part of the resource group, with a dependency on the IP and on the storage that holds the Zabbix agent configuration file and/or binaries.

                        Nevertheless, in such a scenario the problem of multiple instances of the same application does not exist.
                        Last edited by kloczek; 12-04-2014, 23:46.


                          #13
                          I know this is a bit old, but it is really interesting.
                          Is there any progress on this subject?



                            #14
                            Originally posted by Ammaralk View Post
                            I know this is a bit old, but it is really interesting.
                            Is there any progress on this subject?
                            What kind of progress do you expect here?
                            Depending on what you must monitor, all you need is to implement the exact set of metrics.


                              #15
                              I am quite new to Zabbix.

                              I have been working on a Zabbix server for two months as part of my educational phase at SAP,
                              and now I have a problem monitoring multiple instances on a single machine. I have been trying for quite some time and so far I haven't managed it.
                              The database is SAP HANA.

                              thanks
