How to configure Zabbix to monitor individual processes?

  • rcollier
    Member
    • Sep 2013
    • 53

    #1

    How to configure Zabbix to monitor individual processes?

    Hello everyone,

    I am running Zabbix server on CentOS 6.4, and the majority of my Zabbix agents are running on AIX 6.1 servers.

    I am interested in learning how to configure a Zabbix agent to collect data on a particular process so that I can monitor an application and make sure it's up and running.

    What I've read so far seems to suggest that the best way to do this is to configure the UserParameter in the zabbix_agentd.conf file.

    I've been reading this thread https://www.zabbix.com/forum/showthread.php?t=16338 and I'm still not sure what I need to do.

    How would I configure the UserParameter in the zabbix_agentd.conf file to monitor a specific application?

    My UserParameter looks like this:
    Code:
    UserParameter=dom.status,ps -ef | grep dom | wc -l
    I'm not sure if that's even remotely right. If it is right, how do I configure the key?

    I'm assuming I would need to add an item under the template. What do I put in the key field so that I link the item with the agent?

    Any help would be appreciated.

    Thanks,
  • tchjts1
    Senior Member
    • May 2008
    • 1605

    #2
    Good ole AIX. I am so glad we don't have that beast in our current environment. But we did at my previous employer.

    Let me give you an example of one we did for monitoring port states.
    In zabbix_agentd.conf, you add a UserParameter line that defines the key, along with the command that gets your data. For monitoring port states, we put this (currently used on Solaris, but the same principle applies):

    Code:
    # kpi.net.state[status] where 
    #        status = [ TIME_WAIT, CLOSE_WAIT, BOUND, ESTABLISHED, LISTEN, IDLE ] 
    #
    
    UserParameter=kpi.net.state[*],/usr/bin/netstat -an | grep $1 | wc -l | awk '{ print $$1 }'
    You have to restart the agent for that to take effect.
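
    As a rough sketch of what that command does: with a `[*]` key, the agent substitutes `$1` with the parameter from the item key (for example kpi.net.state[TIME_WAIT]), and `$$1` is how a literal `$1` is passed through to awk. The sample lines and the function names below are made up so the pipeline can be tried without a real netstat; they are not from the actual setup.

```shell
#!/bin/sh
# Hypothetical stand-in for `/usr/bin/netstat -an`, so the sketch runs anywhere.
sample_netstat() {
  printf '%s\n' \
    'tcp 0 0 10.0.0.5.22 10.0.0.9.51000 ESTABLISHED' \
    'tcp 0 0 10.0.0.5.80 10.0.0.7.40112 TIME_WAIT' \
    'tcp 0 0 10.0.0.5.80 10.0.0.8.40113 TIME_WAIT' \
    'tcp 0 0 *.22 *.* LISTEN'
}

# Mirrors the UserParameter command. Here $1 plays the role of the key
# parameter that the agent substitutes. In zabbix_agentd.conf the awk field
# is written $$1; in a plain shell function it is simply $1.
count_state() {
  sample_netstat | grep "$1" | wc -l | awk '{ print $1 }'
}

count_state TIME_WAIT    # prints 2
count_state CLOSE_WAIT   # prints 0
```

    The trailing awk only strips the padding that some `wc -l` implementations add, so the agent gets a bare number.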

    Now refer to the screenshots below (in order) for what has to be entered in the Zabbix frontend:
    1. This is what each item looks like at the template level
    2. A detailed view of the item for CLOSE_WAIT
    3. The output of the data from those 6 userparameters into one meaningful graph.
    Attached Files
    Last edited by tchjts1; 24-09-2013, 23:49.


    • tchjts1
      Senior Member
      • May 2008
      • 1605

      #3
      Hmm, well you are talking about specific process monitoring, but maybe my above post will give you a little better understanding of how the steps work... so I will leave it there.


      • rcollier
        Member
        • Sep 2013
        • 53

        #4
        Thanks for your reply, that definitely helps.

        I guess where I'm lost is how to properly configure the item key and then the trigger for it.


        • tchjts1
          Senior Member
          • May 2008
          • 1605

          #5
          Ok, let's take your example of

          Code:
          UserParameter=dom.status,ps -ef | grep dom | wc -l
          if you wanted to apply that to every AIX host you monitored, you would have to enter that into the zabbix_agentd.conf file on every one of your AIX hosts and restart the agents.

          Then in the Zabbix frontend, you would go to Configuration --> Templates and find your Template OS AIX and click on "Items", then "Create item". Follow my screenshots in the previous examples of this thread on where to enter your information, but you would want to change my example to use yours.
          Your key would be dom.status
          The other fields are self-explanatory, I believe.
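
          One thing worth checking with a `ps -ef | grep dom | wc -l` style command is that the grep process itself can show up in the listing and get counted. Below is a minimal sketch of that effect using made-up `ps` output (the paths, PIDs and function names are hypothetical), plus the common `[d]om` bracket trick that avoids it.

```shell
#!/bin/sh
# Hypothetical `ps -ef` output. The last line stands for the grep process
# from the pipeline itself, shown with whatever pattern it was started with.
sample_ps() {
  printf '%s\n' \
    'UID PID PPID C STIME TTY TIME CMD' \
    'app 4101 1 0 09:00 - 0:12 /opt/app/bin/dom -d' \
    'app 4102 4101 0 09:00 - 0:03 /opt/app/bin/dom -worker' \
    "root 5120 5001 0 09:30 pts/0 0:00 grep $1"
}

# Plain pattern: the "grep dom" line also contains "dom", so it is counted.
naive_count() {
  sample_ps 'dom' | grep 'dom' | wc -l | awk '{ print $1 }'
}

# Bracket pattern: the regex [d]om still matches "dom", but the text
# "grep [d]om" does not contain "dom", so grep's own line is skipped.
strict_count() {
  sample_ps '[d]om' | grep '[d]om' | wc -l | awk '{ print $1 }'
}

naive_count    # prints 3
strict_count   # prints 2

# Assumed agent config line using the same trick (untested on AIX):
#   UserParameter=dom.status,ps -ef | grep '[d]om' | wc -l
```

          Zabbix also has a built-in proc.num[] item key for counting processes; whether it fits depends on the agent platform and on how the process shows up in the process table, so treat that as something to verify rather than a drop-in replacement.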

          As for a trigger, take a look at this screenshot on how I trigger for too many TIME_WAITS. You do this basically the same as the item we just created... go to your OS AIX template and this time click on "Triggers" and then create trigger...
          Attached Files
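
          For the process check, the trigger boils down to "fire when the count drops below 1". The sketch below only imitates that comparison locally; the expression in the comment is an assumption about Zabbix 2.x trigger syntax and should be checked against your version, and `trigger_state` is a hypothetical name.

```shell
#!/bin/sh
# Assumed trigger expression (Zabbix 2.x style, not taken from the screenshot):
#   {Template OS AIX:dom.status.last(0)}<1
# The function below mimics that comparison for a given item value.
trigger_state() {
  if [ "$1" -lt 1 ]; then
    echo PROBLEM   # no matching process was counted
  else
    echo OK        # at least one matching process is running
  fi
}

trigger_state 0   # prints PROBLEM
trigger_state 2   # prints OK
```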


          • rcollier
            Member
            • Sep 2013
            • 53

            #6
            Thanks for your reply tchjts1!

            I am definitely making progress. I created the item and trigger, and Zabbix is currently gathering data on that particular process. All that's left to figure out is how to configure my UserParameter and item to report information that would be easy for management to understand.

            Thanks for your help!


            • tchjts1
              Senior Member
              • May 2008
              • 1605

              #7
              What kind of information is it? If it is a return code of something like 0 or 1, you can make use of value mapping and change the 0 or 1 into something more meaningful.
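
              To illustrate the idea, here is a small sketch that reduces a process count to 1 or 0 and then shows the kind of label a value map would display for it. The function names and the "Up"/"Down" labels are hypothetical; the real mapping is defined in the Zabbix frontend, not in shell.

```shell
#!/bin/sh
# Reduce a process count to 1 (running) or 0 (not running), the sort of
# value a UserParameter could return instead of a raw count.
up_or_down() {
  if [ "$1" -gt 0 ]; then echo 1; else echo 0; fi
}

# Imitates what a value map does in the frontend: the stored number stays
# the same, only the displayed label changes. Labels are made up.
map_value() {
  case "$1" in
    0) echo "Down" ;;
    1) echo "Up" ;;
    *) echo "Unknown" ;;
  esac
}

map_value "$(up_or_down 3)"   # prints Up
map_value "$(up_or_down 0)"   # prints Down
```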

              If it is a number of processes that are running that you are reporting on, I would just create a graph and then put it into a screen.

              It may be worth your time to play around with creating a screen and putting the graph on it from one of your hosts. When selecting the graph on the screen, put a check in the "Dynamic item" checkbox. Then when you go to Monitoring --> Screens and select that screen, you will have a dropdown list of hosts you can choose from that will populate data into that graph.

              This is where creating proper Host Groups starts to come into play, as that is one of the dropdown options to select from. For example, since this item applies only to AIX hosts, you would probably want a group called AIX servers. This would limit your host selection to just those servers.

              A good basic set of Host Groups that I have found to work well for me is something along these lines:
              Linux Servers - All
              Linux Servers - Dev
              Linux Servers - Prod
              Solaris Servers - All
              Solaris Servers - Dev
              Solaris Servers - Prod
              Windows Servers - All
              Windows Servers - Dev
              Windows Servers - Prod

              I then have what I call sub groups. These are groups that I put servers into according to the server's physical location, the application they support, and the proxy they report to. To keep the list somewhat organized, I prefix those group names with LOC (for location), APP (for the application they support) and PRX (for the proxy they report to).

              So those groups may look like this:

              APP - MSSQL
              APP - PeopleSoft
              LOC - WestDataCenter
              LOC - EastDataCenter
              PRX - ZabPrx01
              PRX - ZabPrx02

              That gives me further granularity when looking at screens, applying templates, etc. Just be aware that the more host groups you create, the longer your dashboard becomes.

              I obviously have too much time on my hands this morning. Hope I am not confusing your original issue.


              • rcollier
                Member
                • Sep 2013
                • 53

                #8
                The UserParameter command returns how many processes are running for a particular application. So basically my agents are reporting a value between 1 and 3, depending on what the application is doing.

                This is what my graph looks like



                How would I make the data that it is reporting more meaningful? Would I just mess around with the unit type in the item?


                • tchjts1
                  Senior Member
                  • May 2008
                  • 1605

                  #9
                  Yeah, that graph is a bit funky for reporting a whole number. What are you using for Type of Information and Data Type for that item? For "Units" you could just put something like "procs"

                  This is what I would use... this should show a whole number instead of what you are seeing now.
                  Attached Files


                  • rcollier
                    Member
                    • Sep 2013
                    • 53

                    #10
                    I changed the unit type to "procs" and it's reporting in whole numbers now. My information type and data type look identical to yours.
