Linux services(processes) discovery

  • adriano
    Junior Member
    • Jan 2011
    • 26

    #1

    Linux services(processes) discovery

    Greetings,

    I'm really excited about the idea of service discovery for my Linux services (processes), but after creating the script and testing it, I could not get it to work on the Zabbix server.

    The script is here:
    Code:
    #!/bin/bash
    # Emit Zabbix low-level discovery JSON: one {#PSNAME} entry per unique process name.
    first=1
    # The fourth column of 'ps -e' output is the command name.
    pscmdvars=$(ps -e --no-headers | awk '{ print $4; }' | sort | uniq)
    
    echo -ne "{\n"
    echo -ne "\t\"data\":[\n"
    
    for psname in $pscmdvars
    do
            # Separate entries with a comma, except before the first one.
            if [ "$first" -eq 0 ]; then echo -ne ",\n"; fi
            first=0
    
            echo -ne "\t\t{\n"
            echo -ne "\t\t\t\"{#PSNAME}\":\"$psname\""
            echo -ne "}"
    done
    
    echo -ne "]"
    echo -ne "}\n"
    - Comment: I guess the sort & uniq commands are not needed, but I added them to reduce errors from multiple processes sharing the same name. My guess was that discovery might fail when it tries to create more than one item from the same string.
    - Comment2: The command 'ps -e' lists all processes but not threads. I'm still thinking about that; it would be nice to pick up threads too, not only the main process.

    After creating it I have added some lines to zabbix_agentd.conf to make it work better:
    Code:
    EnableRemoteCommands=1
    UserParameter=svcauto,/home/zabbix/scripts/svcauto.sh
    After restarting the zabbix_agentd service it now returns data from the monitoring agent, as seen below.

    [root@zabbix ~]# zabbix_agentd -c /usr/local/etc/zabbix_agentd.conf -t net.if.discovery
    Code:
    net.if.discovery                              [s|{
            "data":[
                    {
                            "{#IFNAME}":"lo"},
                    {
                            "{#IFNAME}":"eth0"}]}]
    [root@zabbix ~]# zabbix_agentd -c /usr/local/etc/zabbix_agentd.conf -t svcauto
    Code:
    svcauto                                       [t|{
            "data":[
                    {
                            "{#PSNAME}":"aio/0"},
                    {
                            "{#PSNAME}":"async/mgr"},
                    {
                            "{#PSNAME}":"ata/0"},
                    {
                            "{#PSNAME}":"ata_aux"},
                    {
                            "{#PSNAME}":"auditd"},
                    {
                            "{#PSNAME}":"awk"},
                    {
                            "{#PSNAME}":"bash"},
                    {
                            "{#PSNAME}":"bdi-default"},
                    {
                            "{#PSNAME}":"cgroup"},
                    {
                            "{#PSNAME}":"crond"},
                    {
                            "{#PSNAME}":"crypto/0"},
                    {
                            "{#PSNAME}":"events/0"},
                    {
                            "{#PSNAME}":"ext4-dio-unwrit"},
                    {
                            "{#PSNAME}":"flush-253:0"},
                    {
                            "{#PSNAME}":"fping"},
                    {
                            "{#PSNAME}":"httpd"},
                    {
                            "{#PSNAME}":"init"},
                    {
                            "{#PSNAME}":"jbd2/dm-0-8"},
                    {
                            "{#PSNAME}":"jbd2/sda1-8"},
                    {
                            "{#PSNAME}":"kacpid"},
                    {
                            "{#PSNAME}":"kacpi_hotplug"},
                    {
                            "{#PSNAME}":"kacpi_notify"},
                    {
                            "{#PSNAME}":"kauditd"},
                    {
                            "{#PSNAME}":"kblockd/0"},
                    {
                            "{#PSNAME}":"kdmflush"},
                    {
                            "{#PSNAME}":"khelper"},
                    {
                            "{#PSNAME}":"khubd"},
                    {
                            "{#PSNAME}":"khungtaskd"},
                    {
                            "{#PSNAME}":"kintegrityd/0"},
                    {
                            "{#PSNAME}":"kpsmoused"},
                    {
                            "{#PSNAME}":"kseriod"},
                    {
                            "{#PSNAME}":"ksmd"},
                    {
                            "{#PSNAME}":"ksoftirqd/0"},
                    {
                            "{#PSNAME}":"kstriped"},
                    {
                            "{#PSNAME}":"ksuspend_usbd"},
                    {
                            "{#PSNAME}":"kswapd0"},
                    {
                            "{#PSNAME}":"kthreadd"},
                    {
                            "{#PSNAME}":"kthrotld/0"},
                    {
                            "{#PSNAME}":"master"},
                    {
                            "{#PSNAME}":"md/0"},
                    {
                            "{#PSNAME}":"md_misc/0"},
                    {
                            "{#PSNAME}":"migration/0"},
                    {
                            "{#PSNAME}":"mingetty"},
                    {
                            "{#PSNAME}":"mpt/0"},
                    {
                            "{#PSNAME}":"mpt_poll_0"},
                    {
                            "{#PSNAME}":"mysqld"},
                    {
                            "{#PSNAME}":"mysqld_safe"},
                    {
                            "{#PSNAME}":"netns"},
                    {
                            "{#PSNAME}":"pciehpd"},
                    {
                            "{#PSNAME}":"pickup"},
                    {
                            "{#PSNAME}":"pm"},
                    {
                            "{#PSNAME}":"ps"},
                    {
                            "{#PSNAME}":"qmgr"},
                    {
                            "{#PSNAME}":"rsyslogd"},
                    {
                            "{#PSNAME}":"scsi_eh_0"},
                    {
                            "{#PSNAME}":"scsi_eh_1"},
                    {
                            "{#PSNAME}":"scsi_eh_2"},
                    {
                            "{#PSNAME}":"sh"},
                    {
                            "{#PSNAME}":"sort"},
                    {
                            "{#PSNAME}":"sshd"},
                    {
                            "{#PSNAME}":"svcauto.sh"},
                    {
                            "{#PSNAME}":"sync_supers"},
                    {
                            "{#PSNAME}":"udevd"},
                    {
                            "{#PSNAME}":"uniq"},
                    {
                            "{#PSNAME}":"usbhid_resumer"},
                    {
                            "{#PSNAME}":"vmmemctl"},
                    {
                            "{#PSNAME}":"watchdog/0"},
                    {
                            "{#PSNAME}":"zabbix_agentd"},
                    {
                            "{#PSNAME}":"zabbix_server"}]}]
    Note that one returns the data with
    Code:
    net.if.discovery                              [s|{
    and another with
    Code:
    svcauto                                       [t|{
    (Could this be the problem?)

    I've created a template named 'tpl_svc_auto' with the following Discovery Rule.
    Code:
    Name: Service discovery test
    Type: Zabbix agent
    Key: svcauto
    Update interval (in sec): 60
    Keep lost resources period (in days): 60
    
    Filter
    Macro: {#PSNAME}
    Regexp: @Services for discovery
    And the following Item prototype.
    Name: Number of running processes $1
    Type: Zabbix agent
    Key: svcauto[{#PSNAME}]
    Type of information: ---> I tried every option in the list, but none of them worked
    Update interval (in sec): 60
    Keep history: 90

    I did all of this yesterday and applied it to the Zabbix server. Why does it work from the command line but not from the Zabbix server?

    Any ideas for troubleshooting?
    Last edited by adriano; 29-08-2013, 19:17. Reason: More info about the script
  • LenR
    Senior Member
    • Sep 2009
    • 1005

    #2
    Did you create any prototypes? I think you can use zabbix_get to test the LLD script as well.
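
    A rough sketch of that check, assuming the agent is reachable on 127.0.0.1 (the host and the helper name lld_names are only illustrative). The helper lists the {#PSNAME} values it finds in the discovery JSON, so it is easy to see what the server would receive:

```shell
#!/bin/sh
# Sketch: list the {#PSNAME} values found in LLD JSON read from stdin.
# lld_names is a hypothetical helper name, not part of Zabbix.
lld_names() {
    grep -o '"{#PSNAME}":"[^"]*"' | sed 's/^"{#PSNAME}":"//; s/"$//'
}

# Against a live agent the input would come from zabbix_get, for example
# (the host address is an assumption):
#   zabbix_get -s 127.0.0.1 -k svcauto | lld_names

# Self-contained demonstration with a fixed sample:
printf '%s\n' '{"data":[{"{#PSNAME}":"sshd"},{"{#PSNAME}":"crond"}]}' | lld_names
```

    If zabbix_get returns the JSON but no items appear, the rule itself is probably fine and the prototypes are the next thing to look at.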


    • steveboyson
      Senior Member
      • Jul 2013
      • 582

      #3
      Your LLD rule is correct; my self-written LLD rules also return [t].

      What you are missing are correct item prototypes. You cannot reuse the LLD rule's key as an item key, since an item returns only ONE value.

      So, specify another "UserParameter=" entry in your zabbix_agentd.conf and write some logic for it.

      Key would be something like
      mem.used[{#PSNAME}]

      The script you refer to in zabbix_agentd.conf then takes the process name as first param and returns the amount of memory in use (for example).
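
      A minimal sketch of such a script, assuming it is saved as mem_used.sh (a made-up name) and that summing the resident set size of every process with that name is what you want to report:

```shell
#!/bin/sh
# Sketch of a hypothetical mem_used.sh: total resident memory (KiB) of all
# processes with a given command name. The script name is an assumption.

# Sum one number per line from stdin; prints 0 when there is no input.
sum_kb() {
    awk '{ sum += $1 } END { print sum + 0 }'
}

# Print the summed RSS of every process whose command name is exactly $1.
mem_used_kb() {
    ps -C "$1" -o rss= | sum_kb
}

# When called with a process name (as the agent would do), report its memory.
if [ "$#" -gt 0 ]; then
    mem_used_kb "$1"
fi
```

      The matching agent entry would then look something like UserParameter=mem.used[*],/home/zabbix/scripts/mem_used.sh $1, where [*] passes the parameter from the item key to the script as $1.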

      By the way:
      Do you really want to monitor *ALL* processes? You will most likely catch single-shot processes, which will clutter your database...
      So I would suggest defining your macro so that it only includes the "wanted" processes.
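
      One way to do that on the agent side is to filter the names before they are turned into JSON. This is only a sketch: the WANTED list and the lld_json helper are made-up examples, so adjust them to your own services.

```shell
#!/bin/sh
# Sketch: emit Zabbix LLD JSON only for process names on a whitelist.
# WANTED is a hypothetical extended regex; adjust it to your own services.
WANTED='^(sshd|httpd|mysqld|crond)$'

# Read process names on stdin (one per line) and print LLD JSON on stdout,
# keeping only the names that match WANTED.
lld_json() {
    awk -v wanted="$WANTED" '
        BEGIN       { printf "{\"data\":[" }
        $0 ~ wanted { printf "%s{\"{#PSNAME}\":\"%s\"}", sep, $0; sep = "," }
        END         { print "]}" }
    '
}

# In the discovery script the input would be the live process list, e.g.:
#   ps -e -o comm= | sort -u | lld_json

# Self-contained demonstration with a fixed sample ("bash" is filtered out):
printf 'sshd\nbash\ncrond\n' | lld_json
```

      Filtering here keeps short-lived helpers such as ps, awk or sort out of the discovery result before the server ever sees them.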
