Is zabbix_agentd using a lot of CPU?

  • iskondi
    Junior Member
    • Jan 2010
    • 6

    #1

    Is zabbix_agentd using a lot of CPU?

    Hello guys,

    I'm working on deploying Zabbix and I've run into something that surprised me, so I wanted to post and see if it's expected behavior or not.

    I have the Zabbix agent installed on a few different machines. I've added some of those machines to Zabbix and some I haven't added yet.

    When I perform a top on a machine running zabbix_agentd that HASN'T been added to the server I see:

    top - 10:53:33 up 8 days, 18:09, 1 user, load average: 2.87, 3.44, 3.65
    Tasks: 454 total, 10 running, 444 sleeping, 0 stopped, 0 zombie
    Cpu(s): 3.9%us, 0.5%sy, 0.0%ni, 86.4%id, 8.9%wa, 0.0%hi, 0.1%si, 0.0%st
    Mem: 24675600k total, 19230144k used, 5445456k free, 13908k buffers
    Swap: 0k total, 0k used, 0k free, 14752052k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    643 root 10 -5 0 0 0 S 0.0 0.0 22:28.71 kswapd1
    642 root 10 -5 0 0 0 S 0.0 0.0 20:36.20 kswapd0
    12208 zabbix 22 5 12444 1848 1716 S 0.7 0.0 8:55.28 zabbix_agentd
    4574 root 10 -5 0 0 0 S 0.0 0.0 7:41.58 xfsdatad/3
    4578 root 10 -5 0 0 0 S 0.0 0.0 7:21.71 xfsdatad/7
    4572 root 10 -5 0 0 0 S 0.0 0.0 5:55.91 xfsdatad/1
    6873 oracle 16 0 101m 36m 8720 S 0.0 0.2 4:32.35 emagent
    5704 root 10 -5 0 0 0 S 0.0 0.0 3:49.33 rpciod/15
    5698 root 10 -5 0 0 0 S 0.0 0.0 3:11.98 rpciod/9
    4582 root 10 -5 0 0 0 S 0.0 0.0 2:28.20 xfsdatad/11
    20296 perftest 18 0 784m 264m 62m S 0.7 1.1 1:57.86 java
    5690 root 10 -5 0 0 0 S 0.0 0.0 1:56.46 rpciod/1
    25066 oracle 16 0 12.7g 3.4g 3.4g D 0.3 14.6 1:49.06 oracle
    24444 oracle 15 0 12.7g 3.3g 3.3g S 0.0 14.2 1:27.35 oracle
    26776 oracle 15 0 12.7g 3.2g 3.2g S 0.0 13.5 1:25.01 oracle
    5692 root 10 -5 0 0 0 S 0.0 0.0 1:08.00 rpciod/3

    On the above machine I restarted the zabbix_agentd yesterday, so it's been running roughly 20 hours...

    On a machine that I am actively monitoring with zabbix I see:

    top - 10:54:49 up 58 days, 1:08, 3 users, load average: 0.05, 0.08, 0.07
    Tasks: 183 total, 2 running, 181 sleeping, 0 stopped, 0 zombie
    Cpu(s): 0.1%us, 1.3%sy, 0.0%ni, 97.9%id, 0.2%wa, 0.0%hi, 0.5%si, 0.0%st
    Mem: 16431944k total, 16354816k used, 77128k free, 269568k buffers
    Swap: 8388600k total, 0k used, 8388600k free, 15437968k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    2550 root 15 -5 0 0 0 S 0.0 0.0 129:00.60 kjournald2
    4185 root 39 19 0 0 0 R 0.0 0.0 47:05.04 kipmi0
    3465 root 15 -5 0 0 0 S 0.0 0.0 36:56.38 nfsd
    3462 root 15 -5 0 0 0 S 0.0 0.0 36:49.89 nfsd
    3464 root 15 -5 0 0 0 S 0.0 0.0 36:49.57 nfsd
    3467 root 15 -5 0 0 0 S 0.0 0.0 36:47.85 nfsd
    3466 root 15 -5 0 0 0 S 0.0 0.0 36:23.11 nfsd
    3463 root 15 -5 0 0 0 S 0.0 0.0 36:09.45 nfsd
    3461 root 15 -5 0 0 0 S 0.0 0.0 35:48.32 nfsd
    3468 root 15 -5 0 0 0 S 0.0 0.0 35:39.96 nfsd
    299 root 15 -5 0 0 0 S 0.0 0.0 22:57.12 kswapd0
    16002 zabbix 25 5 11728 836 704 S 2.0 0.0 20:31.08 zabbix_agentd
    25254 nobody 20 0 141m 11m 10m S 0.3 0.1 20:30.46 smbd
    5382 root 20 0 203m 17m 1288 S 0.0 0.1 6:32.49 dsm_om_connsvc3
    3750 root 20 0 105m 1456 848 S 0.0 0.0 6:16.48 nmbd
    3747 root 20 0 132m 2596 1436 S 0.0 0.0 4:30.75 smbd
    574 root 15 -5 0 0 0 S 0.0 0.0 4:04.33 kjournald

    The configuration on both machines is:

    Server=ZabbixServer
    StartAgents=3
    PidFile=/var/tmp/zabbix_agentd.pid
    LogFile=/tmp/zabbix_agentd.log
    Timeout=3
    Hostname=MyHostName
    DisableActive=1

    I'm running the default collection options that ship with the latest version of Zabbix...

    Is this normal/expected behavior? The total CPU usage seems REALLY high...

    Thanks,
    David
  • richlv
    Senior Member
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Oct 2005
    • 3112

    #2
    it's a bit hard to look at the misaligned output (maybe try formatting it in [code] block), but i couldn't see anything "really high" in there. which value exactly caught your attention ?
    Zabbix 3.0 Network Monitoring book


    • iskondi
      Junior Member
      • Jan 2010
      • 6

      #3
      Sorry, you're right, that is hard to read.
      What I'm seeing is that the cumulative CPU time of the zabbix_agentd process is near the top of the list on both boxes.

      It's not that the boxes themselves are overloaded, but that zabbix_agentd, even on a machine that isn't being monitored, is accumulating a lot of CPU time.

      On the non-monitored machine it's up to 8 hours of time and on the monitored machine it's up to 20 hours.

      Let me figure out how to format the top output better, but the gist of it is: why is a machine, monitored or not, accumulating so much CPU time in the agent?

      Thanks!
      d


      • richlv
        Senior Member
        Zabbix Certified Trainer
        Zabbix Certified Specialist
        Zabbix Certified Professional
        • Oct 2005
        • 3112

        #4
        i haven't paid much attention to cumulative times agent daemon achieves - maybe the machines are so underloaded that agent stands out because of active item scheduling and such ?
        unless you see it in current load lists as the bigger load generator, i'd personally not worry much about that...
        Zabbix 3.0 Network Monitoring book


        • henry
          Junior Member
          • Jun 2010
          • 16

          #5
          I noticed the same issue here on a stock Solaris 10 zabbix_agentd (in an S10 zone).

          The box is idle at the moment and I even turned off almost all the alarming in the template, but the load still jumps from 0.02 to 0.2!

          Also, the CPU time that the agent chews up is about 120 minutes in a day or two!

          Compare that to the openfire server that's running on the same machine with a few people chatting away all day and it only accumulates 12 minutes in 5 days!

          Guess someone who understands the source code for the Solaris agent should have a look at the code and see what it's doing?

          regards, henry


          • jerrylenk
            Member
            Zabbix Certified Specialist
            • May 2010
            • 62

            #6
            Hi henry,
            turning on or off some alarms cannot have any effect on the agent side, as this is done on the server.

            Hi iskondi,
            could it be that the cumulated time in top is not hours but minutes:secs.mils ?

            Since I rebooted one of my servers ~1 hour ago, I had a look at it:
            There are 6 agentd processes running, 3 of which are among the top ten consumers of CPU accumulated time --
            but since reboot and start of the agentd, CPU idle time has been between 99% and 100% all along, so for me there is nothing to worry about.

            Regards, Jerry
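
A quick way to sanity-check the numbers in this thread, as a rough sketch only. It assumes top's TIME+ column uses its usual minutes:seconds.hundredths layout, and it takes the "roughly 20 hours" of runtime mentioned in post #1 at face value. The function names are made up for illustration.

```python
def top_time_to_seconds(value: str) -> float:
    """Convert a top TIME+ value such as '8:55.28' to seconds.

    Assumes the minutes:seconds.hundredths layout; very large values
    may be displayed differently by top and are not handled here.
    """
    minutes, seconds = value.split(":")
    return int(minutes) * 60 + float(seconds)


def average_cpu_percent(cpu_time: str, wall_hours: float) -> float:
    """Average share of one CPU used over the given runtime, in percent."""
    return 100.0 * top_time_to_seconds(cpu_time) / (wall_hours * 3600)


# zabbix_agentd on the non-monitored box: TIME+ 8:55.28 after ~20 hours
print(round(average_cpu_percent("8:55.28", 20), 2))  # prints 0.74
```

Read this way, 8:55.28 is just under 9 minutes of CPU, not 8 hours, and the average of about 0.74% of one CPU is in line with the 0.7 %CPU that top shows for the agent on that box.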


            • henry
              Junior Member
              • Jun 2010
              • 16

              #7
              Originally posted by jerrylenk
              Hi henry,
              turning on or off some alarms cannot have any effect on the agent side, as this is done on the server.

              Good point...

              well, I tried to cut down the checks to less than three, but the CPU consumption is still high:

              zabbix 14371 14370 1 Jun 24 ? 4055:37 zabbix_agentd

              This is on an S10 SPARC local zone.


              • trikke
                Senior Member
                • Aug 2007
                • 140

                #8
                Hi,

                any solution to this Problem, as I'm having the same issue?

                Greets
                Patrick


                • henry
                  Junior Member
                  • Jun 2010
                  • 16

                  #9
                  Originally posted by trikke
                  Hi,

                  any solution to this Problem, as I'm having the same issue?

                  Greets
                  Patrick
                  Unfortunately not. It might need a programmer with Solaris and DTrace experience to track down where the agent chews up all that CPU.

                  Here, it racked up more than 2000 CPU minutes in a couple of months! And that doesn't seem to change even when the Zabbix server is not polling the agent. It keeps the load at a minimum of 0.2, whereas before it was 0.0.

                  Cheers,
                  heinz
