100% CPU utilization by zabbix_agentd

  • PeterN
    Junior Member
    • Feb 2007
    • 24

    #1

    100% CPU utilization by zabbix_agentd

    Hi,
    Has anybody seen this before?
    Linux kernel 2.6.17 SMP (FC5).
    Agent version 1.1.6 (same with 1.1.5).
    I have 4 machines with identical hardware specs, each with 2 dual-core CPUs.
    zabbix_agentd behaves like this on only one of them.
    Any ideas?

    And a general question:
    How large can the impact of running the Zabbix agent be at most?
    I have a couple of Postgres DB systems that are already heavily overloaded at times, and I am afraid to install the Zabbix agent on them.
    Last edited by PeterN; 16-02-2007, 12:40.
  • Alexei
    Founder, CEO
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Sep 2004
    • 5654

    #2
    I have never seen or even heard of this before! The ZABBIX agent uses native OS system calls, so it needs an absolute minimum of CPU and memory resources. Normally, depending on the number and frequency of checks, it uses much less than 1% of CPU.
    Alexei Vladishev
    Creator of Zabbix, Product manager
    New York | Tokyo | Riga

    • PeterN
      Junior Member
      • Feb 2007
      • 24

      #3
      Well, I just took a closer look at that.
      Is it a bug in top, or am I misunderstanding something?
      And one more correction: the kernel is 2.6.16.
      But my senior admin already killed one of my agents when he spotted 25% CPU
      utilization by the agent itself and a load average above 6 (it is typically never higher than 1.5).
      And that was on 2.6.17.
      I fell in love with Zabbix, and nobody likes watching their lover being killed ;-)

      Should I attach strace to the process?
      Any other tests?

      top - 11:19:07 up 138 days, 15:48, 4 users, load average: 1.03, 1.21, 1.24
      Tasks: 146 total, 2 running, 144 sleeping, 0 stopped, 0 zombie
      Cpu0 : 1.7% us, 1.7% sy, 0.0% ni, 96.7% id, 0.0% wa, 0.0% hi, 0.0% si, 0.0% st
      Cpu1 : 0.0% us, 48.8% sy, 41.2% ni, 0.0% id, 0.0% wa, 0.0% hi, 10.0% si, 0.0% st
      Cpu2 : 1.3% us, 0.7% sy, 0.0% ni, 97.7% id, 0.0% wa, 0.3% hi, 0.0% si, 0.0% st
      Cpu3 : 1.0% us, 0.3% sy, 0.0% ni, 94.7% id, 4.0% wa, 0.0% hi, 0.0% si, 0.0% st
      Mem: 2070536k total, 1948276k used, 122260k free, 243896k buffers
      Swap: 8418052k total, 68k used, 8417984k free, 733964k cached

      PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
      30063 zabbix 30 5 2464 512 364 R 100 0.0 6826:02 zabbix_agentd
      21918 nagios 16 0 28828 3684 1144 S 1 0.2 1:14.85 nagios
      10376 cactiuse 16 0 21808 11m 4828 S 1 0.6 0:01.19 poller.php
      12697 root 16 0 6344 2084 1704 S 1 0.1 0:00.02 sshd
      12731 root 16 0 6320 1912 1568 S 1 0.1 0:00.02 sshd
      Last edited by PeterN; 16-02-2007, 13:04.


      • Alexei
        Founder, CEO
        Zabbix Certified Trainer
        Zabbix Certified Specialist
        Zabbix Certified Professional
        • Sep 2004
        • 5654

        #4
        Yes, it would be very helpful if you could post an strace of the ZABBIX agent eating 100% CPU. I'm looking forward to seeing what it is doing.

        • PeterN
          Junior Member
          • Feb 2007
          • 24

          #5
          I have a funny feeling that it is somehow related to nagios having been running for a long
          time.
          Since I restarted nagios to reread config changes, I cannot reproduce the problem.
          In my previous top output, nagios was the second most CPU-time-consuming process.
          I'll let you know as soon as I spot it again and can capture some strace output.

          By the way, can I use the 1.3.2 agent with a 1.1.6 server?
          Will 1.4 be released this month?

          Regards
          Peter


          • neth
            Junior Member
            • Jun 2007
            • 5

            #6
            I have had the same problem since I tried to restart zabbix_agentd (1.4) today after small changes to the config file. Now the agent eats up all the CPU resources, while neither sending data to the server (which runs on the same machine) nor writing anything to the log file.

            strace -p 29514 -s 100:

            Process 29514 attached - interrupt to quit
            write(2, "zabbix_agentd [29514]: ", 23) = -1 EPIPE (Broken pipe)
            write(2, "Warning: Got SIGPIPE. Where it came from???", 43) = -1 EPIPE (Broken pipe)
            write(2, "\n", 1) = -1 EPIPE (Broken pipe)
            rt_sigreturn(0x51ab50) = -1 EPIPE (Broken pipe)
            --- SIGPIPE (Broken pipe) @ 0 (0) ---



            Any ideas?
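
The strace output above shows every write to file descriptor 2 (stderr) failing with EPIPE. As a minimal sketch of that error condition, and not the agent's actual code, the Python snippet below writes to a pipe whose read end has already been closed. The function name `write_to_broken_pipe` is made up for illustration, and the sketch assumes a Unix system and CPython's default behaviour of ignoring SIGPIPE, so the failure is reported as an error code instead of terminating the process.

```python
import errno
import os


def write_to_broken_pipe(message: bytes) -> int:
    """Write to a pipe with no reader; return the errno of the failure (0 on success)."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # no reader is left, so the pipe is now "broken"
    try:
        # CPython ignores SIGPIPE by default, so this raises BrokenPipeError
        # (an OSError with errno EPIPE) instead of killing the process.
        os.write(write_fd, message)
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        os.close(write_fd)


if __name__ == "__main__":
    code = write_to_broken_pipe(b"zabbix_agentd: test message\n")
    print(errno.errorcode.get(code, code))
```

If a SIGPIPE handler logs its warning to the same broken descriptor, as the "Got SIGPIPE" message in the trace suggests, each warning would fail in the same way, which would be consistent with a loop that never blocks.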


            • Alexei
              Founder, CEO
              Zabbix Certified Trainer
              Zabbix Certified Specialist
              Zabbix Certified Professional
              • Sep 2004
              • 5654

              #7
              We found a theoretical, and thus possible in real life, case in which the ZABBIX agent can eat 100% CPU, AFTER STARTUP ONLY. It is related to the handling of IPC resources and is fixed in pre 1.4.2. I would be very interested in your experience after 1.4.2 is released.

              • neth
                Junior Member
                • Jun 2007
                • 5

                #8
                Problem solved.

                Somehow the zabbix_agentd log file became write-protected, which resulted in 100% CPU use by zabbix_agentd. It seems it kept trying to write to this file all the time, without success...

                Maybe you can change this behaviour in the next versions.

                cheers
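
The failure mode described above, a writer that retries a permanently failing log write without pausing, can be avoided by waiting between attempts and eventually giving up. The Python sketch below illustrates that general idea only; it is not how Zabbix fixed the problem, and `write_with_backoff` and its parameters are hypothetical names chosen for this example.

```python
import time
from typing import Callable


def write_with_backoff(
    write: Callable[[str], object],
    line: str,
    max_attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], object] = time.sleep,
) -> bool:
    """Try to write a log line; after each OSError wait longer, then give up."""
    for attempt in range(max_attempts):
        try:
            write(line)
            return True
        except OSError:
            # Sleeping between attempts keeps a permanent failure
            # (for example a write-protected log file) from spinning the CPU.
            sleep(base_delay * (2 ** attempt))
    return False


if __name__ == "__main__":
    def read_only_log(_line: str) -> None:
        raise PermissionError("log file is write-protected")

    # Gives up after three attempts, sleeping 0.01 s, 0.02 s and 0.04 s in between.
    print(write_with_backoff(read_only_log, "agent started", 3, 0.01))
```

The `sleep` parameter is injectable only so the behaviour can be checked without real delays; a caller would normally leave it at its default.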


                • Alexei
                  Founder, CEO
                  Zabbix Certified Trainer
                  Zabbix Certified Specialist
                  Zabbix Certified Professional
                  • Sep 2004
                  • 5654

                  #9
                  Originally posted by neth
                  Problem solved.

                  Somehow the zabbix_agentd log file became write-protected, which resulted in 100% CPU use by zabbix_agentd. It seems it kept trying to write to this file all the time, without success...

                  Maybe you can change this behaviour in the next versions.

                  cheers

                  Thanks for reporting this problem. It is fixed in pre 1.4.2, available from http://www.zabbix.com/developers.php.