Zabbix 1.6.1 CPU-load splash

  • bondbig
    Member
    • Jul 2008
    • 68

    #1

    Hi everyone!
    My system is:
    Code:
    logger-new:~ # mysql --version
    mysql  Ver 14.12 Distrib 5.0.26, for suse-linux (i686) using readline 5.1
    logger-new:~ # uname -a
    Linux logger-new 2.6.16.60-0.21-smp #1 SMP Tue May 6 12:41:02 UTC 2008 i686 i686 i386 GNU/Linux
    logger-new:~ # cat /proc/cpuinfo
    processor       : 0
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 15
    model name      : Intel(R) Core(TM)2 Duo CPU     E6750  @ 2.66GHz
    stepping        : 11
    cpu MHz         : 2659.921
    cache size      : 4096 KB
    fdiv_bug        : no
    hlt_bug         : no
    f00f_bug        : no
    coma_bug        : no
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 10
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss nx lm constant_tsc pni cx16 lahf_lm
    bogomips        : 5326.85
    
    processor       : 1
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 15
    model name      : Intel(R) Core(TM)2 Duo CPU     E6750  @ 2.66GHz
    stepping        : 11
    cpu MHz         : 2659.921
    cache size      : 4096 KB
    fdiv_bug        : no
    hlt_bug         : no
    f00f_bug        : no
    coma_bug        : no
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 10
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss nx lm constant_tsc pni cx16 lahf_lm
    bogomips        : 5322.47
    logger-new:~ # free
                           total       used       free     shared    buffers     cached
    Mem:       2075520    1715248     360272          0      43348    1544732
    -/+ buffers/cache:     127168    1948352
    Swap:      2104472          0    2104472
    All of the above is running on VMware Server 2.0.
    About once per hour there is a CPU-load spike:

    As a result, monitoring stops working:


    I have only 15 hosts monitored with Zabbix so far, and I think such CPU load is abnormal. What could be causing it?
    Here is the top output taken at a moment of high load:
    Code:
    top - 10:40:07 up 19:13,  1 user,  load average: 2.19, 1.47, 1.09
    Tasks: 112 total,   1 running, 111 sleeping,   0 stopped,   0 zombie
    Cpu(s): 10.3%us,  4.8%sy,  0.2%ni, 18.0%id, 66.1%wa,  0.0%hi,  0.7%si,  0.0%st
    Mem:   2075520k total,  1727408k used,   348112k free,    43408k buffers
    Swap:  2104472k total,        0k used,  2104472k free,  1554292k cached
    
      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
     3558 mysql     15   0  121m  30m 4708 S   17  1.5 195:01.00 mysqld
    12372 wwwrun    15   0 55456 8756 3748 S    3  0.4   0:01.21 httpd2-prefork
    12485 wwwrun    15   0 55444 8572 3728 S    3  0.4   0:00.21 httpd2-prefork
    12345 wwwrun    15   0 55456 8852 3832 S    3  0.4   0:01.00 httpd2-prefork
    12433 wwwrun    16   0 55464 8840 3816 S    3  0.4   0:00.61 httpd2-prefork
     3394 zabbix    21   5  2776  888  732 S    0  0.0   0:39.11 zabbix_agentd
     3395 zabbix    21   5  2776  880  724 S    0  0.0   0:39.22 zabbix_agentd
     4141 zabbix    20   5  7420 2872 1716 S    0  0.1   6:35.81 zabbix_server
     4142 zabbix    21   5  7404 2852 1704 S    0  0.1   5:38.19 zabbix_server
     4145 zabbix    20   5  7324 2864 1716 S    0  0.1   6:19.67 zabbix_server
        1 root      16   0   732  280  244 S    0  0.0   0:01.21 init
        2 root      RT   0     0    0    0 S    0  0.0   0:00.06 migration/0
        3 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/0
        4 root      RT   0     0    0    0 S    0  0.0   0:00.02 migration/1
        5 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/1
        6 root      10  -5     0    0    0 S    0  0.0   0:00.07 events/0
        7 root      10  -5     0    0    0 S    0  0.0   0:00.12 events/1
        8 root      10  -5     0    0    0 S    0  0.0   0:00.04 khelper
        9 root      13  -5     0    0    0 S    0  0.0   0:00.00 kthread
       13 root      10  -5     0    0    0 S    0  0.0   0:00.14 kblockd/0
       14 root      10  -5     0    0    0 S    0  0.0   0:00.38 kblockd/1
       15 root      13  -5     0    0    0 S    0  0.0   0:00.00 kacpid
       16 root      13  -5     0    0    0 S    0  0.0   0:00.00 kacpi_notify
      219 root      20   0     0    0    0 S    0  0.0   0:00.00 pdflush
      220 root      15   0     0    0    0 S    0  0.0   0:07.38 pdflush
      221 root      16   0     0    0    0 S    0  0.0   0:00.00 kswapd0
      222 root      10  -5     0    0    0 S    0  0.0   0:00.00 aio/0
      223 root      11  -5     0    0    0 S    0  0.0   0:00.00 aio/1
      494 root      10  -5     0    0    0 S    0  0.0   0:00.00 cqueue/0
      495 root      11  -5     0    0    0 S    0  0.0   0:00.00 cqueue/1
      496 root      11  -5     0    0    0 S    0  0.0   0:00.00 kseriod
      540 root      11  -5     0    0    0 S    0  0.0   0:00.00 kpsmoused
     1043 root      11  -5     0    0    0 S    0  0.0   0:00.00 scsi_eh_0
     1153 root      11  -5     0    0    0 S    0  0.0   0:00.00 ata/0
     1154 root      11  -5     0    0    0 S    0  0.0   0:00.00 ata/1
     1155 root      11  -5     0    0    0 S    0  0.0   0:00.00 ata_aux
     1194 root      10  -5     0    0    0 S    0  0.0   0:00.08 reiserfs/0
    But the VMware host OS does not show such load, so the real CPU is not loaded...
    Last edited by bondbig; 20-11-2008, 09:51.
  • richlv
    Senior Member
    Zabbix Certified Trainer
    Zabbix Certified Specialist, Zabbix Certified Professional
    • Oct 2005
    • 3112

    #2
    you seem to have very short refresh periods for items, which increases load. while that does produce prettier graphs, it is usually not required.

    from the top output, it can be seen that the system spends a lot of time in the iowait state (66.1%wa), with mysql being the reason. given that you say it happens about once per hour, it is most likely the housekeeper process.

    suggestions:

    1. increase refresh intervals;
    2. consider improving the disk subsystem, especially write performance.
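The iowait reading mentioned above can be checked mechanically. The following is a minimal Python sketch, not part of the original posts, that parses the `Cpu(s):` line from the top output in post #1 and reports which state dominates. The helper name `parse_cpu_line` is made up for illustration.

```python
import re


def parse_cpu_line(line: str) -> dict:
    """Parse a top 'Cpu(s):' summary line into a {state: percent} mapping."""
    # Each field looks like '66.1%wa': a number, a percent sign, a state name.
    return {name: float(value) for value, name in re.findall(r"([\d.]+)%(\w+)", line)}


# The summary line copied from the top output in post #1.
line = "Cpu(s): 10.3%us,  4.8%sy,  0.2%ni, 18.0%id, 66.1%wa,  0.0%hi,  0.7%si,  0.0%st"
cpu = parse_cpu_line(line)
busiest = max(cpu, key=cpu.get)
print(busiest, cpu[busiest])
```

Here `wa` (time spent waiting for I/O) is far larger than `us` plus `sy`, which is why the load looks like a disk bottleneck rather than a lack of CPU.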
    Zabbix 3.0 Network Monitoring book

    • bondbig
      Member
      • Jul 2008
      • 68

      #3
      Thanks for the reply.
      Refresh periods are at their defaults, and for some items even longer. Unneeded items are switched off. Disk writes are 2-3 MB/s on average and 7-9 MB/s during the spikes (all measured on the VM host, not the VM guest). The thing is that the HDD can perform much faster without any problems (I tried several benchmarking programs), and the VM host's CPU is not heavily loaded.
      What does "housekeeper process" mean?

      • richlv
        Senior Member
        Zabbix Certified Trainer
        Zabbix Certified Specialist, Zabbix Certified Professional
        • Oct 2005
        • 3112

        #4
        then vm io performance sucks.
        housekeeper is a process that runs periodically and removes data that is past the configured retention period, etc.
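To make the housekeeper idea concrete, here is a minimal Python sketch of a retention cleanup. It is an illustration only, not Zabbix's actual implementation: it uses an in-memory SQLite database instead of MySQL, a simplified `history` table, and a hypothetical `housekeep` function and 7-day retention period.

```python
import sqlite3


def housekeep(conn, now, retention_seconds):
    """Delete history rows older than the retention period; return rows removed."""
    cutoff = now - retention_seconds
    cur = conn.execute("DELETE FROM history WHERE clock < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


# Simplified, assumed schema: one row per collected value, 'clock' is a Unix timestamp.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (itemid INTEGER, clock INTEGER, value REAL)")

now = 1_000_000
# Four sample rows aged 10 seconds, 1 hour, 8 days and 30 days.
rows = [(1, now - age, 0.5) for age in (10, 3600, 86400 * 8, 86400 * 30)]
conn.executemany("INSERT INTO history VALUES (?, ?, ?)", rows)

removed = housekeep(conn, now, retention_seconds=86400 * 7)
remaining = conn.execute("SELECT COUNT(*) FROM history").fetchone()[0]
print(removed, remaining)
```

On a large history table a bulk delete like this touches many rows at once, which would fit a burst of disk writes and iowait appearing at a regular interval.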

        • bondbig
          Member
          • Jul 2008
          • 68

          #5
          Yes, it sucks indeed, but I'm still surprised by the situation. It seems that MySQL is not suitable for a VMware environment because of its periodic bursts of disk access. Maybe I should use another DB engine?

          • richlv
            Senior Member
            Zabbix Certified Trainer
            Zabbix Certified Specialist, Zabbix Certified Professional
            • Oct 2005
            • 3112

            #6
            i doubt that would help. maybe ask vmware about possible ways to improve io performance (using raw partitions instead of images or whatever).

            also, you could check whether db caching in 1.6 helps at all (but keep in mind that you won't be able to scale at all...)

            • bondbig
              Member
              • Jul 2008
              • 68

              #7
              Raw partitions are not available when using VMware Server.
              How can I check db caching?

              • bondbig
                Member
                • Jul 2008
                • 68

                #8
                I've moved Zabbix to server hardware (2x quad-core Xeon 2.66 GHz, 4 GB RAM, 2x146 GB SAS in RAID 1) and still get the same problem:

                Gap in the graph of a monitored server


                CPU load on the Zabbix server (just one CPU core is used and the other 7 are idle, I suppose because mysql is not multi-threaded).

                • richlv
                  Senior Member
                  Zabbix Certified Trainer
                  Zabbix Certified Specialist, Zabbix Certified Professional
                  • Oct 2005
                  • 3112

                  #9
                  it looks like you are retrieving items with ridiculously small refresh periods - try to increase the intervals. maybe housekeeper started running at that time? then the db probably got overloaded...

                  • bondbig
                    Member
                    • Jul 2008
                    • 68

                    #10
                    Thanks for the reply.
                    Maybe you're right, but most of the refresh intervals are defaults (30 seconds for some, 60 seconds for others, etc.). There are several items with a 5-second interval, but only a few. The server hardware is quite powerful, I think. I have just a few monitored hosts (14 to be exact), so I suppose such loads are abnormal even with short refresh intervals.

                    • richlv
                      Senior Member
                      Zabbix Certified Trainer
                      Zabbix Certified Specialist, Zabbix Certified Professional
                      • Oct 2005
                      • 3112

                      #11
                      1. what's your new values per second count?
                      2. what's your mysql qps (actual, not averaged by mysql)?
                      3. maybe you have also been hit by the problem discussed at http://www.zabbix.com/forum/showthread.php?t=11456 ?

                      • bondbig
                        Member
                        • Jul 2008
                        • 68

                        #12
                        1) Required server performance, new values per second: 47.3328
                        What does it mean?
                        2) # mysqladmin -uroot status|cut -f9 -d":"
                        326.101

                        Quite a lot...
                        3) It seems not. I keep the MySQL databases on a local drive.
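On the "What does it mean?" question in point 1: the figure is an estimate of how many item values the server has to process each second. A rough Python sketch follows, assuming the estimate is simply the sum of 1/interval over all enabled items. The function name and the item mix below are hypothetical, not the poster's real configuration.

```python
def new_values_per_second(items):
    """Estimate values per second from (item_count, refresh_interval_seconds) pairs."""
    # Each item contributes one new value every 'interval' seconds.
    return sum(count / interval for count, interval in items)


# Hypothetical mix: 600 items at 30 s, 900 items at 60 s, 50 items at 5 s.
items = [(600, 30), (900, 60), (50, 5)]
nvps = new_values_per_second(items)
print(nvps)
```

Under this assumption, 50 items at a 5-second interval contribute as much as 600 items at a 60-second interval, so a handful of very short intervals can account for a noticeable share of the total.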

                        • richlv
                          Senior Member
                          Zabbix Certified Trainer
                          Zabbix Certified Specialist, Zabbix Certified Professional
                          • Oct 2005
                          • 3112

                          #13
                          1. while i don't have useful data to compare with, it seems pretty high.
                          2. well, that's mysql's own calculated value (an average)... pretty hard to use.
                          3. well, that can still mean temporary tables being created on disk
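On point 2: the "Queries per second avg" value printed by `mysqladmin status` is averaged over the whole server uptime, so it hides short bursts. A minimal Python sketch of the alternative is to take two readings of the cumulative `Questions` counter a few seconds apart and divide the difference by the elapsed time. The function name and the sample numbers below are made up for illustration.

```python
def actual_qps(questions_1, time_1, questions_2, time_2):
    """Queries per second between two samples of the cumulative Questions counter."""
    elapsed = time_2 - time_1
    if elapsed <= 0:
        raise ValueError("second sample must be taken after the first")
    return (questions_2 - questions_1) / elapsed


# Hypothetical samples taken 10 seconds apart, not measured on the poster's server.
qps = actual_qps(1_000_000, 0.0, 1_004_500, 10.0)
print(qps)
```

Sampling like this during a spike and again during a quiet period would show whether the query rate itself jumps or whether the same rate simply becomes slower to serve.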

                          have you tried disabling most hosts/items (especially the zabbix server itself) and testing whether the problem disappears?

                          • simix
                            Member
                            • Jul 2006
                            • 53

                            #14
                            To me those numbers don't look too bad. Maybe MySQL needs better tuning in that case. But I have seen exactly that kind of problem when using MyISAM tables with Zabbix. I know Zabbix uses InnoDB by default, but one can still convert to MyISAM. While I don't assume the server in question runs MyISAM, it just looks like it does to me.

                            • bondbig
                              Member
                              • Jul 2008
                              • 68

                              #15
                              No, I use InnoDB tables...
