History Write Cache problem with high CPU load

  • treydock
    Junior Member
    • Apr 2011
    • 15

    #1

    History Write Cache problem with high CPU load

    Recently I began getting notifications from zabbix such as the following...

    Host: Zabbix server
    Less than 20% free in the history cache: PROBLEM
    Last value: 10.969925

    Problem started: 2011.08.09 10:01:23 Age: 10m
    This started a few days ago, and I can't think of any changes to my server or its configuration that could affect this. At the same time my CPU usage spiked and has stayed very high.

    The server runs CentOS 5.6 and Zabbix 1.8.6 (the problem also occurred on 1.8.5) with a MySQL backend. The system is a VM with 2 CPUs and 2 GB RAM. Until now it has performed well, but ever since this history write cache problem began the processor load has been spiking.

    I've already tried doubling the default size on both "HistoryCacheSize" to 16M and "HistoryTextCacheSize" to 32M with the problem still persisting. I get notifications from zabbix many times an hour now about this problem.

    Here are a few graphs to illustrate the problem, plus the config for this trigger. It's hard to see on the uploaded images, but July 31 is when both CPU and History Write Cache began to go crazy, and the 15 minute load average has stayed around 4 (w/ 2 cores).

    The first image is the graph of the History Write Cache % Free

    The second is the config for the trigger

    The third is the CPU spike I have seen...

    Fourth is the overall workload on the zabbix server.
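
    For reference, a load average only means something relative to the core count: 4 on 2 cores works out to about 2 runnable processes per core. A minimal Python sketch of that arithmetic follows. The function names and the 1.0-per-core threshold are illustrative assumptions, not anything Zabbix defines.

```python
import os


def load_per_core(load_avg: float, cores: int) -> float:
    """Return the load average normalised by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_avg / cores


def is_overloaded(load_avg: float, cores: int, threshold: float = 1.0) -> bool:
    """True when per-core load exceeds the (assumed) threshold."""
    return load_per_core(load_avg, cores) > threshold


if __name__ == "__main__":
    # os.getloadavg() is only available on Unix-like systems.
    _, _, fifteen = os.getloadavg()
    cores = os.cpu_count() or 1
    print(f"15 min load {fifteen:.2f} on {cores} cores "
          f"-> {load_per_core(fifteen, cores):.2f} per core")
```

    With the numbers from this post, load_per_core(4.0, 2) gives 2.0, so the box has roughly twice as much runnable work as it has cores.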

    Here is my entire zabbix_server.conf (with sensitive info removed):

    LogFile=/var/log/zabbix/zabbix_server.log
    PidFile=/var/run/zabbix/zabbix_server.pid
    DBHost=localhost
    DBName=....
    DBUser=....
    DBPassword=....
    DBSocket=/var/lib/mysql/mysql.sock
    StartPollers=10
    StartIPMIPollers=3
    StartPingers=5
    HousekeepingFrequency=3
    HistoryCacheSize=16M
    TrendCacheSize=8M
    HistoryTextCacheSize=32M
    AlertScriptsPath=/home/zabbix/bin/
    FpingLocation=/usr/sbin/fping
    Thanks
    - Trey
  • richlv
    Senior Member
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Oct 2005
    • 3112

    #2
    apparently something changed in your setup (zabbix config, db etc). server is now unable to keep up. history cache filling up is just a result of performance problems.

    maybe you added a large number of new hosts or items at that point?

    • treydock
      Junior Member
      • Apr 2011
      • 15

      #3
      So did some digging...found lots of these items in the zabbix server logs.

      [Z3005] query failed: [1205] Lock wait timeout exceeded; try restarting transaction [insert into history
      Is this because the history operations are becoming too large for MySQL to handle? I've not done much performance tuning with MySQL. Some quick googling suggested increasing "innodb_lock_wait_timeout", but I do not know what implications that would have. Could this be resolved by running the housekeeper more often? Right now it runs every 3 hours.
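
      To get a feel for how often this happens, the matching lines in the server log can be counted. Below is a rough Python sketch that looks only for the two bracketed codes visible in the line quoted above. The helper names are made up for illustration, and the matching is a plain substring check rather than a full parse of the log format.

```python
from typing import Iterable

# Substrings taken from the log line quoted above.
LOCK_TIMEOUT_MARKERS = ("[Z3005]", "[1205]")


def is_lock_timeout(line: str) -> bool:
    """True if the line carries both the Zabbix and the MySQL error code."""
    return all(marker in line for marker in LOCK_TIMEOUT_MARKERS)


def count_lock_timeouts(lines: Iterable[str]) -> int:
    """Count lock wait timeout errors in an iterable of log lines."""
    return sum(1 for line in lines if is_lock_timeout(line))


def count_in_file(path: str) -> int:
    """Count lock wait timeout errors in the log file at `path`."""
    # errors="replace" keeps the scan going past undecodable bytes.
    with open(path, errors="replace") as handle:
        return count_lock_timeouts(handle)
```

      Calling count_in_file("/var/log/zabbix/zabbix_server.log") (the LogFile path from my config) before and after a change would show whether the errors are tapering off.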

      Thanks
      - Trey


      • richlv
        Senior Member
        Zabbix Certified Trainer
        Zabbix Certified Specialist
        Zabbix Certified Professional
        • Oct 2005
        • 3112

        #4
        i'd suggest setting the housekeeper frequency back to 1 hour and searching the forum for mysql tuning hints. if you haven't tuned it at all, that will have a huge effect (increasing the innodb buffer pool alone will help a lot)

        • treydock
          Junior Member
          • Apr 2011
          • 15

          #5
          Increasing a few values in my MySQL config did the trick. CPU usage is back to normal levels and no more errors in the logs.

          I used these settings, in case anyone runs into this problem...

          query_cache_size = 256M
          query_cache_limit = 1M
          max_connections = 100
          table_cache = 256
          innodb_buffer_pool_size = 500M
          innodb_file_per_table
          thread_cache_size = 4
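
          As a sanity check that the two big global buffers above fit in this VM's 2 GB of RAM, here is a small Python sketch that adds them up. It deliberately ignores per-connection buffers and everything else running on the box, so treat the result as a lower bound. The helper names are hypothetical.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert a MySQL-style size such as '256M' into bytes."""
    value = value.strip().upper()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def global_buffer_bytes(settings: dict) -> int:
    """Sum the two large global buffers (assumption: others are negligible)."""
    names = ("query_cache_size", "innodb_buffer_pool_size")
    return sum(parse_size(settings[name]) for name in names)


# Values copied from the settings listed above.
settings = {"query_cache_size": "256M", "innodb_buffer_pool_size": "500M"}
total = global_buffer_bytes(settings)
ram = parse_size("2G")
print(f"{total // 1024 ** 2} MiB of {ram // 1024 ** 2} MiB "
      f"({100 * total / ram:.1f}%)")
```

          That comes to 756 MiB, a bit over a third of the 2 GB, which leaves room for the Zabbix server processes on the same machine.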
          These two sources proved to be very useful,





          Thanks
          - Trey
