Ad Widget

Collapse

Zabbix & MySQL poor performance

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • ArtemK
    Senior Member
    • May 2013
    • 232

    #1

    Zabbix & MySQL poor performance

    Hello,

    need an advice, can't figure out what happened to my zabbix server.

    HW specs (all is installed on one server):
    Code:
    1 CPU Xeon E5504 2Ghz, 4 core
    32 GB RAM
    RAID 10 4xHDD, 130Gb 10k RPM SAS disks
    Zabbix v2.2.1, NVPS ~70 (but I have a lot of trapper checks)

    Some day zabbix caches were filled up (history, text write), so I raised cache sizes, and everything worked ok about 1 week. Last Friday it happened again.

    Current settings for cache sizes:
    Code:
    CacheSize=32M
    HistoryCacheSize=512M
    TrendCacheSize=128M
    HistoryTextCacheSize=128M
    ValueCacheSize=128M
    I tried to tune mysql, but it doesn't help, current setting for mysql:
    Code:
    [mysqld]
    interactive_timeout             = 86400
    wait_timeout                    = 86400
    max_allowed_packet              = 32M
    query_cache_type                = 0
    query_cache_size                = 0
    query_cache_limit               = 4M
    thread_cache_size               = 96
    innodb_flush_log_at_trx_commit  = 2
    join_buffer_size                = 256k
    read_buffer_size                = 256k
    max_connections                 = 1024
    innodb_thread_concurrency       = 0
    read_rnd_buffer_size            = 512k
    sort_buffer_size                = 512K
    innodb_lock_wait_timeout        = 250
    thread_concurrency              = 4
    performance_schema              = on
    datadir                         = /var/lib/mysql
    socket                          = /var/lib/mysql/mysql.sock
    user                            = mysql
    symbolic-links                  = 0
    table_open_cache                = 2048
    myisam_sort_buffer_size         = 16M
    innodb_buffer_pool_size         = 26G
    innodb_flush_method             = O_DIRECT
    innodb_additional_mem_pool_size = 16M
    innodb_data_file_path=ibdata1:10M:autoextend
    innodb_log_file_size            = 512M
    innodb_log_buffer_size          = 8M
    innodb_fast_shutdown=0
    innodb_file_per_table
    skip-networking
    tmpdir=/temp
    innodb_file_format=Barracuda
    What I'm seeing now, is that cache % free for history write and text write is almost 0%, history syncer busy is almost 100%
    If I restart zabbix server, all is ok for 10-15 minutes, and then I get the same situation. I guess MySQL can't save all the data from zabbix quickly enough, but I can't undestand why that happened, i.e. we didn't add 1000k hosts last Friday nor new templates/checks to zabbix.

    I've implemented today partitioning for history and trends tables, but that doesn't helped. I'm not a DBA and kinda stuck right know, I just doesn't know, what may be wrong here. I've disabled a lot of my trapper checks in zabbix, removed custom templates from all hosts. Still nothing...

    there is such scare process in top like:
    Code:
    history syncer #6 [synced 289 items in 36.652641 sec]
    that i very slow, as I understand.

    LA is around 1.5, no active swap usage.

    I'm also see a lot of failed queries in zabbix server log:
    Code:
    4364:20140113:115159.992 [Z3005] query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update autoreg_host set listen_ip='192.168.243.154',listen_dns='host.domain.local',listen_port=10050,host_metadata='' where autoreg_hostid=6]
    Any hints/advices how to troubleshoot mysql performance? Thanks.

    some graphs:
    cache usage:

    internal processes:

    cpu util:
  • tchjts1
    Senior Member
    • May 2008
    • 1605

    #2
    What are some of your other values for key settings in zabbix_server.conf:
    StartTrappers=
    StartPollers=
    StartDBSyncers=
    StartPollersUnreachable=
    Timeout=
    Last edited by tchjts1; 13-01-2014, 20:21.

    Comment

    • ArtemK
      Senior Member
      • May 2013
      • 232

      #3
      Code:
      [root@zabbix ~]# grep -i ^start /etc/zabbix/zabbix_server.conf
      StartPollers=30
      StartIPMIPollers=1
      StartPollersUnreachable=10
      StartTrappers=30
      StartPingers=10
      StartDiscoverers=10
      StartHTTPPollers=10
      StartTimers=1
      StartDBSyncers=4
      ###
      Timeout=10
      I'm reparitioned today history and trends tables again, as I think, I modified indexes and primary keys wrong. So, I dropped tables, recreated them, made partitions again, and now restoring backup. Hopefully, tomorrow it will work.

      I'm also noticed, that history_text partition for one day takes 12Gb, probably someone added new super duper template with text-based items

      Comment

      • tchjts1
        Senior Member
        • May 2008
        • 1605

        #4
        if you are doing something like application log monitoring or system log monitoring, be sure to use 7th key parameter of "skip" to avoid pulling back entire logs that may be years old. I just ran into a similar problem as you, without using that parameter in the key.

        It has to be in the 7th position.
        Attached Files

        Comment

        • tchjts1
          Senior Member
          • May 2008
          • 1605

          #5
          During that time that I started collecting the logs without the skip parameter, you can see our similarities...
          Attached Files

          Comment

          • ArtemK
            Senior Member
            • May 2013
            • 232

            #6
            yeah, thanks for hint, I don't have any log monitoring, but a lot of trapper checks, some of them are text based items.

            Looks like there were too much data in history_text and history_uint tables, I'm not sure, where it came from, still investigating.

            iotop shows crazy disk read rate, around 80MB/s persistent read by mysql, which leads me to mysql queries for history_text and history_uint tables from zabbix like:
            Code:
            select something from history_text where clock < smth and clock > smth
            and, because of daily 12 Gb partitions they lasts forever. So, I dropped all huge partitions (yep, no history for there days ), now LA is ~0.1 and looks like everything is ok. Hope it is stable now and will not fail again.

            there is one good thing, I've implemented partitioning

            Comment

            • ArtemK
              Senior Member
              • May 2013
              • 232

              #7
              Ok, looks like this problem is solved.

              tchjts1, thank you for your support!

              cache:

              internal processes:

              Comment

              • Armacom AG
                Junior Member
                • May 2014
                • 2

                #8
                how to get arround the problem

                For this issue, i've found a possible solution (which at least worked for me):
                There is a performance gap using mysql and big databases like zabbix may produce.
                Possibly this could be eliminated using mysql > 5.6 (you can configure MySQL so that each table, including its indexes, is stored as a separate file. In that way ibdata1 will not grow as large. This is enabled by default as of version 5.6 of MySQL).

                To get arround the problem, this is what I did:

                stop zabbix-processes
                go to mysql (root account)
                switch to the zabbix database
                do a manual housekeeping:
                delete from alerts where clock<unix_timestamp() - 31*86400; delete from events where clock<unix_timestamp() - 31*86400; delete from history where clock < unix_timestamp() - 31*86400; delete from history_uint where clock < unix_timestamp() - 31*86400; delete from history_str where clock < unix_timestamp() - 31*86400; delete from history_text where clock < unix_timestamp() - 31*86400; delete from history_log where clock < unix_timestamp() - 31*86400; delete from trends where clock < unix_timestamp() - 31*86400; delete from trends_uint where clock < unix_timestamp() - 31*86400;
                exit from mysql
                start the zabbix-processes
                restart zabbix proxy-process on your proxies

                using this method, the contents of the tables listed above within the mysql-statement older than 31 days will be deleted.
                afterwards zabbix will perform very well (like expected).

                Comment

                • ArtemK
                  Senior Member
                  • May 2013
                  • 232

                  #9
                  necroposting, man?

                  my problem was app data file, it's content was never overwritten, but app only were appending text to it. So zabbix_sender every minute was sending 15Mb file to zabbix. And there were >20 such apps. Stupid scripting mistake...that why I had all this crazy problems

                  Possibly this could be eliminated using mysql > 5.6 (you can configure MySQL so that each table, including its indexes, is stored as a separate file. In that way ibdata1 will not grow as large. This is enabled by default as of version 5.6 of MySQL).
                  you mean innodb_file_per_table?

                  Comment

                  • Armacom AG
                    Junior Member
                    • May 2014
                    • 2

                    #10
                    yes innodb_file_per_table according to https://dev.mysql.com/doc/refman/5.0...nsactions.html

                    sorry... necroposting wasn't intent.

                    Comment

                    Working...