MySQL errors and general headaches.

  • lukemacneil
    Junior Member
    • Jun 2011
    • 12

    #1

    Hello. I've been working out a large environment configuration for some time now, and I'm about at the end of my rope.

    Currently, I'm monitoring 7500 hosts, 250k items, using one zabbix server and 3 proxies.

    The 3 proxies run MySQL Community 5.5 and appear to run smoothly.
    The 1 Zabbix server runs MySQL 5.5 Percona and ... does not run so smoothly.

    It seems no matter what I tweak, I still see the same issues:

    Query failed: [1213] Deadlock found when trying to get lock; try restarting transaction [update ids set nextid=nextid+256 where nodeid=0 and table_name='events' and field_name='eventid']

    Query failed: [1205] Lock wait timeout exceeded; try restarting transaction [update ids set nextid=nextid+256 where nodeid=0 and table_name='events' and field_name='eventid']
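
    Both messages end with "try restarting transaction". As a rough illustration of what that advice means for a client, here is a minimal retry sketch. It is not how the Zabbix server itself handles these errors. Only the error numbers 1205 and 1213 come from the log lines above; `QueryError`, `run_with_retry`, and `make_flaky_transaction` are hypothetical names invented for this example, and no real database driver is involved.

```python
import random
import time

# MySQL error numbers taken from the two log lines above.
ER_LOCK_WAIT_TIMEOUT = 1205
ER_LOCK_DEADLOCK = 1213
RETRYABLE = {ER_LOCK_WAIT_TIMEOUT, ER_LOCK_DEADLOCK}


class QueryError(Exception):
    """Hypothetical stand-in for a database driver's error type."""

    def __init__(self, errno, message):
        super().__init__(f"[{errno}] {message}")
        self.errno = errno


def run_with_retry(transaction, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call transaction() and restart it when it fails with 1205 or 1213.

    Any other error, or a retryable error on the last attempt, is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except QueryError as exc:
            if exc.errno not in RETRYABLE or attempt == max_attempts:
                raise
            # Exponential backoff with jitter so that competing sessions
            # do not all retry at the same moment.
            sleep(base_delay * 2 ** (attempt - 1) * random.uniform(0.5, 1.5))


def make_flaky_transaction(failures):
    """Hypothetical transaction: fails with the given error numbers, then succeeds."""
    pending = list(failures)

    def transaction():
        if pending:
            raise QueryError(pending.pop(0), "simulated failure")
        return "committed"

    return transaction


if __name__ == "__main__":
    tx = make_flaky_transaction([ER_LOCK_DEADLOCK, ER_LOCK_WAIT_TIMEOUT])
    print(run_with_retry(tx, sleep=lambda _: None))
```

    Retrying only hides occasional contention. If many sessions keep queueing on the same `ids` row, the retries will keep colliding and the underlying contention still has to be addressed.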


    Here is some statistical data from mysqltuner:
    [--] Up for: 47m 36s (13M q [4K qps], 640 conn, TX: 16B, RX: 4B)
    [--] Reads / Writes: 85% / 15%
    [--] Total buffers: 16.2G global + 3.0M per thread (400 max threads)
    [OK] Maximum possible memory usage: 17.3G (73% of installed RAM)
    [OK] Slow queries: 0% (172/13M)
    [OK] Highest usage of available connections: 69% (278/400)
    [OK] Key buffer size / total MyISAM indexes: 384.0M/102.0K
    [OK] Key buffer hit rate: 100.0% (23M cached / 3 reads)
    [!!] Query cache efficiency: 0.7% (86K cached / 11M selects)
    [OK] Query cache prunes per day: 0
    [OK] Sorts requiring temporary tables: 0% (0 temp sorts / 2K sorts)
    [OK] Temporary tables created on disk: 0% (74 on disk / 220K total)
    [OK] Thread cache hit rate: 56% (278 created / 640 connections)
    [OK] Table cache hit rate: 96% (221 open / 228 opened)
    [OK] Open file limit used: 2% (48/2K)
    [OK] Table locks acquired immediately: 100% (55M immediate / 55M locks)
    [!!] Connections aborted: 8%
    [!!] InnoDB data size / buffer pool: 20.2G/15.6G
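
    For anyone reading along, the percentages above can be re-derived from the raw counters. The sketch below only redoes that arithmetic with the figures from the output; the variable names are made up for this example, and the 1024 MB-per-GB conversion and the truncation to whole percents are assumptions about how mysqltuner reports its numbers.

```python
# Figures copied from the mysqltuner output above.
global_buffers_gb = 16.2
per_thread_mb = 3.0
max_threads = 400

threads_created = 278
connections = 640
max_used_connections = 278
max_connections = 400

tables_open = 221
tables_opened = 228

innodb_data_gb = 20.2
buffer_pool_gb = 15.6

# Worst case: global buffers plus per-thread buffers for every allowed thread.
# Assumes 1024 MB per GB.
max_memory_gb = global_buffers_gb + per_thread_mb * max_threads / 1024

# Share of connections that were served by a cached thread.
thread_cache_hit = 1 - threads_created / connections

# Peak share of max_connections that was in use.
connection_usage = max_used_connections / max_connections

# Open tables relative to tables opened since startup.
table_cache_hit = tables_open / tables_opened

# Share of the InnoDB data set that fits in the buffer pool.
buffer_pool_coverage = buffer_pool_gb / innodb_data_gb

if __name__ == "__main__":
    print(f"max memory:           {max_memory_gb:.2f}G")
    print(f"thread cache hit:     {thread_cache_hit:.1%}")
    print(f"connection usage:     {connection_usage:.1%}")
    print(f"table cache hit:      {table_cache_hit:.1%}")
    print(f"buffer pool coverage: {buffer_pool_coverage:.1%}")
```

    The last ratio is the one behind the final [!!] line: the buffer pool holds roughly three quarters of the InnoDB data, so some reads have to go to disk.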


    The aborted connections figure worries me.

    Here is my my.cnf:

    innodb_file_per_table
    innodb_file_format=barracuda
    skip-external-locking
    key_buffer_size = 384M
    max_allowed_packet = 32M
    max_connections=400
    join_buffer_size=256k
    read_buffer_size=256k
    read_rnd_buffer_size=256k
    table_open_cache = 512
    sort_buffer_size = 2M
    myisam_sort_buffer_size = 64M
    thread_cache_size=384
    query_cache_limit=1M
    query_cache_size = 128M
    thread_concurrency = 16
    innodb_data_home_dir = /datastore/zabbix
    innodb_data_file_path = ibdata1:10M:autoextend
    innodb_log_group_home_dir = /datastore/zabbix
    innodb_buffer_pool_size = 16000M
    innodb_flush_method=O_DIRECT
    innodb_flush_log_at_trx_commit = 2

    Does anyone see anything I can do to stabilize this more?
  • lukemacneil
    Junior Member
    • Jun 2011
    • 12

    #2
    Also, after adding some nodata() functions to my triggers, you can see here that my timer processes spiked to a near-constant 100%.



    Since the CPU on all of these boxes is severely underutilized, is there a way to assign more timer processes?

    Is there anywhere, in any documentation that describes exactly what the zabbix internal processes do, and how they do it? Like.. is a DBSyncer the same thing as a History Syncer?


    • lukemacneil
      Junior Member
      • Jun 2011
      • 12

      #3
      Also..

      Id User Host/IP DB Time Cmd Query or State
      -- ---- ------- -- ---- --- ----------
      274 zabbix localhost zabbix 208 Query update triggers set value=0,lastchange=1311348666,error='' where triggerid=1061
      275 zabbix localhost zabbix 193 Query update triggers set value=0,lastchange=1311348678,error='' where triggerid=1065
      278 zabbix localhost zabbix 193 Query update triggers set value=0,lastchange=1311348675,error='' where triggerid=1173
      280 zabbix localhost zabbix 184 Query update triggers set value=0,lastchange=1311348685,error='' where triggerid=1177
      277 zabbix localhost zabbix 183 Query update triggers set value=0,lastchange=1311348689,error='' where triggerid=1069
      282 zabbix localhost zabbix 182 Query update triggers set value=0,lastchange=1311348693,error='' where triggerid=1289
      279 zabbix localhost zabbix 181 Query update triggers set value=0,lastchange=1311348696,error='' where triggerid=1181
      276 zabbix localhost zabbix 177 Query update triggers set value=0,lastchange=1311348700,error='' where triggerid=1073
      273 zabbix localhost zabbix 145 Query update ids set nextid=nextid+256 where nodeid=0 and table_name='events' and field_name='eventid
      281 zabbix localhost zabbix 139 Query update ids set nextid=nextid+256 where nodeid=0 and table_name='events' and field_name='eventid
      270 zabbix localhost zabbix 0 Query select h.value from history_log as h, items as i where i.itemid=228801 and h.itemid=i.itemid an
      299 root localhost zabbix 0 Query show full processlist
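
      To make the pattern in that listing easier to see, here is a small sketch that groups the rows by the first two words of each statement and reports how many sessions are in each group and how long they have been running. The rows are copied from the listing above, including the truncated ones; the `summarize` helper and the choice of grouping key are assumptions made for this example.

```python
from collections import defaultdict

# Rows copied from the processlist above: Id User Host DB Time Cmd Query.
PROCESSLIST = """\
274 zabbix localhost zabbix 208 Query update triggers set value=0,lastchange=1311348666,error='' where triggerid=1061
275 zabbix localhost zabbix 193 Query update triggers set value=0,lastchange=1311348678,error='' where triggerid=1065
278 zabbix localhost zabbix 193 Query update triggers set value=0,lastchange=1311348675,error='' where triggerid=1173
280 zabbix localhost zabbix 184 Query update triggers set value=0,lastchange=1311348685,error='' where triggerid=1177
277 zabbix localhost zabbix 183 Query update triggers set value=0,lastchange=1311348689,error='' where triggerid=1069
282 zabbix localhost zabbix 182 Query update triggers set value=0,lastchange=1311348693,error='' where triggerid=1289
279 zabbix localhost zabbix 181 Query update triggers set value=0,lastchange=1311348696,error='' where triggerid=1181
276 zabbix localhost zabbix 177 Query update triggers set value=0,lastchange=1311348700,error='' where triggerid=1073
273 zabbix localhost zabbix 145 Query update ids set nextid=nextid+256 where nodeid=0 and table_name='events' and field_name='eventid
281 zabbix localhost zabbix 139 Query update ids set nextid=nextid+256 where nodeid=0 and table_name='events' and field_name='eventid
270 zabbix localhost zabbix 0 Query select h.value from history_log as h, items as i where i.itemid=228801 and h.itemid=i.itemid an
299 root localhost zabbix 0 Query show full processlist
"""


def summarize(processlist_text):
    """Return {statement prefix: (session count, min seconds, max seconds)}."""
    groups = defaultdict(list)
    for line in processlist_text.strip().splitlines():
        # The first six columns are single tokens; the rest is the query text.
        _id, _user, _host, _db, seconds, _cmd, query = line.split(None, 6)
        prefix = " ".join(query.split()[:2])
        groups[prefix].append(int(seconds))
    return {
        prefix: (len(times), min(times), max(times))
        for prefix, times in groups.items()
    }


if __name__ == "__main__":
    for prefix, (count, shortest, longest) in summarize(PROCESSLIST).items():
        print(f"{prefix:<16} sessions={count} time={shortest}-{longest}s")
```

      On this snapshot the grouping shows eight `update triggers` statements that have been running for about three minutes or more, and two `update ids` statements at well over two minutes each.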


      • Jason
        Senior Member
        • Nov 2007
        • 430

        #4
        Have you tried enabling the slow query log to see what queries are causing all the problems?


        • jonh
          Junior Member
          • Aug 2010
          • 8

          #5
          Also could you describe the hardware you're running on, especially storage?


          • lukemacneil
            Junior Member
            • Jun 2011
            • 12

            #6
            I'm using a dedicated external fiber-attached RAID 10 array split into 4 LUNs, with one LUN assigned to each box. They're all 15k SAS drives, if I recall correctly.

            There are two 24-core, 24 GB, 3.0 GHz blades and two quad-core 8 GB blades.

            These issues seem to have resolved themselves thanks to my 'ignore it and it will go away' philosophy, though I have no idea why. Timer is now idling at around 8%.

            Because Percona was leaking memory and ultimately crashing every day or so, I reverted to MySQL 5.5 Community, which seems to have stabilized everything.


            • Alexei
              Founder, CEO
              Zabbix Certified Trainer
              Zabbix Certified Specialist
              Zabbix Certified Professional
              • Sep 2004
              • 5654

              #7
              I am curious what level of performance you are getting from Zabbix on this hardware, how many new values per second?
              Alexei Vladishev
              Creator of Zabbix, Product manager
              New York | Tokyo | Riga


              • lukemacneil
                Junior Member
                • Jun 2011
                • 12

                #8
                Well, today it looks like so:

                top - 11:47:01 up 40 days, 12:59, 3 users, load average: 0.80, 0.83, 0.91
                Tasks: 344 total, 2 running, 342 sleeping, 0 stopped, 0 zombie
                Cpu(s): 3.0%us, 0.2%sy, 0.4%ni, 95.4%id, 0.9%wa, 0.0%hi, 0.0%si, 0.0%st
                Mem: 24022M total, 22084M used, 1937M free, 92M buffers
                Swap: 2047M total, 10M used, 2036M free, 5238M cached

                random iostat:
                Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
                dm-0 16.26 24.68 1832.62 86470269 6419604396




                I'm using 3 proxies for data gathering; the Zabbix server itself only gathers for a handful (8) of infrastructure servers. I keep the intervals high: around 300 seconds for most items, 900 seconds for some, and even longer for others.
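
                As a back-of-the-envelope answer to the values-per-second question, the rate can be estimated as item count divided by update interval. This is only a sketch: it assumes all 250k items mentioned in the first post are enabled and polled at a single interval, which is a simplification, and `new_values_per_second` is a name made up for this example.

```python
def new_values_per_second(item_count, interval_seconds):
    """Rough estimate: every item delivers one value per update interval."""
    return item_count / interval_seconds


if __name__ == "__main__":
    items = 250_000  # item count from the first post
    for interval in (300, 900):  # the two intervals mentioned above
        rate = new_values_per_second(items, interval)
        print(f"{interval:>4} s interval: {rate:.0f} values/s")
```

                Under those assumptions the rate falls between roughly 280 and 830 new values per second. The items with longer intervals would pull the real figure down further.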

