Excessive number of items Queued

  • ptader
    Member
    • Sep 2007
    • 52

    #1

    Excessive number of items Queued

    Zabbix Server 1.8
    Clients 1.8
    mySQL db
    Number of clients: about 1,000 (all linux)

    We reinstalled all the clients over the past week. Afterwards I noticed that about 70 of the nodes were reported as "unreachable". Running "zabbix_get -s node_name -k system.uptime" against them works, and telnetting from the worker node back to the server port works. The client config has the correct hostname and server name.

    What troubles me is the very high number of items in the "10 minute queue". In addition, clients that are "reachable" are starting to show stale data (2+ hours old) on the latest data page.

    Restarting Zabbix and/or MySQL helps for a few hours (the 10 minute queue decreases), but eventually the items queue up again. I can even run zabbix_get against several of the queued items and get data!

    Debug logging on the clients shows requests for information, so I tend to believe that Zabbix is getting the data, just not entering it in the database.
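The manual spot check described above (running zabbix_get against individual nodes) can be scripted when many nodes are involved. This is a minimal sketch, assuming `zabbix_get` is on the PATH of the machine running it; the helper names `build_cmd` and `check_hosts` and the host names in the usage note are hypothetical, and only the `-s` and `-k` options already used in the post are relied on.

```python
import subprocess


def build_cmd(host, key="system.uptime"):
    """Build the same zabbix_get call used in the post for one host."""
    return ["zabbix_get", "-s", host, "-k", key]


def check_hosts(hosts, key="system.uptime", timeout=10, run=subprocess.run):
    """Return {host: value or None}; None means no usable answer came back.

    `run` is injectable so the logic can be exercised without a real agent.
    """
    results = {}
    for host in hosts:
        try:
            proc = run(build_cmd(host, key), capture_output=True,
                       text=True, timeout=timeout)
        except (OSError, subprocess.TimeoutExpired):
            # zabbix_get missing, or the agent did not answer in time
            results[host] = None
            continue
        value = proc.stdout.strip()
        results[host] = value if proc.returncode == 0 and value else None
    return results
```

Calling `check_hosts(["node001", "node002"])` returns a dictionary in which hosts mapped to `None` did not answer. A successful reply only shows that the agent is reachable from the machine running the script; as the post notes, that can be true while items are still piling up in the server queue.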
  • elvar
    Senior Member
    • Feb 2008
    • 226

    #2

    What about the debug zabbix server log? Anything there? Is there anything in mysql-slow.log assuming you are logging slow queries?


    • untergeek
      Senior Member
      Zabbix Certified Specialist
      • Jun 2009
      • 512

      #3
      75000 items?!

      You are probably not tuned to handle that. Is your MySQL database local or over a network connection? If it's over a network that could also explain the slowness. What seems to be the problem is that the database isn't writing fast enough to keep up with the data that is coming in. That's a huge amount of traffic.
      Last edited by untergeek; 08-06-2010, 19:20. Reason: Found the items per second.


      • untergeek
        Senior Member
        Zabbix Certified Specialist
        • Jun 2009
        • 512

        #4
        One other thing--you may have contention for IDs in the IDS table. This is not uncommon with large installs. Hence the need to tune accordingly.


        • ptader
          Member
          • Sep 2007
          • 52

          #5
          Thanks for the responses.

          Nothing shows up in the slow queries log. I also ran Zabbix in debug mode while the "10+ minute queue" was experiencing this problem, and I didn't see anything that looked wrong. I can make the full log available if needed (12 minutes of logging created a 24 MB compressed file).

          The database is hosted off local disk. There is significant I/O to the disk but I don't believe it's a bottleneck.

          (sda is the host device for the database)

          Code:
          # iostat -m
          
          avg-cpu:  %user   %nice %system %iowait  %steal   %idle
                     1.20    0.14    0.56   11.83    0.00   86.28
          
          Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
          sda             189.00         0.08         6.92      19305    1686469
          sda1            189.00         0.08         6.92      19305    1686469
          sdb              10.45         0.00         1.81        326     442162
          sdb1             10.45         0.00         1.81        325     442162
          sdb2              0.00         0.00         0.00          0          0


          You mentioned tuning the database. Are there any changes to my my.cnf that you would suggest? One additional note: a slave MySQL server is configured for this database.

          Code:
          [mysqld]
          datadir=/var/lib/mysql
          socket=/var/lib/mysql/mysql.sock
          log-bin = /var/log/mysql/mysql-bin.log
          server-id=1
          expire_logs_days = 5
          user=mysql
          old_passwords=1
          max_connections = 100
          innodb_force_recovery = 0
          
          innodb_data_home_dir =
          innodb_data_file_path=/zabbix/db/ibdata1:32G;/zabbix/db/ibdata2:32G;/zabbix/db/ibdata3:8M:autoextend:max:32G
          
          set-variable = innodb_buffer_pool_size=24G
          set-variable = innodb_additional_mem_pool_size=256M
          key_buffer=512M
          query_cache_type=1
          query_cache_limit=10M
          query_cache_size=64M
          innodb_flush_log_at_trx_commit=2
          innodb_thread_concurrency=8
          innodb_flush_method=O_DIRECT
          max_heap_table_size=2048M
          tmp_table_size=2048M
          max_write_lock_count=1
          
          [mysqld_safe]
          log-error=/var/log/mysqld.log
          pid-file=/var/run/mysqld/mysqld.pid
          log-slow-queries = /var/log/mysql/mysql-slow.log
          log-queries-not-using-indexes


          • cwadge
            Junior Member
            • Mar 2010
            • 3

            #6
            I've definitely had more backlog issues with 1.8.x than with any previous versions, so you're not alone in that.

            Regarding MySQL optimization, I'm no DBA, but I find the MySQL Tuning Primer Script is a pretty good jumping off point:



            Make sure you have 'bc' installed before you try and run it, or you'll get a bunch of raw calc strings instead of pretty sums.


            • MrKen
              Senior Member
              • Oct 2008
              • 652

              #7
              I'm no DBA either, but, a couple of things I notice:

              You're not using 'innodb_file_per_table', which means that your ibdata files will get bigger and bigger. Maybe you have your own reasons for doing it like this but just in case here's a link about this.

              In mysql, do 'show variables;'
              What value do you have for 'innodb_log_file_size'?

              By default mysql will create two 5MB log files named ib_logfile0 and ib_logfile1. 5MB is way too small! For the amount of RAM that you have, you could probably set this to 2GB.
              [Warning for anyone wanting to change this value: You need to shut down mysql, remove the existing ib_logfiles, add innodb_log_file_size= to your my.cnf, restart mysql, check the mysql.log to see that mysql has started (ignore error messages about 'Incorrect information in file')]

              Hope that is of some help.
              MrKen
              Disclaimer: All of the above is pure speculation.


              • ptader
                Member
                • Sep 2007
                • 52

                #8
                innodb_log_file_size is set to 5242880 and I have 2 of them. I think I'll increase this variable.

                I've run a couple of the MySQL tuner scripts (mysqltuner.pl http://blog.mysqltuner.com/ and tuning-primer.sh http://www.day32.com/MySQL/) and have made some configuration changes based on their output.

                I deleted and reinstalled the nodes that were listed as "unreachable". That fixed that issue.

                Even though this server has 1000+ nodes on it, it's a relatively simple setup - 3 custom Templates and 2 groups of nodes. I've exported this information and might rebuild the server if I can't resolve this soon.

                Thanks for the continued suggestions.


                • MrKen
                  Senior Member
                  • Oct 2008
                  • 652

                  #9
                  Originally posted by ptader
                  innodb_log_file_size is set to 5242880 . . .
                  Looks like the default 5MB.
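The arithmetic behind that reading can be checked quickly. The sketch below is only an illustration: `to_bytes` is a hypothetical helper, not part of MySQL, and it handles just the K, M and G suffixes that my.cnf size values use.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(value):
    """Convert a my.cnf-style size such as '512M' or '5242880' to bytes."""
    text = str(value).strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


reported = to_bytes("5242880")        # value reported in the post above
print(reported // 1024 ** 2)          # size of one log file in MB: 5
print(2 * reported // 1024 ** 2)      # two log files, total redo space in MB: 10
```

So the reported value is exactly 5 MB per file, or 10 MB of redo log in total across the two files.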

                  Read this, it helped me a lot to understand how this works. http://mysqldump.azundris.com/archiv....html#extended

                  MrKen
                  Disclaimer: All of the above is pure speculation.


                  • exkg
                    Senior Member
                    Zabbix Certified Trainer
                    Zabbix Certified Specialist
                    • Mar 2007
                    • 718

                    #10
                    My two cents: how many pollers and trappers do you have?

                    []s,
                    Luciano
                    --
                    Luciano Alves
                    www.zabbix.com
                    Brazil | México | Argentina | Colômbia | Chile
                    Zabbix Performance Tuning


                    • fascinatedcow
                      Junior Member
                      • Mar 2010
                      • 20

                      #11
                      Hi,

                      What is your CacheSize variable set to in zabbix_server.conf? Try increasing it.


                      Matt


                      • ptader
                        Member
                        • Sep 2007
                        • 52

                        #12
                        Resolved.

                        After reading the most recent comments (exkg and MrKen), I noticed that StartPollers= was set to 15, which is much too low. I increased it to 50 and recreated the InnoDB log files at 512M instead of the default 5M. One of these changes, or the combination of both, resolved the problem. It still took a couple of hours for the queue to settle down, but after 6 hours of running the queues are empty except for 37 items in the 10 minute+ column.
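For anyone sizing StartPollers on a similar install, a back-of-the-envelope estimate can help. This is a rough sketch, not Zabbix's own sizing method: `pollers_needed` is a hypothetical helper, it assumes every item is a passive check handled by a poller, and the 60 second interval and 20 ms per check are invented illustrative values. Only the 75000 item count comes from earlier in this thread.

```python
def pollers_needed(items, interval_s, check_ms, target_busy_pct=100):
    """Rough count of pollers needed to keep up with passive checks.

    items * check_ms / interval_s is the poller time (in ms) consumed per
    second of wall time; dividing by 1000 gives the number of pollers that
    would be fully busy. target_busy_pct below 100 adds headroom.
    """
    work = items * check_ms * 100
    capacity = interval_s * 1000 * target_busy_pct
    return -(-work // capacity)  # ceiling division on integers


# 75000 items is quoted in the thread; interval and check time are assumptions.
print(pollers_needed(75000, 60, 20))                      # 25
print(pollers_needed(75000, 60, 20, target_busy_pct=50))  # 50
```

Because the interval and per-check time are invented, the outputs say nothing about whether 50 was the right value here. The point is only that slow or timing-out checks raise `check_ms` sharply, so a poller count that looks adequate on paper can fall behind once some nodes stop answering.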

                        Thanks everybody for your continued suggestions.

                        There are probably some more tweaks that I can make, but in the hope that it helps someone else, below are my current my.cnf and my zabbix_server.conf changes from the defaults:

                        Code:
                        # grep -v ^\# /etc/zabbix/zabbix_server.conf
                        LogFile=/var/log/zabbix_server.log
                        LogFileSize=0
                        DebugLevel=3
                        PidFile=/var/tmp/zabbix_server.pid
                        StartPollers=50
                        StartPollersUnreachable=10
                        HousekeepingFrequency=1
                        DisableHousekeeping=0
                        CacheSize=128M
                        CacheUpdateFrequency=600
                        HistoryCacheSize=16M
                        TrendCacheSize=8M
                        Timeout=30
                        UnreachablePeriod=300

                        Code:
                        # cat /etc/my.cnf
                        
                        [mysqld]
                        datadir=/var/lib/mysql
                        socket=/var/lib/mysql/mysql.sock
                        log-bin = /var/log/mysql/mysql-bin.log
                        server-id=1
                        expire_logs_days = 5
                        user=mysql
                        old_passwords=1
                        max_connections = 100
                        innodb_force_recovery = 0
                        
                        innodb_data_home_dir =
                        innodb_data_file_path=/zabbix/db/ibdata1:32G;/zabbix/db/ibdata2:32G;/zabbix/db/ibdata3:8M:autoextend:max:32G
                        
                        skip-bdb
                        set-variable = innodb_log_file_size=512M
                        set-variable = innodb_log_buffer_size=32M
                        innodb_buffer_pool_size=22G
                        innodb_additional_mem_pool_size=256M
                        key_buffer=128M
                        query_cache_type=1
                        query_cache_limit=256M
                        query_cache_size=512M
                        innodb_flush_log_at_trx_commit=2
                        innodb_thread_concurrency=8
                        thread_cache_size=8
                        innodb_flush_method=O_DIRECT
                        max_heap_table_size=2048M
                        tmp_table_size=2048M
                        table_cache=256M
                        max_write_lock_count=1
                        sort_buffer_size=8M
                        
                        [mysqld_safe]
                        log-error=/var/log/mysqld.log
                        pid-file=/var/run/mysqld/mysqld.pid
                        log-slow-queries = /var/log/mysql/mysql-slow.log
                        set-variable=long_query_time=5
                        log-queries-not-using-indexes
