Zabbix Server Queue & Config

  • dvwyngaa
    Member
    • Mar 2014
    • 49

    #1

    Zabbix Server Queue & Config

    Hi,

    I have started working with Zabbix as an alternative to the commercial monitoring tools out there. I have installed the Server, Proxy, DB & Portal on separate VMs.

    The server has four proxies connected to it, but the queue seems to be growing steadily. It seems as if the server can't keep up with the items sent to it. I need some advice on the following:
    1) How does data flow from the monitored host to where it ultimately lands in the DB? Understanding how Zabbix works would help me troubleshoot it.
    2) Is my config for the server, proxy and database correct? If not, what can I change so that the queue shrinks and the server is "faster"?
    3) Some of my graphs, especially those with calculated items, have gaps in them. Why?
    4) Triggers take a long time to send alerts via SMS/email, e.g. I will stop an agent and only after 20-30 minutes will I get an alert via email/SMS; sometimes it takes hours.

    Config:

    VMWare Server:
    Dual Xeon CPU, 32GB memory, 3 x 1TB SATA drives in RAID5 config. This is the main server which has VMWare ESXi 5.1 loaded on it.

    Individual VM's:
    - MYSQL Database:
    - 16GB Memory
    - 500GB HD
    - 2 x vCPU
    - Database is not partitioned
    - O/S: CentOS 6.5 64-bit
    - MySQL 5.1 64-bit

    - Zabbix Server:
    - 4GB Memory
    - 2 x vCPU
    - 60GB HD
    - O/S: CentOS 6.5 64-bit

    - Zabbix Proxy:
    - 2GB Memory
    - 2 x vCPU
    - 40GB HD
    - O/S: CentOS 6.5 64-bit
    - MySQL 5.1 64-bit

    - Zabbix Portal:
    - 2GB Memory
    - 1 x vCPU
    - 40GB HD
    - Apache 2.2
    - php5.3 with php-xcache

    MySQL Server my.cnf:
    Code:
    [mysqld]
    bind-address=192.168.11.3
    datadir=/app/mysql
    #/var/lib/mysql
    socket=/app/mysql/mysql.sock
    #/var/lib/mysql/mysql.sock
    user=mysql
    # Disabling symbolic-links is recommended to prevent assorted security risks
    symbolic-links=0
    
    tmpdir=/tmp
    
    # Custom Settings
    log_queries_not_using_indexes=1
    
    # GENERAL #
    default-storage-engine         = InnoDB
    
    # MyISAM #
    key-buffer-size                = 20M
    myisam-recover                 = FORCE,BACKUP
    
    # SAFETY #
    max-allowed-packet             = 64M
    max-connect-errors             = 1000000
    innodb                         = FORCE
    
    # BINARY LOGGING #
    log-bin                        = /app/mysql/mysql-bin
    expire-logs-days               = 14
    sync-binlog                    = 1
    
    # CACHES AND LIMITS #
    tmp-table-size                 = 128M #32M
    max-heap-table-size            = 128M #32M
    query-cache-type               = 1
    query-cache-size               = 128M #16M
    query-cache-limit              = 128M
    max-connections                = 500
    thread-cache-size              = 300
    open-files-limit               = 65535
    table-definition-cache         = 4096
    table-open-cache               = 4096
    table-cache                    = 512 #new
    join-buffer-size               = 4M #2
    read-buffer-size               = 512k #new
    read-rnd-buffer-size           = 512k #new
    
    # INNODB #
    innodb-flush-method            = O_DIRECT
    innodb-log-files-in-group      = 2
    innodb-log-file-size           = 256M
    innodb-flush-log-at-trx-commit = 2 #1
    innodb-file-per-table          = 1
    innodb-buffer-pool-size        = 12G
    innodb-log-buffer-size         = 4M
    innodb-thread-concurrency      = 0 #16
    
    # LOGGING #
    log-error                      = /app/mysql/mysql-error.log
    log-queries-not-using-indexes  = 1
    long_query_time                = 1
    slow-query-log                 = 1
    Zabbix Server Config:
    Code:
    DBSocket=/var/lib/mysql/mysql.sock
    StartPollers=80
    StartIPMIPollers=5
    StartPollersUnreachable=5
    StartTrappers=40
    StartPingers=5
    StartDiscoverers=5
    StartHTTPPollers=10
    StartTimers=15
    StartVMwareCollectors=5
    VMwareFrequency=60
    VMwareCacheSize=8M
    HousekeepingFrequency=1
    MaxHousekeeperDelete=500
    SenderFrequency=15
    CacheSize=256M
    CacheUpdateFrequency=60
    StartDBSyncers=16
    HistoryCacheSize=128M
    TrendCacheSize=64M
    HistoryTextCacheSize=64M
    ValueCacheSize=64M
    Timeout=5
    TrapperTimeout=120
    UnreachablePeriod=45
    UnavailableDelay=60
    UnreachableDelay=15
    Zabbix Proxy Config:

    Code:
    ProxyMode=0
    DBSocket=/app/mysql/mysql.sock
    HeartbeatFrequency=60
    ConfigFrequency=300
    DataSenderFrequency=5
    StartPollers=30
    StartPollersUnreachable=5
    StartTrappers=15
    StartPingers=10
    StartDiscoverers=5
    StartHTTPPollers=5
    HousekeepingFrequency=1
    CacheSize=16M
    StartDBSyncers=8
    HistoryCacheSize=32M
    HistoryTextCacheSize=16M
    Timeout=15
    TrapperTimeout=120
    UnreachablePeriod=45
    UnavailableDelay=60
    UnreachableDelay=15
    Zabbix Proxy my.cnf:

    Code:
    [mysqld]
    # General #
    datadir                         = /app/mysql
    socket                          = /app/mysql/mysql.sock
    user                            = mysql
    symbolic-links                  = 0
    #default-storage-engine=InnoDB
    
    interactive_timeout             = 12000
    #wait_timeout=300
    
    # Custom MYSQL settings for Zabbix
    
    query_cache_size                = 32M
    query_cache_type                = 1
    query_cache_limit               = 32M
    thread_cache_size               = 128
    table_cache                     = 512
    max_connections                 = 500
    wait_timeout                    = 600
    key_buffer_size                 = 10M
    innodb_buffer_pool_size         = 16M
    slow-query-log                  = 1
    slow-query-log-file             = /app/mysql/mysql-slow.log
    join_buffer_size                = 512K
    table_cache                     = 128
    long_query_time                 = 2
    
    innodb-flush-method            = O_DIRECT
    innodb-log-files-in-group      = 2
    innodb-log-file-size           = 128M
    innodb-log-buffer-size         = 4M
    innodb-flush-log-at-trx-commit = 1
    innodb-file-per-table          = 1
    innodb-buffer-pool-size        = 512M
    innodb-thread-concurrency      = 8
    
    [mysqld_safe]
    log-error                       = /var/log/mysqld.log
    pid-file                        = /app/mysql/mysqld.pid
    Some general info:
    - The Proxy server is a remote proxy
    - The Proxy server monitors SNMP devices, agents and Web Response Time for a couple of servers
    - The Server has a total of 102 hosts, 2353 items and 740 triggers and has an NVPS of 29.48

    See attached screenshots for the cache, queue, busy process, etc.

    I would really appreciate any input from the gurus and developers of Zabbix as to where I'm going wrong here.

    We intend to add 3000+ hosts in the near future, but if the system seems a bit under pressure from 102 hosts then I need to re-evaluate. In all fairness, it might all be down to my misunderstanding of the settings in the config files.
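
For a rough sense of scale, the figures above can be turned into a back-of-the-envelope projection. The sketch below assumes that new hosts would carry the same average item count and update intervals as the current 102 hosts, which may well not hold in practice; the function name is made up for illustration.

```python
def project_nvps(current_nvps: float, current_hosts: int, target_hosts: int) -> float:
    """Scale new values per second linearly with host count (an assumption)."""
    return current_nvps / current_hosts * target_hosts


# Figures taken from the post above.
hosts, items, nvps = 102, 2353, 29.48

items_per_host = items / hosts
# If NVPS is the sum of 1/interval over all items, items/NVPS is the
# effective (harmonic mean) update interval in seconds.
mean_interval_s = items / nvps
projected = project_nvps(nvps, hosts, 3000)

print(f"items per host:        {items_per_host:.1f}")
print(f"mean update interval:  {mean_interval_s:.1f} s")
print(f"projected NVPS @ 3000: {projected:.0f}")
```

Under that linear assumption, 3000 hosts would need a bit under 900 new values per second, so the current load of about 30 is small by comparison.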

    Your help will be appreciated.

    Regards,

    Dawid
    Attached Files
    Last edited by dvwyngaa; 28-08-2014, 14:18.
  • WilliamSG
    Member
    • Jun 2014
    • 41

    #2
    What version are you running?

    Your environment and your setup are fine...

    In the Queue viewer, switch to Details and check whether the items are delayed by 44 years with a scheduled check of 01-01-1970. If so, you have the same error I have.
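
The "44 years" and "01-01-1970" figures fit together if the scheduled check time is stored as a Unix timestamp that has ended up as 0. The sketch below only illustrates that arithmetic; it assumes the delay is simply "now minus scheduled time" and says nothing about how Zabbix computes the queue internally. The function name is hypothetical.

```python
from datetime import datetime, timezone


def queue_delay_years(scheduled_ts: int, now: datetime) -> float:
    """Years between a scheduled check (Unix seconds) and `now`."""
    scheduled = datetime.fromtimestamp(scheduled_ts, tz=timezone.utc)
    return (now - scheduled).days / 365.25


# Roughly when this thread was written.
now = datetime(2014, 8, 28, tzinfo=timezone.utc)

# A timestamp of 0 is the Unix epoch, i.e. 1 January 1970.
print(datetime.fromtimestamp(0, tz=timezone.utc).date())
print(f"{queue_delay_years(0, now):.1f} years")
```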

    It's a known issue, and was solved in the new released version, 2.2.6, released yesterday. Check this link for details.

    Unfortunately, I'm having problems updating to the new version....

    Comment

    • dvwyngaa
      Member
      • Mar 2014
      • 49

      #3
      William,

      My apologies, the Zabbix version is 2.2.5 and CentOS has been updated to latest patches via Yum.

      The queue of items....I only have a couple of items with a 1970 year attached to it...

      This is not the case with the rest of the items, so I'm assuming I still have a problem with a config file causing the queue build up and blanks in graphs and delayed alerts?

      Regards,

      Dawid

      Comment

      • ingus.vilnis
        Senior Member
        Zabbix Certified Trainer
        Zabbix Certified Specialist
        Zabbix Certified Professional
        • Mar 2014
        • 908

        #4
        Originally posted by dvwyngaa
        William,

        My apologies, the Zabbix version is 2.2.5 and CentOS has been updated to latest patches via Yum.

        The queue of items....I only have a couple of items with a 1970 year attached to it...

        This is not the case with the rest of the items, so I'm assuming I still have a problem with a config file causing the queue build up and blanks in graphs and delayed alerts?

        Regards,

        Dawid
        Hi Dawid,

        I spotted two things so far:
        1. as William said, do upgrade to 2.2.6, which should fix the incorrect queue calculation
        2. in the MySQL config my.cnf you have an extremely low wait_timeout = 600, which is far lower than the default of 28800 seconds (8 hours). Please check your zabbix_server.log for any "MySQL server has gone away" type errors.


        In any case, check the logs. Plenty of useful information about performance issues can be found there.
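
A minimal sketch of that log check, assuming the log is plain text with one message per line. The sample lines are placeholders and not real Zabbix output, and the path in the closing comment is a guess at a common location rather than something stated in this thread.

```python
from typing import Iterable, List


def find_gone_away(lines: Iterable[str]) -> List[str]:
    """Return the lines that mention the 'MySQL server has gone away' error."""
    needle = "mysql server has gone away"
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


# Placeholder lines for illustration only.
sample = [
    "placeholder entry 1: everything fine\n",
    "placeholder entry 2: query failed: MySQL server has gone away\n",
    "placeholder entry 3: everything fine\n",
]

hits = find_gone_away(sample)
print(f"{len(hits)} matching line(s)")
for hit in hits:
    print(hit)

# Against a real file (path is an assumption, adjust to your install):
# with open("/var/log/zabbix/zabbix_server.log", errors="replace") as fh:
#     print(len(find_gone_away(fh)))
```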

        Best Regards,
        Ingus

        Comment

        • dvwyngaa
          Member
          • Mar 2014
          • 49

          #5
          Ingus,

          Thanks...although I was not getting any "connection to database lost" errors, I have increased the value to the mentioned default. I will keep an eye on it.

          Once the issue with the repo has been resolved, I'll update to 2.2.6 and report back.

          Dawid

          Comment

          • dvwyngaa
            Member
            • Mar 2014
            • 49

            #6
            Ingus,

            I have changed the "wait_timeout" in MySQL on the proxy about 10 hours ago, but that doesn't seem to have any impact.

            I have also managed to update to 2.2.6 this morning and some of the backlog items have been cleared (on our one test system). I will give it a couple of hours and see if all items have been cleared.

            Regards,

            Dawid

            Comment

            • dvwyngaa
              Member
              • Mar 2014
              • 49

              #7
              Ingus,

              OK, so I have let the new config run for a couple of hours and I still have a bit of queue build up and my calculated graph is still full of gaps. See attached queue build up and "gapped" graph for reference:




              Any ideas why the queue build up and graph is the way it is?

              Regards,

              Dawid

              Comment

              • ingus.vilnis
                Senior Member
                Zabbix Certified Trainer
                Zabbix Certified Specialist
                Zabbix Certified Professional
                • Mar 2014
                • 908

                #8
                Hi Dawid,

                Hard for me to tell what is wrong on your system.

                Zabbix_server logs could tell you more.
                Additionally please check the performance graphs and configuration files for your proxies as well.

                And one more thing but I don't think it could affect your system that much - you have pretty outdated MySQL version 5.1 when there is much newer 5.5 available.

                Best Regards,
                Ingus

                Comment

                • dvwyngaa
                  Member
                  • Mar 2014
                  • 49

                  #9
                  Hi Ingus,

                  Thanks for the reply. I have had a look at the server log and there are no errors, only the normal messages about config data being sent to the proxy servers.

                  The proxy server performance screens are as follows:







                  Regards,

                  Dawid

                  Comment

                  • ingus.vilnis
                    Senior Member
                    Zabbix Certified Trainer
                    Zabbix Certified Specialist
                    Zabbix Certified Professional
                    • Mar 2014
                    • 908

                    #10
                    Hmmm, strange.
                    Proxy graphs are normal as well. However, it would be very useful to compare them for exactly the same time period (from 12:00). You attached proxy graphs from 13:07. Maybe the hour before it was all bad there as well?

                    Another thing. Check zabbix_server.conf and enable LogSlowQueries= option. Set it to something like 3000 milliseconds. Maybe your system is doing some ultrahard queries to database.

                    Check MySQL logs for errors as well.

                    Best Regards,
                    Ingus

                    Comment

                    • dvwyngaa
                      Member
                      • Mar 2014
                      • 49

                      #11
                      Ingus,

                      Yip, my SlowQueries parameter has been set & activated, but nothing coming out of that.

                      I have upgraded the MySQL DB from 5.1 to 5.5 for the proxy. I will monitor it and let you know.

                      Regards,

                      Dawid

                      Comment

                      • ingus.vilnis
                        Senior Member
                        Zabbix Certified Trainer
                        Zabbix Certified Specialist
                        Zabbix Certified Professional
                        • Mar 2014
                        • 908

                        #12
                        Ok, let's see how it goes, but don't forget to update MySQL on the server as well (which is the most important part actually).

                        Best Regards,
                        Ingus

                        Comment

                        • dvwyngaa
                          Member
                          • Mar 2014
                          • 49

                          #13
                          Ingus,

                          Thanks. I have upgraded MySQL from 5.1 to 5.5 on my proxy and database. No real effect after 24 hours.

                          On my proxy I see no slow queries, but on my database server, although there is very little CPU load, I get lots of slow queries. My slow query settings in my.cnf are as follows:

                          Code:
                          # LOGGING #
                          log-error                              = /app/mysql/mysql-error.log
                          log-queries-not-using-indexes  = 1
                          long_query_time                    = 1
                          slow-query-log                      = 1
                          slow-query-log-file                 = /app/mysql/mysql-slow.log
                          and the slow log file output:

                          Code:
                          SET timestamp=1409376056;
                          select h.hostid,h.host,h.name,t.httptestid,t.name,t.variables,t.agent,t.authentication,t.http_user,t.http_password,t.http_proxy,t.retries from httptest t,hosts h where t.hostid=h.hostid and t.nextcheck<=1409376061 and mod(t.httptestid,10)=0 and t.status=0 and h.proxy_hostid is null and h.status=0 and (h.maintenance_status=0 or h.maintenance_type=0);
                          # User@Host: zabbix[zabbix] @ prodsamzmgt1 [192.168.11.4]
                          # Query_time: 0.000289  Lock_time: 0.000084 Rows_sent: 1  Rows_examined: 11
                          SET timestamp=1409376056;
                          select min(t.nextcheck) from httptest t,hosts h where t.hostid=h.hostid and mod(t.httptestid,10)=6 and t.status=0 and h.proxy_hostid is null and h.status=0 and (h.maintenance_status=0 or h.maintenance_type=0);
                          # User@Host: zabbix[zabbix] @ prodsamzmgt1 [192.168.11.4]
                          # Query_time: 0.000266  Lock_time: 0.000063 Rows_sent: 1  Rows_examined: 11
                          SET timestamp=1409376056;
                          select min(t.nextcheck) from httptest t,hosts h where t.hostid=h.hostid and mod(t.httptestid,10)=0 and t.status=0 and h.proxy_hostid is null and h.status=0 and (h.maintenance_status=0 or h.maintenance_type=0);
                          I might be getting confused here, but my slow log query time is 1s in the my.cnf, but the log file shows query times of 0.000266s.

                          Why is the log file showing these queries as slow if the query time is not?

                          If this is normal, then what other settings should I be looking at on MySQL to improve the slow queries?
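
One likely explanation, which nobody in this thread confirms: the my.cnf above also sets log-queries-not-using-indexes = 1, and with that option MySQL writes queries that do not use an index to the slow query log regardless of long_query_time. The sketch below picks out such "fast but logged" entries using the two Query_time lines quoted above; the function name is made up for illustration.

```python
import re

# Matches the "# Query_time: <seconds>" header that precedes each logged query.
QUERY_TIME = re.compile(r"^# Query_time: (\d+\.\d+)")


def below_threshold(log_lines, long_query_time):
    """Return the query times that are shorter than long_query_time."""
    times = []
    for line in log_lines:
        match = QUERY_TIME.match(line.strip())
        if match:
            seconds = float(match.group(1))
            if seconds < long_query_time:
                times.append(seconds)
    return times


# Header lines copied from the slow log excerpt in the post above.
excerpt = """\
# Query_time: 0.000289  Lock_time: 0.000084 Rows_sent: 1  Rows_examined: 11
# Query_time: 0.000266  Lock_time: 0.000063 Rows_sent: 1  Rows_examined: 11
""".splitlines()

fast = below_threshold(excerpt, long_query_time=1.0)
print(len(fast), "logged entries are faster than long_query_time")
```

If most entries in the full log turn out to be this fast, they are probably being logged because of the index option rather than because they are slow.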

                          Regards,

                          Dawid

                          Comment

                          • dvwyngaa
                            Member
                            • Mar 2014
                            • 49

                            #14
                            Ingus,

                            Forgot to add my MySQL Stats from the mysql-tuner script. Here it is:

                            Code:
                            >>  MySQLTuner 1.3.0 - Major Hayden <[email protected]>
                             >>  Bug reports, feature requests, and downloads at http://mysqltuner.com/
                             >>  Run with '--help' for additional options and output filtering
                            [!!] Successfully authenticated with no password - SECURITY RISK!
                            [OK] Currently running supported MySQL version 5.5.39-log
                            [OK] Operating on 64-bit architecture
                            
                            -------- Storage Engine Statistics -------------------------------------------
                            [--] Status: +ARCHIVE +BLACKHOLE +CSV -FEDERATED +InnoDB +MRG_MYISAM 
                            [--] Data in MyISAM tables: 5K (Tables: 4)
                            [--] Data in InnoDB tables: 2G (Tables: 108)
                            [--] Data in PERFORMANCE_SCHEMA tables: 0B (Tables: 17)
                            [!!] Total fragmented tables: 11
                            
                            -------- Security Recommendations  -------------------------------------------
                            [OK] All database users have passwords assigned
                            
                            -------- Performance Metrics -------------------------------------------------
                            [--] Up for: 2h 47m 11s (112K q [11.239 qps], 3K conn, TX: 134M, RX: 25M)
                            [--] Reads / Writes: 74% / 26%
                            [--] Total buffers: 12.5G global + 7.2M per thread (500 max threads)
                            [!!] Maximum possible memory usage: 16.1G (103% of installed RAM)
                            [!!] Slow queries: 35% (40K/112K)
                            [OK] Highest usage of available connections: 17% (87/500)
                            [OK] Key buffer size / total MyISAM indexes: 8.0M/105.0K
                            [OK] Key buffer hit rate: 100.0% (112 cached / 0 reads)
                            [OK] Query cache efficiency: 23.0% (16K cached / 70K selects)
                            [OK] Query cache prunes per day: 0
                            [OK] Sorts requiring temporary tables: 0% (0 temp sorts / 275 sorts)
                            [OK] Temporary tables created on disk: 6% (268 on disk / 4K total)
                            [OK] Thread cache hit rate: 97% (87 created / 3K connections)
                            [OK] Table cache hit rate: 96% (174 open / 181 opened)
                            [OK] Open file limit used: 0% (59/65K)
                            [OK] Table locks acquired immediately: 100% (116K immediate / 116K locks)
                            [OK] InnoDB buffer pool / data size: 12.0G/2.3G
                            [OK] InnoDB log waits: 0
                            -------- Recommendations -----------------------------------------------------
                            General recommendations:
                                Run OPTIMIZE TABLE to defragment tables for better performance
                                MySQL started within last 24 hours - recommendations may be inaccurate
                                Reduce your overall MySQL memory footprint for system stability
                            My worry here is the "Slow Queries" being at 35%
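
As a sanity check on two of the numbers in that report, the arithmetic can be redone by hand. This is only a sketch: it assumes the tuner's maximum is "global buffers plus per-thread buffers times max threads", which is what the output suggests, and it uses the rounded values as printed, so small differences from the report are expected.

```python
def max_memory_gib(global_gib: float, per_thread_mib: float, max_threads: int) -> float:
    """Worst-case memory if every allowed thread used its full buffers."""
    return global_gib + per_thread_mib * max_threads / 1024


# Values as printed in the MySQLTuner output above.
total = max_memory_gib(global_gib=12.5, per_thread_mib=7.2, max_threads=500)
slow_pct = 40 / 112 * 100  # 40K slow queries out of 112K

print(f"maximum possible memory: {total:.1f} GiB")
print(f"slow query share:        {slow_pct:.1f}%")
```

Most of the worst-case figure comes from max-connections = 500, although the report shows a peak of only 87 connections in use.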

                            Dawid

                            Comment

                            • kargh
                              Junior Member
                              • Feb 2014
                              • 21

                              #15
                              As posted somewhere in the documentation, I learned a heck of a lot about how my system was running using this command:
                              Code:
                              watch -n 0.2 ps -fu zabbix
                              Not sure if it'll help you but it might provide some insight into how the pollers are working for your setup.
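
To turn that ps output into a quick summary, something like the sketch below could count processes per type and how many are not idle. The sample lines and the exact process-title format are assumptions made for illustration; check them against what ps prints on your own system before relying on the regular expression.

```python
import re
from collections import Counter

# Assumed title shape: "zabbix_server: <type> #<n> [<state text>]".
TITLE = re.compile(r"zabbix_(?:server|proxy): ([a-z ]+?) #\d+ \[(.*)\]")


def summarize(ps_lines):
    """Count processes per type, and those whose state does not mention 'idle'."""
    total, busy = Counter(), Counter()
    for line in ps_lines:
        match = TITLE.search(line)
        if not match:
            continue
        kind, state = match.groups()
        total[kind] += 1
        if "idle" not in state:
            busy[kind] += 1
    return total, busy


# Hypothetical ps lines, not captured from a real system.
sample = [
    "zabbix 2001 1999 0 10:00 ? 00:00:02 zabbix_server: poller #1 [got 4 values in 0.012 sec, idle 1 sec]",
    "zabbix 2002 1999 0 10:00 ? 00:00:09 zabbix_server: poller #2 [got 0 values in 0.000 sec, getting values]",
    "zabbix 2003 1999 0 10:00 ? 00:00:01 zabbix_server: history syncer #1 [synced 20 items in 0.03 sec, idle 5 sec]",
]

total, busy = summarize(sample)
for kind in sorted(total):
    print(f"{kind}: {busy[kind]} busy of {total[kind]}")
```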

                              Comment
