High queue in zabbix server (Problem)

  • Nailio
    Junior Member
    • Jun 2018
    • 7

    #1

    High queue in zabbix server (Problem)

    Hi to everyone!

    I am rolling out Zabbix for a large environment at my company's request, and I am stuck on a high Zabbix queue on the server.
    For testing purposes I installed the server and MariaDB on the same VM; later I plan to separate them and use an HA database. Currently all hosts are monitored through a single proxy (also a VM) with the same specs as the server.

    The server specifications I am using are:

    OS: CentOS 7 (64-bit)
    CPUs: 8
    RAM: 8GB
    Disk Provisioned Size: 200 GB SSD
    Zabbix status (Parameter | Value | Details):
    Zabbix server is running | Yes | xxxx
    Number of hosts (enabled/disabled/templates) | 337 | 256 / 0 / 81
    Number of items (enabled/disabled/not supported) | 100729 | 98815 / 1 / 1913
    Number of triggers (enabled/disabled [problem/ok]) | 1893 | 1893 / 0 [19 / 1874]
    Number of users (online) | 2 | 1
    Required server performance, new values per second | 80.94 |
    The Zabbix proxy collects data every 5 minutes, so at those peaks there are ~600 NVPS. However, the Zabbix server queue reaches around 50k values, and I cannot find out what the problem could be.
    I have another Zabbix server that collects 60% of those items without a proxy, and it works fine without any queue. I also checked NTP sync, and there is no problem on that side either.

    What is the highest NVPS you have collected through a single proxy in practice?

    I hope for some help or pointers from this community...

    Below I pasted my zabbix_xx.conf (server and proxy) and the Mysql settings.

    Server:
      LogFile=/var/log/zabbix/zabbix_server.log
      LogFileSize=0
      PidFile=/var/run/zabbix/zabbix_server.pid
      SocketDir=/var/run/zabbix
      DBName=zabbix
      DBUser=zabbix
      DBPassword=pass
      StartPollers=32
      StartPollersUnreachable=32
      SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
      HousekeepingFrequency=1
      MaxHousekeeperDelete=500000
      CacheSize=2G
      StartDBSyncers=30
      HistoryCacheSize=2G
      HistoryIndexCacheSize=2G
      TrendCacheSize=2G
      ValueCacheSize=2G
      Timeout=25
      AlertScriptsPath=/usr/lib/zabbix/alertscripts
      ExternalScripts=/usr/lib/zabbix/externalscripts
      LogSlowQueries=3000

    Proxy:
      Server=xxxx
      Hostname=zabbix-proxy-1
      LogFile=/var/log/zabbix/zabbix_proxy.log
      LogFileSize=0
      PidFile=/var/run/zabbix/zabbix_proxy.pid
      SocketDir=/var/run/zabbix
      DBName=zabbix_proxy
      DBUser=zabbix
      DBPassword=xxxx
      StartPollers=500
      StartPollersUnreachable=400
      SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
      HousekeepingFrequency=1
      CacheSize=1G
      StartDBSyncers=16
      HistoryCacheSize=1G
      Timeout=30
      ExternalScripts=/usr/lib/zabbix/externalscripts
      LogSlowQueries=3000

    Mysql:
      [mysqld_safe]
      log-error=/var/log/mariadb/mariadb.log
      pid-file=/var/run/mariadb/mariadb.pid

      [mysqld]
      large-pages
      binlog-row-event-max-size = 8192
      binlog-format = MIXED
      character_set_server = utf8
      collation_server = utf8_bin
      expire_logs_days = 1
      join_buffer_size = 262144
      max_allowed_packet = 32M
      max_connect_errors = 10000
      max_connections = 1500
      max_heap_table_size = 134217728
      query_cache_size = 256M
      table_open_cache = 2048
      thread_cache_size = 64
      wait_timeout = 86400

    Thanks in advance!
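
One thing that may be worth checking in the server configuration above is how the five shared-memory cache settings add up against the 8 GB of RAM on the VM, which also hosts MariaDB. The sketch below only sums the values quoted in the post; it assumes "8GB" means 8 GiB and treats the cache sizes as configured upper bounds, not as memory that is necessarily resident. The helper `parse_size` is a hypothetical name introduced for illustration.

```python
# Sum the cache sizes from the posted zabbix_server.conf and compare
# them with the VM's RAM. All values are copied from the first post.

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text):
    """Convert a size such as '2G' or '256M' to bytes (hypothetical helper)."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)


server_caches = {
    "CacheSize": "2G",
    "HistoryCacheSize": "2G",
    "HistoryIndexCacheSize": "2G",
    "TrendCacheSize": "2G",
    "ValueCacheSize": "2G",
}

ram_bytes = 8 * UNITS["G"]  # assumption: "8GB" in the post means 8 GiB
total_cache_bytes = sum(parse_size(v) for v in server_caches.values())

print(f"configured caches: {total_cache_bytes / UNITS['G']:.1f} GiB")
print(f"VM RAM:            {ram_bytes / UNITS['G']:.1f} GiB")
if total_cache_bytes > ram_bytes:
    print("configured caches exceed the VM's RAM")
else:
    print("configured caches fit in the VM's RAM")
```

Whether this contributes to the queue is not established in the thread; it is simply a quick consistency check on the posted numbers.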
  • kernbug
    Senior Member
    • Feb 2013
    • 330

    #2
    Originally posted by Nailio
    Hi to everyone!

    StartDBSyncers=30
    Thanks in advance!
    Please give us the server and proxy performance graphs, and also the output of iostat -d -x -h 2


    • Nailio
      Junior Member
      • Jun 2018
      • 7

      #3
      Here you are...

      On the second graph (zabbix-server 12h), before the queue appeared, all items were being collected by the server itself.
      Attached Files
      Last edited by Nailio; 27-06-2018, 12:57.


      • kernbug
        Senior Member
        • Feb 2013
        • 330

        #4
        Originally posted by Nailio
        Here you are...

        On the second graph (zabbix-server 12h), before the queue appeared, all items were being collected by the server itself.
        Not the Zabbix queue graph; the internal process performance graphs, please!


        • Nailio
          Junior Member
          • Jun 2018
          • 7

          #5
          I attached the data gathering process graphs as well. Thanks.
          Attached Files


          • kernbug
            Senior Member
            • Feb 2013
            • 330

            #6
            Originally posted by Nailio
            I attached the data gathering process graphs as well. Thanks.
            Hi,

            These spikes on the Zabbix proxy graph are rather strange. Are you using only passive checks? Why do you need 500 pollers? Usually 100-150 is enough, and start with StartDBSyncers=4 on the proxy.
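
A rough way to sanity-check a poller count is a back-of-the-envelope capacity estimate: checks per second multiplied by the average time one check occupies a poller, divided by how busy each poller should be. This is a generic queuing estimate, not a formula from the Zabbix documentation. The function `pollers_needed`, the check durations, and the 75% busy target are all assumptions for illustration; the 600 per second figure is the peak NVPS quoted in the first post, used here as a stand-in for the check rate.

```python
def pollers_needed(checks_per_second, avg_check_ms, target_busy_percent=75):
    """Estimate how many pollers keep each one about target_busy_percent busy.

    Uses integer arithmetic so the result is exact:
    ceil(checks_per_second * avg_check_ms / 1000 / (target_busy_percent / 100)).
    """
    if not 0 < target_busy_percent <= 100:
        raise ValueError("target_busy_percent must be in (0, 100]")
    numerator = checks_per_second * avg_check_ms * 100
    denominator = 1000 * target_busy_percent
    return -(-numerator // denominator)  # ceiling division


peak_checks_per_second = 600  # peak rate quoted in the first post

# Hypothetical average check durations in milliseconds.
for avg_check_ms in (50, 100, 200):
    count = pollers_needed(peak_checks_per_second, avg_check_ms)
    print(f"{avg_check_ms} ms per check -> about {count} pollers")
```

Under these assumed durations the estimate stays well below 500, which is in line with the 100-150 suggested above; slow or timing-out checks would push the number up sharply, since each one holds a poller for up to the full Timeout.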



            • Nailio
              Nailio commented
              Yeah, that's why I got stuck... Well, the reason I set StartPollers and StartDBSyncers this high is that the busy poller percentage kept growing (up to 100%) as more items were added. Today we changed the proxy default parameter ZBX_MAX_HRECORDS, which limits the size of the batch of monitoring data the proxy sends. So let's see how helpful that will be.
          • zdenek.spichal
            Junior Member
            • Apr 2018
            • 4

            #7
            Hi,
            I have some questions here.
            I see that you monitor 256 hosts with about 400 items on each of them
            (more than 100,000 items in total, if I read it correctly).
            Do you really need that many items on each host?
            What about the collection interval: do you need all of them every 5 minutes?
            Next, look at the number of pollers and the other settings.
            Having a lot of not supported items is not good either.
            Zdenek
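
The relationship between item count, update interval, and load raised in these questions can be worked through with the numbers from the first post. The sketch below is plain arithmetic on those figures; the helper `nvps` and the candidate intervals are illustrative assumptions, and the "implied interval" is only an overall average (strictly, a harmonic mean of the item intervals), not something reported by Zabbix.

```python
# Figures copied from the status table in the first post.
enabled_hosts = 256
enabled_items = 98815
required_nvps = 80.94  # "Required server performance, new values per second"

items_per_host = enabled_items / enabled_hosts
implied_interval = enabled_items / required_nvps  # seconds, overall average


def nvps(item_count, interval_seconds):
    """New values per second if every item used the same update interval."""
    return item_count / interval_seconds


print(f"items per host: {items_per_host:.0f}")
print(f"implied average interval: {implied_interval:.0f} s")

# Hypothetical uniform intervals, to show how the load scales.
for interval in (60, 300, 900):
    rate = nvps(enabled_items, interval)
    print(f"all items every {interval} s -> {rate:.1f} NVPS")
```

Doubling the interval halves the rate, so lengthening intervals on items that change rarely is usually the cheapest way to cut load.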


            • hugo.jose
              Junior Member
              • Jul 2018
              • 10

              #8
              Sorry for the delay, I was away for a while. I will do some cleanup and come back to this point later.
              I will post an update after the cleanup.


              • kloczek
                Senior Member
                • Jun 2006
                • 1771

                #9
                StartPollersUnreachable=400
                StartPollers=500

                Yep, you need to start thinking about moving away from passive agent monitoring and switching to active agent monitoring.
                http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
                https://kloczek.wordpress.com/
                zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
                My zabbix templates https://github.com/kloczek/zabbix-templates



                • Nailio
                  Nailio commented
                  Hi, thanks for the reply! I've been using only active proxies since the beginning!

                • nobre255
                  nobre255 commented
                  When I used this configuration, my Zabbix server stopped.
              • Nailio
                Junior Member
                • Jun 2018
                • 7

                #10
                Well, I do understand that the number of items on my proxy is high, but my question is about something else...
                1. I have a Zabbix server that collects all items by itself (without a proxy).
                2. I have another Zabbix server that uses a proxy to collect from the same hosts.

                Both of them have busy pollers (the config on the server and the proxy is the same), but the first one does not have a high queue, unlike the one with the proxy. Does anyone know what the problem could be?

                Attached Files


                • kloczek
                  Senior Member
                  • Jun 2006
                  • 1771

                  #11
                  Active monitoring is not about using an active proxy.
                  It is about using the active agent setup and "Zabbix agent (active)" item types instead of "Zabbix agent".

                  Generally, the problem is that passive monitoring does not scale well as the number of items and the NVPS per proxy/server grow.
                  In other words, above a certain scale of monitoring you will be forced to use only the active agent setup.
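
To make the distinction concrete: in the Zabbix API, the item type is a numeric field, where 0 is "Zabbix agent" (passive) and 7 is "Zabbix agent (active)". The sketch below only builds JSON-RPC request bodies for the `item.get` and `item.update` methods and does not send anything. The function names and the host and item IDs are hypothetical, and authentication is deliberately left out because how the token is passed depends on the Zabbix version. It also assumes the items are defined directly on the host; for templated items the type would normally be changed on the template instead.

```python
import json

ZABBIX_AGENT = 0          # item type "Zabbix agent" (passive check)
ZABBIX_AGENT_ACTIVE = 7   # item type "Zabbix agent (active)"


def build_item_get(host_id, request_id=1):
    """Request body that lists the passive agent items of one host."""
    return {
        "jsonrpc": "2.0",
        "method": "item.get",
        "params": {
            "output": ["itemid", "key_"],
            "hostids": host_id,
            "filter": {"type": ZABBIX_AGENT},
        },
        "id": request_id,
    }


def build_item_updates(item_ids, first_request_id=2):
    """One item.update request body per item, switching it to active."""
    return [
        {
            "jsonrpc": "2.0",
            "method": "item.update",
            "params": {"itemid": item_id, "type": ZABBIX_AGENT_ACTIVE},
            "id": first_request_id + offset,
        }
        for offset, item_id in enumerate(item_ids)
    ]


# Hypothetical IDs, for illustration only.
print(json.dumps(build_item_get("10084"), indent=2))
for body in build_item_updates(["23296", "23297"]):
    print(json.dumps(body))
```

For active checks to work, the agents themselves also need ServerActive pointing at the proxy or server and a Hostname that matches the host name configured in Zabbix.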