Zabbix 3.4 will not support more than 50 hosts; all hosts are unreachable on dashboard

  • mellis
    Senior Member
    • Oct 2017
    • 145

    #1

    Zabbix 3.4 will not support more than 50 hosts; all hosts are unreachable on dashboard

    I have upgraded one site to 3.4 and it will not support 50 hosts; all hosts trigger as unreachable.
    If I look in Configuration / Hosts, they show they are responding, meaning I have a green indicator, and I see recent data in Monitoring / Latest data. My dashboard shows all hosts as unavailable.

    My servers are VMs. The database server has 4 vCPU, 24GB RAM and 200GB disk space. The Zabbix server and web GUI share one VM with 4 vCPU, 24GB RAM and 180GB disk.
    The configuration syncer is very busy, averaging 44%, and the housekeeper goes to 100% for 30 minutes every hour. I have cut the history tables down to 7 days and trends to 90 days.
    On the same system, 3.2 did not have any issues.

    We want to roll this out on a ~50 site WAN and need 3.4 with the multiple dashboards so that several teams can drill down on their servers.

    I can run simple queries and find that the hosts are available, so I am thinking that the dashboard cannot update as needed.
    I have been working on this issue for 2 months now and can only make it worse, never improve the dashboard.

    my.cnf
    [client]
    user = monitoring
    password = monitoring

    [mysqld]
    show_compatibility_56 = ON
    performance_schema
    innodb_buffer_pool_size = 2G
    innodb_data_home_dir=/home/mysql
    innodb_file_per_table = 1
    innodb_buffer_pool_instances = 19
    innodb_buffer_pool_size = 18G
    innodb_page_cleaners=19
    tmp_table_size = 42M
    max_heap_table_size = 42M
    join_buffer_size = 1G
    sort_buffer_size = 4M
    read_rnd_buffer_size = 4M
    query_cache_size = 0
    query_cache_type = 0
    query_cache_limit = 2M
    max_connections = 376
    wait_timeout = 14400
    interactive_timeout = 14400
    datadir=/home/mysql
    socket=/var/lib/mysql/mysql.sock
    symbolic-links=0
    #logging stuff
    log-error=/var/log/mysqld.log
    slow_query_log = 1
    slow_query_log_file = /var/log/slow_quiery.log
    pid-file=/var/run/mysqld/mysqld.pid

    zabbix-server.conf
    LogFileSize=8
    DebugLevel=5
    StartPollers=154
    StartPollersUnreachable=48
    StartTrappers=56
    StartTimers=8
    StartEscalators=10
    MaxHousekeeperDelete=100000
    CacheSize=64M
    StartDBSyncers=24
    HistoryCacheSize=768M
    HistoryIndexCacheSize=96M
    TrendCacheSize=128M
    ValueCacheSize=128M
    Timeout=30




  • mellis
    Senior Member
    • Oct 2017
    • 145

    #2
    I am still showing 100% of my hosts as unreachable. I need help, please.

    • HaveDill
      Senior Member
      • Sep 2014
      • 103

      #3
      Maybe restart httpd? Does /var/www/html/zabbix/conf/zabbix.conf.php contain the proper DB information?

      • mellis
        Senior Member
        • Oct 2017
        • 145

        #4
        I have restarted httpd and rebooted that server. I can connect to the database from the server with mysql -hxx.xx.xx.xx -uxxxx -p and run queries on the database. I wrote a simple PHP web page to list all the hosts, and that works.
        If I stop all the processes (zabbix-server, httpd, and the database on the database server) and then restart all services, the dashboard will clear all triggers for a couple of hours.
        Also, I see gaps in the graphs that start about when the hosts start going into the unreachable state.

        Using a select for max(clock), the result is only a second or so behind.

        • HaveDill
          Senior Member
          • Sep 2014
          • 103

          #5
          Can you check your Zabbix cache usage % graphs?

          • mellis
            Senior Member
            • Oct 2017
            • 145

            #6
            Attached are the cache, internal and data gathering graphs for the last 6 hours.

            • HaveDill
              Senior Member
              • Sep 2014
              • 103

              #7
              Hm, that all looks pretty normal, minus the configuration syncer being a little weird.

              • mellis
                Senior Member
                • Oct 2017
                • 145

                #8
                Yeah, I never could figure that out. One other thing I have seen is that there are 70857 connections to the database.

                • HaveDill
                  Senior Member
                  • Sep 2014
                  • 103

                  #9
                  That sounds like your problem right there... Does your database not have connection limits? Are they all coming from the Zabbix server?

                  • mellis
                    Senior Member
                    • Oct 2017
                    • 145

                    #10
                    All connections are from the one server. I have max_connections set at 376, and max_used shows 314.
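
One possible reading of these numbers, offered as an assumption rather than a diagnosis: MySQL's `Connections` status counter is cumulative (connection attempts since the server started), while `Max_used_connections` is the peak number open at once. If the 70857 figure came from `Connections`, it would not conflict with a `max_connections` of 376. A minimal Python sketch of that comparison follows; the helper name and return shape are hypothetical, and the sample input uses the figures quoted in this thread.

```python
def connection_headroom(status, max_connections):
    """Summarize MySQL connection counters.

    `status` is a dict shaped like rows from SHOW GLOBAL STATUS.
    The helper name and return shape are illustrative, not a library API.
    """
    peak = int(status["Max_used_connections"])
    return {
        # Cumulative connection attempts since server start, not a live count.
        "total_attempts": int(status["Connections"]),
        "peak_concurrent": peak,
        "peak_utilization": round(peak / max_connections, 3),
        "limit_reached": peak >= max_connections,
    }


# Figures quoted in this thread; treating 70857 as `Connections` is an assumption.
sample = {"Connections": "70857", "Max_used_connections": "314"}
report = connection_headroom(sample, max_connections=376)
print(report)
```

Under that assumption the peak sits below the configured limit, so the large total would reflect connection churn over time rather than 70857 simultaneous sessions.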

                    • mellis
                      Senior Member
                      • Oct 2017
                      • 145

                      #11
                      Hello again, I am still having a problem getting the Zabbix dashboard to report accurate information. 100% of the hosts are reporting unreachable, which is incorrect. I can contact the hosts with RDP and telnet to port 10050.
                      When I look at the internal processes, I see the configuration syncer process bounce to 100% every 2 minutes, and housekeeping will go to 100% once an hour for 5 or 10 minutes. I did not have this problem with 3.2; it started when I upgraded to 3.4. I have also updated to 4.0 alpha 6.

                      If I restart zabbix-server, the dashboard will recover after about 15 minutes and be correct for an hour or two.

                      I have also found this in the mysql.log:
                      2018-05-23T12:13:45.823089Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4763ms. The settings might not be optimal. (flushed=1982 and evicted=0, during the time.)
                      2018-05-23T12:20:56.002711Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4515ms. The settings might not be optimal. (flushed=2641 and evicted=0, during the time.)
                      2018-05-23T12:21:26.569105Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4465ms. The settings might not be optimal. (flushed=3219 and evicted=0, during the time.)
                      2018-05-23T12:30:37.846721Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 7618ms. The settings might not be optimal. (flushed=954 and evicted=0, during the time.)
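
The page_cleaner notes above are easier to compare once the numbers are pulled out of the text. A minimal Python sketch, assuming the log lines keep exactly the format quoted here; the function name and dictionary keys are hypothetical.

```python
import re

# Matches the InnoDB page_cleaner note format shown in the log excerpt.
PAGE_CLEANER = re.compile(
    r"^(?P<ts>\S+) .*page_cleaner: (?P<intended>\d+)ms intended loop "
    r"took (?P<took>\d+)ms\..*\(flushed=(?P<flushed>\d+) and evicted=(?P<evicted>\d+)"
)


def parse_page_cleaner(lines):
    """Return one dict per page_cleaner note found in `lines`."""
    events = []
    for line in lines:
        match = PAGE_CLEANER.search(line.strip())
        if match:
            events.append({
                "timestamp": match["ts"],
                "intended_ms": int(match["intended"]),
                "took_ms": int(match["took"]),
                "flushed": int(match["flushed"]),
                "evicted": int(match["evicted"]),
            })
    return events


log = [
    "2018-05-23T12:13:45.823089Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4763ms. The settings might not be optimal. (flushed=1982 and evicted=0, during the time.)",
    "2018-05-23T12:20:56.002711Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4515ms. The settings might not be optimal. (flushed=2641 and evicted=0, during the time.)",
    "2018-05-23T12:21:26.569105Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4465ms. The settings might not be optimal. (flushed=3219 and evicted=0, during the time.)",
    "2018-05-23T12:30:37.846721Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 7618ms. The settings might not be optimal. (flushed=954 and evicted=0, during the time.)",
]
events = parse_page_cleaner(log)
worst = max(events, key=lambda e: e["took_ms"])
print(len(events), worst["timestamp"], worst["took_ms"])
```

Lining the extracted timestamps up against the times the dashboard starts reporting hosts as unreachable would be one way to check whether the two really begin together.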

                      When I restart the database and zabbix-server at the same time, the dashboard errors seem to start about the same time as the mysql errors start up.

                      I have also seen MySQL log entries that mention connection issues:
                      2018-05-22T15:51:23.383002Z 175 [Note] Aborted connection 175 to db: 'zabbix' user: 'zabbix' host: 'zabbix32' (Got an error reading communication packets)
                      2018-05-22T15:51:23.383747Z 236 [Note] Aborted connection 236 to db: 'zabbix' user: 'zabbix' host: 'zabbix32' (Got an error reading communication packets)
                      2018-05-22T15:51:23.384806Z 41 [Note] Aborted connection 41 to db: 'zabbix' user: 'zabbix' host: 'zabbix32' (Got an error reading communication packets)

                      Just to recap again:
                      Number of hosts: 544
                      Number of items: 42529
                      Number of triggers: 6623
                      Required performance: 366.47 new values per second

                      Web and zabbix-server are on one VM: 4 vCPU, 24GB RAM
                      MySQL server is on one VM: 4 vCPU, 24GB RAM

                      my.cnf
                      show_compatibility_56 = ON
                      performance_schema
                      innodb_buffer_pool_size = 20G
                      innodb_data_home_dir=/home/mysql
                      innodb_file_per_table = 1
                      innodb_buffer_pool_instances = 19
                      innodb_buffer_pool_size = 18G
                      innodb_page_cleaners=64
                      tmp_table_size = 42M
                      max_heap_table_size = 42M
                      max_allowed_packet = 1024M
                      join_buffer_size = 1G
                      sort_buffer_size = 4M
                      read_rnd_buffer_size = 4M
                      query_cache_size = 0
                      query_cache_type = 0
                      query_cache_limit = 2M
                      max_connections = 376
                      wait_timeout = 14400
                      interactive_timeout = 14400
                      datadir=/home/mysql
                      socket=/var/lib/mysql/mysql.sock
                      # Disabling symbolic-links is recommended to prevent assorted security risks
                      symbolic-links=0
                      #logging stuff
                      log-error=/var/log/mysqld.log
                      slow_query_log = 1
                      slow_query_log_file = /var/log/slow_quiery.log
                      pid-file=/var/run/mysqld/mysqld.pid

                      zabbix-server.conf
                      LogFileSize=16
                      DebugLevel=5
                      StartPollers=154
                      StartPollersUnreachable=48
                      StartTrappers=56
                      StartTimers=8
                      StartEscalators=10
                      MaxHousekeeperDelete=100000
                      CacheSize=96M
                      StartDBSyncers=32
                      HistoryCacheSize=768M
                      HistoryIndexCacheSize=96M
                      TrendCacheSize=128M
                      ValueCacheSize=128M
                      Timeout=30
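
One detail in the my.cnf listing above: `innodb_buffer_pool_size` appears twice (20G, then 18G). MySQL reads option files in order and the last occurrence of an option takes effect, so 18G would be the value in use. A small Python sketch for spotting such repeats, assuming a simple `key = value` layout; the helper name is made up for illustration and the sample is a shortened copy of the listing.

```python
from collections import defaultdict


def find_repeated_options(text):
    """Map each option set more than once to its values, in file order.

    Simplification: section headers are ignored, so the same option name
    in two different [sections] would also be reported.
    """
    seen = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and [section] headers.
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, _, value = line.partition("=")
        # MySQL treats dashes and underscores in option names as equivalent.
        seen[key.strip().replace("-", "_")].append(value.strip())
    return {key: values for key, values in seen.items() if len(values) > 1}


sample = """\
[mysqld]
innodb_buffer_pool_size = 20G
innodb_file_per_table = 1
innodb_buffer_pool_instances = 19
innodb_buffer_pool_size = 18G
"""
repeats = find_repeated_options(sample)
for key, values in repeats.items():
    print(f"{key}: set {len(values)} times, last value {values[-1]}")
```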
