Delay in collection of values under "Latest Data"

  • kanchan
    Junior Member
    • Dec 2013
    • 23

    #1

    Delay in collection of values under "Latest Data"

    Hi,

    We are using Zabbix 2.0.9. Setup details are attached for reference.

    We have a large setup spanning 2 datacenters, with multiple proxies collecting data.
    We have tuned the MySQL database configuration, partitioned the tables, and tuned the Zabbix configuration as well.

    Right now we are facing the following 2 issues:

    1. Sometimes values under Latest Data are updated late, with delays of more than 15 minutes.
    2. If I configure more than 150 hosts (around 60-70 items per host) under one proxy, we stop getting monitoring data.

    It would be great if someone could advise on the following points:

    1. Should we concentrate on the Zabbix server and proxy configuration? If yes, is there any reference for tuning the values in the configuration files?
    2. I couldn't find any benchmark documentation for proxies. Is there a limit on the number of items per proxy?

    Please let me know if you need any more details about my setup.

    Please guide.

    Thanks
    Kanchan
    Attached Files
    Last edited by kanchan; 04-01-2017, 16:00.
  • kanchan
    Junior Member
    • Dec 2013
    • 23

    #2
    It would be great if someone could reply. We are now facing this issue with multiple servers.

    • kanchan
      Junior Member
      • Dec 2013
      • 23

      #3
      I keep getting the logs below after enabling a high debug level on the agents where we see the delay:

      9835:20170106:102548.797 In update_cpustats()
      9835:20170106:102548.798 End of update_cpustats()
      9835:20170106:102549.798 In update_cpustats()
      9835:20170106:102549.798 End of update_cpustats()
      9835:20170106:102550.799 In update_cpustats()
      9835:20170106:102550.799 End of update_cpustats()
      9835:20170106:102551.799 In update_cpustats()
      9835:20170106:102551.800 End of update_cpustats()
      9835:20170106:102552.800 In update_cpustats()
      9835:20170106:102552.800 End of update_cpustats()
      9835:20170106:102553.801 In update_cpustats()
      9835:20170106:102553.801 End of update_cpustats()
      9835:20170106:102554.801 In update_cpustats()



      I searched the forum and found that it is an unresolved issue:

      https://support.zabbix.com/browse/ZBX-4217


      Can anyone please help me understand what these logs mean and what causes them? This happens only for a few clients.

      • kanchan
        Junior Member
        • Dec 2013
        • 23

        #4
        What is the benchmark for Zabbix proxy?

        We have fixed the issue above. One of the templates was wrongly configured.

        But we still see delayed updates when we add new hosts. Once a host's data has been updated the first time, it is collected on time.

        Please confirm whether there is any limit on the number of items per Zabbix proxy.
        Is there any benchmark document?

        • Pada
          Senior Member
          • Apr 2012
          • 236

          #5
          Are you perhaps using a Zabbix Proxy? By default the Zabbix Proxy updates its configuration once per hour.

          Also, if you're using Zabbix Agent (Active) checks, then take note that the default Zabbix Agent "RefreshActiveChecks" is 120s.

          So worst case you'll have to wait 62 minutes for the active Zabbix Agent item to start sending data...

          In our environment I've changed the "ConfigFrequency" configuration on the proxy to like 5 minutes so that I don't have to wait that long every time I make changes to the hosts that are monitored by the proxy.
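
The worst-case figure above can be checked with a quick calculation. This is only a minimal sketch: it assumes the two defaults quoted in this post (3600 s for the proxy configuration sync, 120 s for RefreshActiveChecks) and that a change just misses both refresh cycles. The function name is made up for illustration.

```python
# Worst-case wait before a newly added active item starts reporting,
# using the defaults quoted in the post (both values are in seconds).
PROXY_CONFIG_FREQUENCY = 3600   # proxy ConfigFrequency default (1 hour)
REFRESH_ACTIVE_CHECKS = 120     # agent RefreshActiveChecks default


def worst_case_wait_minutes(config_frequency: int, refresh_active_checks: int) -> float:
    """Worst case: the change just misses a proxy config sync, and the
    agent then just misses its next active-check refresh."""
    return (config_frequency + refresh_active_checks) / 60


if __name__ == "__main__":
    print(worst_case_wait_minutes(PROXY_CONFIG_FREQUENCY, REFRESH_ACTIVE_CHECKS))  # 62.0
    print(worst_case_wait_minutes(300, REFRESH_ACTIVE_CHECKS))  # 7.0
```

With ConfigFrequency lowered to 5 minutes, the same reasoning gives a worst case of about 7 minutes.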

          • kanchan
            Junior Member
            • Dec 2013
            • 23

            #6
            Thanks Pada for your reply. Appreciate it.
            I will check these configuration parameters and test.

            Do you have any idea how many items we can add per proxy ?

            • Pada
              Senior Member
              • Apr 2012
              • 236

              #7
              There is no limit really for the proxy. It all depends on the amount of physical resources (eg. CPU, RAM, HDD) and your configuration (# of pollers, database caching, etc).

              For instance we're currently doing 990 new values per second and from that I'd guess that 850 values per second comes from 2 proxies (with MySQL server running on them too) of ours, which are both m1.small instances on Amazon. They struggled to keep up when they were t1.micro.

              • kloczek
                Senior Member
                • Jun 2006
                • 1771

                #8
                Originally posted by Pada
                There is no limit really for the proxy. It all depends on the amount of physical resources (eg. CPU, RAM, HDD) and your configuration (# of pollers, database caching, etc).

                For instance we're currently doing 990 new values per second and from that I'd guess that 850 values per second comes from 2 proxies (with MySQL server running on them too) of ours, which are both m1.small instances on Amazon. They struggled to keep up when they were t1.micro.
                There is one limit here.
                In include/proxy.h you can find the line:
                Code:
                #define ZBX_MAX_HRECORDS       1000
                This is the maximum number of metric values the proxy pushes in one cycle of communication between proxy and server.
                If the flow of data from agents to the proxy is large enough, or after the proxy reconnects to the server, this caps the speed at which data is sent from proxy to server.
                If someone has a problem with the data flow from proxy to server, and the main database runs on a sufficiently capable DB backend, it is possible to increase ZBX_MAX_HRECORDS.
                For example, I'm using a 50k limit. Too large a ZBX_MAX_HRECORDS value may cause the DB engine to choke. When increasing ZBX_MAX_HRECORDS, it is sometimes also good to increase HistoryCacheSize on the server side.
                Another way to increase the volume of data flowing from proxies to the server is to increase the frequency of communication between server and proxy (it is changed on the server or the proxy side, depending on whether the proxy is passive or active).

                PS. IIRC the Zabbix dev team is considering making this limit a runtime-configurable proxy parameter. For now, the only way to change it is to edit the source code and recompile.
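
To get a feel for what this limit means, the sketch below estimates the proxy-to-server ceiling and how long a backlog takes to drain. It uses a simplified model that is an assumption, not something taken from the Zabbix source: at most ZBX_MAX_HRECORDS values go out per cycle, and a cycle starts every `cycle_seconds` (a hypothetical parameter standing in for the data-exchange interval). The real sender may behave differently, so treat the numbers as rough.

```python
import math


def max_values_per_second(max_hrecords: int, cycle_seconds: float) -> float:
    """Upper bound on values/s if each cycle carries at most max_hrecords."""
    return max_hrecords / cycle_seconds


def backlog_drain_seconds(backlog: int, incoming_vps: float,
                          max_hrecords: int, cycle_seconds: float) -> float:
    """Time to clear a backlog while new values keep arriving.

    Returns math.inf when the incoming rate meets or exceeds the ceiling,
    because the backlog then never shrinks.
    """
    spare = max_values_per_second(max_hrecords, cycle_seconds) - incoming_vps
    if spare <= 0:
        return math.inf
    return backlog / spare


if __name__ == "__main__":
    # Hypothetical proxy: 400 values/s incoming, 600,000 values queued
    # after an outage, one send cycle per second (all assumed numbers).
    for limit in (1000, 50000):
        print(limit, backlog_drain_seconds(600_000, 400, limit, 1.0))
```

Under these assumed numbers the default limit of 1000 needs 1000 seconds to catch up, while a proxy receiving 1000 values/s or more would never catch up at all.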
                Last edited by kloczek; 20-01-2017, 19:41.
                http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
                https://kloczek.wordpress.com/
                zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
                My zabbix templates https://github.com/kloczek/zabbix-templates

                • Pada
                  Senior Member
                  • Apr 2012
                  • 236

                  #9
                  Thank you kloczek! I wasn't aware of that limitation.

                  • kanchan
                    Junior Member
                    • Dec 2013
                    • 23

                    #10
                    How to fine tune Zabbix Server and Proxy Configuration for Large deployments

                    Thanks kloczek for sharing this information. This is very helpful.

                    It would be great if you could help me understand how to tune the Zabbix server and proxy configuration. I have followed all the guidelines in the Zabbix performance tuning guide, and I am capturing Zabbix server performance metrics as well. But it is sometimes very confusing to choose values for the server and proxy configuration. Is there any logical method or algorithm for setting these values?

                    My current configuration is as follows:

                    Number of hosts - 2151
                    Number of items - 175260
                    Number of triggers - 105760
                    NVpS - 1392.56

                    Number of proxies - 18


                    #######################################

                    Zabbix Server configuration file :

                    [root@Zabbix-Server ~]# cat /etc/zabbix/zabbix_server.conf | grep ^[^#]
                    NodeID=2
                    ListenPort=10051
                    LogFile=/var/log/zabbix/zabbix_server.log
                    LogFileSize=150
                    DebugLevel=3
                    DBName=zabbix
                    DBUser=root
                    DBPassword=password
                    StartPollers=40
                    StartPollersUnreachable=10
                    StartTrappers=80
                    StartPingers=15
                    StartHTTPPollers=15
                    CacheSize=512M
                    CacheUpdateFrequency=1800
                    HistoryCacheSize=256M
                    TrendCacheSize=256M
                    Timeout=30
                    AlertScriptsPath=/etc/zabbix/alertscripts
                    ExternalScripts=/etc/zabbix/externalscripts
                    FpingLocation=/usr/sbin/fping
                    LogSlowQueries=10000
                    StartProxyPollers=0
                    [root@Zabbix-Server ~]#


                    #####################################


                    Zabbix proxy configuration file :

                    [root@Zabbix-Proxy01 ~]# cat /etc/zabbix/zabbix_proxy.conf | grep ^[^#]
                    ProxyOfflineBuffer=2
                    ConfigFrequency=300
                    StartPollers=10
                    StartPollersUnreachable=5
                    StartTrappers=10
                    StartPingers=5
                    StartHTTPPollers=5
                    CacheSize=256M
                    HistoryCacheSize=256M
                    Timeout=30
                    ExternalScripts=/etc/zabbix/externalscripts
                    LogSlowQueries=100
                    [root@Zabbix-Proxy01 ~]#

                    ############################
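
The `grep ^[^#]` filter used above shows only the active settings. The same filtering can be done in a script, which makes it easier to compare server and proxy values side by side. A minimal sketch, assuming the usual `Key=Value` layout of these files; `parse_zabbix_conf` and `diff_settings` are hypothetical helpers, not part of Zabbix.

```python
def parse_zabbix_conf(text: str) -> dict:
    """Return active Key=Value settings, skipping blank lines and # comments.

    If a key is repeated, the last value wins (a simplification).
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(a: dict, b: dict) -> dict:
    """Keys present in both configs whose values differ."""
    return {k: (a[k], b[k]) for k in a.keys() & b.keys() if a[k] != b[k]}


if __name__ == "__main__":
    # Two values copied from the listings above, plus a comment line.
    server = parse_zabbix_conf("StartPollers=40\nTimeout=30\n")
    proxy = parse_zabbix_conf("# comment\nStartPollers=10\nTimeout=30\n")
    print(diff_settings(server, proxy))  # {'StartPollers': ('40', '10')}
```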

                    Thanks
                    Last edited by kanchan; 24-01-2017, 08:07.

                    • kloczek
                      Senior Member
                      • Jun 2006
                      • 1771

                      #11
                      Originally posted by kanchan
                      Thanks kloczek for sharing this information. This is very helpful.

                      It would be great if you could help me understand how to tune the Zabbix server and proxy configuration. I have followed all the guidelines in the Zabbix performance tuning guide, and I am capturing Zabbix server performance metrics as well. But it is sometimes very confusing to choose values for the server and proxy configuration. Is there any logical method or algorithm for setting these values?

                      My current configuration is as follows:

                      Number of hosts - 2151
                      Number of items - 175260
                      Number of triggers - 105760
                      NVpS - 1392.56
                      First of all: on the server, the number of hosts does not matter. Only the number of items and the NVPS matter, together with the ratio number_of_triggers/number_of_items.
                      Why?
                      If host monitoring is cleanly separated from the server, so that all of it is done through proxies, the logical consequence is that, for example, StartPollersUnreachable can be lowered to 1, because no unreachable host will be checked by the server itself. The same goes for StartPingers.
                      Simply put, none of these checks are done by the server; all of them are done by the proxies.
                      The number_of_triggers/number_of_items ratio is important because it indicates how fast new values must be evaluated against trigger definitions. This task is done ONLY on the server.

                      Depending on the kind of proxy setup you use, you must tune pollers (they handle the flow of data from passive proxies) and trappers (active proxies).
                      Usually the number of pollers should be no bigger than the number of passive proxies, and the number of trappers can be even less than half the number of active proxies (yes, an active agent and proxy setup requires significantly fewer resources than a passive one).
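
The rules of thumb above can be written down as a small calculation. This is only a sketch of the heuristics stated in this post, not official Zabbix sizing guidance; the function and key names are made up for illustration, and the example assumes all 18 proxies from the quoted setup are active.

```python
import math


def trigger_item_ratio(triggers: int, items: int) -> float:
    """Ratio that hints at how much trigger evaluation the server must do."""
    return triggers / items


def suggest_server_processes(passive_proxies: int, active_proxies: int) -> dict:
    """Upper bounds following the rules of thumb in the post."""
    return {
        # "no bigger than the number of passive proxies"
        "pollers_for_passive_proxies_max": passive_proxies,
        # "even less than half the number of active proxies"
        "trappers_for_active_proxies_max": math.ceil(active_proxies / 2),
    }


if __name__ == "__main__":
    # Numbers from the quoted setup: 105760 triggers, 175260 items, 18 proxies.
    print(round(trigger_item_ratio(105760, 175260), 2))  # about 0.6
    print(suggest_server_processes(passive_proxies=0, active_proxies=18))
```

For that setup the sketch gives at most 9 trappers for proxy traffic, far below the configured StartTrappers=80, although trappers may also serve other traffic that this sketch does not model.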

                      On proxies, remember that SNMP, IPMI, and JMX work like passive monitoring, so the number of pollers must be scaled to how many SNMP, IPMI, and JMX items are monitored through each proxy.

                      The current Zabbix server is really strong at evaluating triggers, and in the biggest installations Zabbix can be scaled even to a few million evaluations/s.
                      It scales well (almost linearly) with the number of CPUs in the server hardware.

                      Your numbers for items, triggers, and NVPS are not big, so I would guess that your potential problems are related not to the Zabbix settings per se but to the server DB backend setup.
                      Really, listing settings without saying more about what kind of problems you are observing is not enough to say more.
                      I can bet that sticking with Zabbix 2.0.x may be the biggest issue in your case.
                      In the meantime, the Zabbix developers have done a very good job of improving resource consumption and performance, and by upgrading to the latest stable release you can expect improvements in many aspects by a factor of a few, if not more.

                      Try to move to 3.2.x ASAP. Really, discussing any issues on top of Zabbix 2.0.x today is a bit pointless, or may look a bit like talking about Zabbix archaeology
                      Last edited by kloczek; 24-01-2017, 23:19.

                      • kanchan
                        Junior Member
                        • Dec 2013
                        • 23

                        #12
                        Delay in updating Latest Data the first time through an active proxy

                        Thanks kloczek for these details. Really appreciate it.

                        We are using active proxies. I have recompiled all proxies as per your suggestion and am now testing the performance.

                        Yes, we are thinking about a version upgrade, but it will take some time to implement.

                        In the current setup we face only one issue: if I add a new host through a proxy, the first update of its values under Latest Data takes at least 30 minutes. Once it has been updated the first time, subsequent updates arrive without any issue.

                        But I am unable to understand why the first update under Latest Data takes so long. It happens for all types of items: active, passive, trapper...

                        It would be great if you could help me understand this.

                        • kloczek
                          Senior Member
                          • Jun 2006
                          • 1771

                          #13
                          Originally posted by kanchan
                          In the current setup we face only one issue: if I add a new host through a proxy, the first update of its values under Latest Data takes at least 30 minutes. Once it has been updated the first time, subsequent updates arrive without any issue.

                          But I am unable to understand why the first update under Latest Data takes so long. It happens for all types of items: active, passive, trapper...

                          It would be great if you could help me understand this.
                          For a passive proxy, the server config has ProxyConfigFrequency. For an active proxy, the proxy config has ConfigFrequency. In both cases the default is 3600s (1h).
                          I suppose you have 0.5h here.
                          Until the proxy has updated configuration that includes the new host, it drops all data from hosts or items that do not exist in its copy of the configuration, even if on the server side the host already exists or has a new/updated set of items.
                          Even in large Zabbix environments I set those parameters to 1 min (or less). Since the server does not need to query the DB backend before sending updated configuration to the proxy (it uses its in-memory configuration data to form the reply), lowering the configuration update interval usually does not hurt.

                          • kanchan
                            Junior Member
                            • Dec 2013
                            • 23

                            #14
                            Thanks for your reply.

                            Well, in my configuration ConfigFrequency=300 sec.

                            I performed the following test:

                            1. I added a new host through an active proxy in the Zabbix server.
                            2. I checked for the host and item entries in the Zabbix proxy MySQL database. They were there.
                            3. I sent dummy data via this proxy to a trapper item of the newly added host.
                            4. I could see the data in the Zabbix proxy MySQL DB.
                            5. I checked Latest Data for this entry. It took more than 30 min to appear on the dashboard.

                            So there is no issue from agent to proxy. But it takes a long time to replicate from proxy to server the FIRST time.
                            If ConfigFrequency is the only relevant parameter for proxy-to-server communication, then the proxy should take at most 5 min to send data to the server the first time...

                            So I am really not able to understand what exactly it does for 30 min.

                            There is nothing in the proxy and server logs...
                            Attached Files
                            Last edited by kanchan; 14-02-2017, 08:03.

                            • kloczek
                              Senior Member
                              • Jun 2006
                              • 1771

                              #15
                              Try this experiment/test:
                              - run "tail -f zabbix_proxy.log", extracting only the lines notifying that the proxy received configuration data from the server. Each of those lines ends with the size of the configuration package in bytes
                              - make any configuration change on one of the hosts behind the observed proxy. For example, add a new item with a new key (even a random key name)
                              - watch the zabbix_proxy.log entries to see how quickly the change in configuration data size propagates to the proxy.

                              I remember that on older Zabbix versions I saw, under some strange/unknown conditions, problems/congestion in passing configuration data to the proxy.
                              If you see something like this, you will probably have even more reason to upgrade server+proxies ASAP (agents can wait)
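
The filtering step of this experiment could be scripted roughly as follows. This is a sketch under an assumption: the exact log wording varies between Zabbix versions, so the pattern only relies on what the post describes, namely a line saying that configuration data was received from the server and ending with the size in bytes. Adjust the pattern to whatever your proxy actually logs. The function names are made up for illustration.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumed message shape: mentions configuration data received from the
# server and ends with the payload size in bytes.
CONFIG_LINE = re.compile(
    r"received configuration data from server.*?(\d+)\s*$", re.IGNORECASE
)


def config_sizes(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (line, size_in_bytes) for each config-sync log line."""
    for line in lines:
        match = CONFIG_LINE.search(line)
        if match:
            yield line.rstrip("\n"), int(match.group(1))


def size_changes(sizes: Iterable[int]) -> Iterator[Tuple[int, int]]:
    """Yield (previous, current) each time the reported size changes."""
    previous = None
    for size in sizes:
        if previous is not None and size != previous:
            yield previous, size
        previous = size
```

One way to use it is to pipe `tail -f zabbix_proxy.log` into a small script that loops over `config_sizes(sys.stdin)` and prints each size with its timestamped line. The time between making the configuration change and the first line with a different size is the propagation delay being measured.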