Proxy data collection when remote network is offline

  • clarkritchie
    Member
    • Aug 2013
    • 46

    #1

    How resilient is Zabbix's proxy architecture? I think the answer is very, and I am interested in any anecdotal comments about this.

    Our use case is that we need to monitor multiple large but unrelated networks in remote locations around the globe. These networks have intermittent Internet connectivity, sometimes due to issues with diesel generators, solar setups, etc.

    I noticed a 4-5 minute outage on one of our networks in northern Uganda yesterday (I happened to be logged in when the network dropped), and I don't see any loss of data (which is great!). SNMP data was being collected every 30 seconds.

    In that example, Zabbix's proxy seems like it was highly effective at caching and forwarding data to the Zabbix server when the proxy's network came back online.

    Is that typical behavior? I hope so. It seems that given a large amount of storage on the proxy, we should be able to withstand a significant outage.

    Any comments appreciated.
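
To put a rough number on how much a proxy has to hold during an outage, here is a minimal sketch. It assumes every item is polled at a fixed interval and that each poll yields one buffered value. The function name `buffered_values` and the item count of 1,000 are hypothetical; only the 30-second interval and the roughly 5-minute outage come from the post above. How long a real proxy keeps unsent data is also bounded by its `ProxyOfflineBuffer` setting (in hours), not only by disk space.

```python
import math


def buffered_values(items: int, interval_s: float, outage_s: float) -> int:
    """Estimate how many values queue up on a proxy during an outage.

    Assumes one value per item per polling interval (a simplification).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    polls_per_item = math.floor(outage_s / interval_s)
    return items * polls_per_item


# Hypothetical load: 1,000 SNMP items polled every 30 s, 5-minute outage.
print(buffered_values(items=1000, interval_s=30, outage_s=5 * 60))
```

Scaling `outage_s` up to hours or days gives a feel for how quickly the backlog grows on a larger network.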
  • tchjts1
    Senior Member
    • May 2008
    • 1605

    #2
    In my opinion, it's a double-edged sword. It is great for brief outages, but maybe not so great for long ones, unless they have made improvements since.

    My real-world case was monitoring ~500 servers when the Zabbix app server went down for several hours. I had the proxies set to hold several hours' worth of data in case this happened. When everything came back up, the proxies started sending their data, but what I found was that it took 1 hour to get 2 hours' worth of data from the proxies into Zabbix.

    So if the Zabbix app or DB server went down for 12 hours, you were looking at 6 hours to reach the point where they came back up, plus another 6 hours to work through the data collected since then and reach the current point in time.

    And this was with very high-end, powerful hardware.

    This was back in the early 1.8.x releases.

    At any rate, I now only set my proxies to hold 1 hour's worth of data.
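
The 12-hour arithmetic above can be checked with a small model. This is a sketch under stated assumptions, not a description of how Zabbix schedules its work: the server is assumed to process a constant `drain_ratio` hours of data per wall-clock hour (2.0 in the case above) while new data keeps arriving in real time. The function name `catch_up_hours` is hypothetical.

```python
def catch_up_hours(outage_h: float, drain_ratio: float) -> tuple[float, float]:
    """Return (hours to reach the outage end, hours to reach real time).

    drain_ratio is hours of data processed per wall-clock hour. It must
    exceed 1, otherwise the server never catches up with incoming data.
    """
    if drain_ratio <= 1:
        raise ValueError("drain_ratio must be greater than 1 to ever catch up")
    # Time to work through only the data collected during the outage.
    to_outage_end = outage_h / drain_ratio
    # New data arrives at 1 hour per hour, so the backlog shrinks at
    # (drain_ratio - 1) hours per wall-clock hour.
    to_real_time = outage_h / (drain_ratio - 1)
    return to_outage_end, to_real_time


# 12-hour outage, 2 hours of data processed per hour.
print(catch_up_hours(outage_h=12, drain_ratio=2.0))
```

With a ratio of 2, a 12-hour outage gives 6 hours to reach the outage end and 12 hours in total to reach real time, which matches the 6 + 6 hours described above. Under the same model, a 1-hour buffer caps the catch-up at about 1 hour.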

    • Alexei
      Founder, CEO
      Zabbix Certified Trainer
      Zabbix Certified Specialist
      Zabbix Certified Professional
      • Sep 2004
      • 5654

      #3
      The recovery process delays real-time monitoring since Zabbix has to process older data first. I think that the minimum buffering period of 1 hour is too high. We should allow lower values, like 5 minutes or so, for speedy recovery.
      Alexei Vladishev
      Creator of Zabbix, Product manager
      New York | Tokyo | Riga

      • gospodin.horoshiy
        Senior Member
        • Sep 2008
        • 272

        #4
        Originally posted by Alexei
        The recovery process delays real-time monitoring since Zabbix has to process older data first. I think that the minimum buffering period of 1 hour is too high. We should allow lower values, like 5 minutes or so, for speedy recovery.
        Could this behaviour change in the future?

        For example, a separate process that does the history syncing in the foreground, so that it doesn't prevent the latest data from coming in...
        Zbx 2.0.4 on Debian and MYSQL5 on Ubuntu Server 64bit 8.04,
        200+ Win Agents, 50+ Linux Agents, 150+ Network Devices

        • nick0909
          Member
          • Apr 2013
          • 73

          #5
          For our use, the historical data is as important as the current alert information, so I have our proxies set to save up to 2 days' worth. We don't want to lose that history, and I am willing to put up with a day's delay ingesting all the archived data if that is what it takes.

          If you don't want to pay that penalty, then just set your proxies to save only a few hours of data, and the ingest time needed to catch up will be much shorter. This value is configurable for that reason.

          You can watch a graph of one of the remote systems to see it draw the lines and catch up to real time, which gives an estimate of how fast it is going. In reality, when we have had 6-12 hour issues, it only takes about 20 minutes to ingest all the data on our Zabbix 2.4 system (~350 hosts, 200 NVPS). We just upgraded to new hardware and Zabbix 3.2 but haven't had to test the proxy network on that yet, so I don't know if it is improved or not.
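
The figures above can be turned into an implied processing speed. This is only a back-of-the-envelope sketch: it assumes "ingest all the data" means the graphs were back at real time after about 20 minutes, and that data kept arriving at the normal rate during the catch-up. The function name `implied_drain_ratio` is hypothetical.

```python
def implied_drain_ratio(backlog_h: float, catch_up_min: float) -> float:
    """Hours of data processed per wall-clock hour during recovery.

    Assumes the backlog reached real time after catch_up_min minutes,
    with new data still arriving at the normal rate in the meantime.
    """
    if catch_up_min <= 0:
        raise ValueError("catch_up_min must be positive")
    catch_up_h = catch_up_min / 60
    # Total data processed = the backlog plus what arrived while catching up.
    return (backlog_h + catch_up_h) / catch_up_h


# The 6-hour and 12-hour cases, each recovered in about 20 minutes.
for backlog in (6, 12):
    print(backlog, round(implied_drain_ratio(backlog, 20), 1))
```

Under those assumptions the server was working through roughly 19 to 37 hours of data per hour, far faster than the 2:1 ratio reported for the 1.8.x setup earlier in the thread.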

          • jameskirsop
            Member
            • Jul 2018
            • 32

            #6
            Originally posted by tchjts1
            My real-world case was monitoring ~500 servers when the Zabbix app server went down for several hours. I had the proxies set to hold several hours' worth of data in case this happened. When everything came back up, the proxies started sending their data, but what I found was that it took 1 hour to get 2 hours' worth of data from the proxies into Zabbix.
            Do you remember what database engine your proxies were using at the time? I'm wondering if the recovery time on the server is related to how quickly the proxies can pull the data from their buffer, and if Postgres/MySQL are more performant at that than SQLite...

            • tchjts1
              Senior Member
              • May 2008
              • 1605

              #7
              Originally posted by jameskirsop

              Do you remember what database engine your proxies were using at the time? I'm wondering if the recovery time on the server is related to how quickly the proxies can pull the data from their buffer, and if Postgres/MySQL are more performant at that than SQLite...
              Yeah, we were using MySQL, and all Zabbix servers are on VMs.
