Distributed monitoring with proxies and servers

  • michaill
    Junior Member
    • May 2014
    • 18

    #1

    We'll be monitoring four data centers with Zabbix. We'd like to see the status of all four DCs on one machine, so we'll have proxies in three DCs send data to a Zabbix server in the fourth (central) DC. However, if the network link to the central DC goes down, which occasionally happens, monitoring data will be lost. What should we do? We are thinking of installing a Zabbix server alongside the Zabbix proxy in each of the three DCs, so each agent would send data both to the server in its own DC and to the proxy. This would more or less replicate the functionality of the recently removed distributed monitoring feature.

    Does this setup make sense? Are there better alternatives?

    Thanks,
    Mike
  • ingus.vilnis
    Senior Member
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Mar 2014
    • 908

    #2
    Hi Mike,

    One of the great features of Zabbix proxies is that they can buffer the collected data for up to 720 hours if connectivity to the Zabbix server is lost. The data will not be lost; you just won't see it until the network comes back up.

    So your initial concept makes perfect sense. I don't think you need to install a server in each DC, because you would not be able to manage them all centrally. And if connectivity is lost, you theoretically wouldn't be able to connect to them anyway.



    Best Regards,
    Ingus

    • maxxer
      Member
      • Oct 2010
      • 80

      #3
      Indeed, check the ProxyOfflineBuffer option in the proxy configuration file.

      • tchjts1
        Senior Member
        • May 2008
        • 1605

        #4
        Originally posted by ingus.vilnis
        they can buffer the collected data for up to 720 hours
        Exercise caution with this, though. My real-life experience a few years ago on 1.8, when our network dropped for several hours, was that it took 1 hour to load 2 hours' worth of buffered data from the proxy. So if your network was down for 24 hours, you were looking at 12 hours to get all that missed data. On top of that, you still didn't have the 12 hours' worth of data that accumulated while you were retrieving the buffered data. It is a slow catch-up process.

        At least it was in 1.8. I don't know whether this has improved in the 2.x releases, but I keep my buffer set to 2 hours at most.
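
The catch-up arithmetic in this post can be sketched as follows. This is a rough model, not a measurement of Zabbix behaviour: it assumes the proxy replays buffered data at a constant multiple of real time (2x, taken from the "1 hour to load 2 hours" figure) while new data keeps arriving at the normal rate. The function names and parameters are hypothetical.

```python
def first_pass_hours(outage_hours: float, replay_ratio: float = 2.0) -> float:
    """Hours needed to deliver only the data buffered during the outage."""
    return outage_hours / replay_ratio


def catch_up_hours(outage_hours: float, replay_ratio: float = 2.0) -> float:
    """Estimate hours until the server is fully caught up.

    replay_ratio is hours of buffered data delivered per wall-clock hour
    (assumed constant; 2.0 matches the 1.8 experience described here).
    """
    if replay_ratio <= 1.0:
        raise ValueError("backlog never drains unless replay_ratio > 1")
    # While replaying, new data arrives at 1 hour per hour, so the backlog
    # shrinks by (replay_ratio - 1) hours of data per wall-clock hour.
    return outage_hours / (replay_ratio - 1.0)


print(first_pass_hours(24))  # 12.0
print(catch_up_hours(24))    # 24.0
```

Under these assumptions, a 24-hour outage needs 12 hours for the first pass and about 24 hours in total (12 + 6 + 3 + ...), which is the reasoning behind keeping the buffer short.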

        • ingus.vilnis
          Senior Member
          Zabbix Certified Trainer
          Zabbix Certified Specialist
          Zabbix Certified Professional
          • Mar 2014
          • 908

          #5
          That's an interesting experience. Thank you for sharing these tips. Roughly what NVPS (new values per second) was your proxy collecting when it happened?

          I need to do some testing, but I tend to believe it is faster in current versions. But agreed, the theoretical maximum of 720 hours should not be treated as the recommended setting for this environment.

          Best Regards,
          Ingus

          • tchjts1
            Senior Member
            • May 2008
            • 1605

            #6
            Yeah, I knew you weren't suggesting setting it to 720. You were just pointing out the maximum possible.

            The segment of our network that went down was where our Zabbix app and DB servers were installed, so all hosts were affected. We had a dozen proxies and, at that time, probably around 1,000 hosts, and the NVPS would have been around 800 or so.

            So if you lose the connection only to a proxy with around 20 hosts, the buffer size is no big deal, but if you lose connectivity to several hundred hosts... ouch.
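
To get a feel for the volumes involved, the backlog can be estimated as NVPS multiplied by the outage duration. A minimal sketch, assuming the collection rate stays constant during the outage; the function name is hypothetical, and 800 NVPS is the approximate total figure mentioned in this post.

```python
def buffered_values(nvps: float, outage_hours: float) -> int:
    """Approximate number of values queued on the proxies during an outage."""
    return round(nvps * outage_hours * 3600)


print(buffered_values(800, 2))   # 5760000
print(buffered_values(800, 24))  # 69120000
```

At that rate, even a 2-hour buffer means several million values waiting to be sent once the link returns.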

            • michaill
              Junior Member
              • May 2014
              • 18

              #7
              Thanks everyone for your feedback!

              Mike
