Zabbix 3 tiered Configuration challenges

  • five0va
    Junior Member
    • Mar 2015
    • 26

    #1

    Hello all! I'm doing something a tad different. I have 3 proxies behind Keepalived, with the idea that I'll be moving this to a horizontal scaling model and using Docker. Leading up to this, I have them set up in VMware with an agent pointing to the VRRP address (DNS resolution is working). I first tried setting this up with active checks only, but the only data I could ever get from the end agent was the agent version and the hostname of the server; no other data was sent. So I went back to a passive setup to get data.
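For readers comparing the two modes mentioned above, here is a minimal sketch of how the agent-side settings differ. `Hostname`, `Server`, `ServerActive` and `StartAgents` are standard `zabbix_agentd.conf` parameters; the host name `app-01` and the VIP name `zbx-prx-vip.example.com` are placeholders, not values from this setup, and the helper function itself is purely illustrative.

```python
def render_agent_conf(hostname: str, proxy_address: str, mode: str = "passive") -> str:
    """Render a minimal zabbix_agentd.conf fragment for one check mode.

    passive: the proxy connects to the agent; Server= lists who may connect.
    active:  the agent connects out; ServerActive= says where to fetch checks
             and send data. StartAgents=0 turns the passive listener off.
    """
    if mode not in ("passive", "active"):
        raise ValueError(f"unknown mode: {mode!r}")

    # Hostname must match the host name configured in the Zabbix frontend.
    lines = [f"Hostname={hostname}"]
    if mode == "passive":
        lines.append(f"Server={proxy_address}")
    else:
        lines.append(f"ServerActive={proxy_address}")
        lines.append("StartAgents=0")
    return "\n".join(lines) + "\n"
```

Calling `render_agent_conf("app-01", "zbx-prx-vip.example.com", mode="active")` yields the active-only variant. In active mode the proxy that receives the connection has to recognise the `Hostname` value, which is worth keeping in mind when the address in `ServerActive` is a shared VIP.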

    My setup is as follows: a Galera DB cluster, 2 Zabbix app servers, 3 proxies and a single agent for right now, while I try to get this working properly with auto-registration.

    I've been going through config changes for the past week, but nothing seems to hit the sweet spot. I can't get host auto-registration to work either (it worked just fine in our old agent-to-server configuration). I hope someone can lend a fresh set of eyes on this for me. One of the issues right now is that the server is reporting prx-03's agent as unreachable, but I source-check all my configs and push everything using ClusterShell (clush), so the agent configs are exactly the same, and prx-01 and prx-02 are not being reported as down. Also note that the agents on the proxies report directly to the servers.

    I am documenting this really well and will be putting this into a forum post... once the kinks are worked out. Thank you in advance!
    Attached Files
  • jan.garaj
    Senior Member
    Zabbix Certified Specialist
    • Jan 2010
    • 506

    #2
    I'm lost. Can you draw a picture of your setup, please? A network diagram would also be useful.

    Keep in mind that there is no proper way to scale the Zabbix server (zabbix-server != zabbix web) horizontally. I'm curious how you solved it.
    Devops Monitoring Expert advice: Dockerize/automate/monitor all the things.
    My DevOps stack: Docker / Kubernetes / Mesos / ECS / Terraform / Elasticsearch / Zabbix / Grafana / Puppet / Ansible / Vagrant

    • five0va
      Junior Member
      • Mar 2015
      • 26

      #3
      Here's a diagram of what we have going on. To get around this issue, we're using lsyncd to keep the following directories in sync: /etc/httpd, /etc/zabbix, /etc/clustershell and /usr/share/zabbix. The lsyncd part works REALLY well. Keepalived is working really well too and handles the VRRP address (VIP) between the two servers (and the web UI). We have not broken out the UI (yet) as we really see no need for it, but we can do this at any time in the future, as this infrastructure lends itself quite nicely to plug and play. The server talks back to a Galera (MariaDB) cluster of 3 nodes (currently running on VMs, but to be converted to physical in about a month), and we are seeing really great performance out of this setup, although, I must confess, I just started getting this going.

      Further up the stack are the proxies, again relying on Keepalived to keep everything behind a VRRP address. This link has some really great info on what Keepalived is all about, but it really is quite straightforward and lends itself very well to the scaling that we're looking for. I'll apply lsyncd to these nodes once I get to a rock-solid config, but right now I'm using ClusterShell to "clush" the configs into place and keep them the same.

      I figured out the prx-03 issue: somehow I missed opening port 10050 on that proxy. But auto-registration still isn't working correctly, and the agent just complains that it is unknown to the server.
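A closed agent port like the one on prx-03 is quick to rule out before digging into configs. The sketch below is a plain TCP connect test using only the Python standard library; it is not a Zabbix tool, and it only tells you whether something accepts connections on the port (10050 is the default passive agent port), not whether the agent will answer the server.

```python
import socket


def port_is_open(host: str, port: int = 10050, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_hosts(hosts, port: int = 10050, timeout: float = 2.0) -> dict:
    """Map each host name to the result of the TCP connect test."""
    return {host: port_is_open(host, port, timeout) for host in hosts}
```

Run from one of the Zabbix servers, `check_hosts(["prx-01", "prx-02", "prx-03"])` would show at a glance which proxy is missing a firewall rule. The host names are taken from this thread and would need to resolve from wherever the check runs.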
      Attached Files

      • jan.garaj
        Senior Member
        Zabbix Certified Specialist
        • Jan 2010
        • 506

        #4
        I think that you can't balance proxies.
        Zabbix has a fixed relation: an agent is monitored by one specific proxy, so only that proxy has information about the agent in its config cache. If a metric value is received by another proxy (because of load balancing, in your case), that proxy is not able to find the agent in its config cache and the metric value is discarded. Also, the IP of the agent should be verified on the Zabbix proxy/server side; the IP of your load balancer will be there instead, which again can be a problem for Zabbix.

        Please dig into the Zabbix source code to verify my idea. I also recommend enabling the debug log level and watching your log files.
        Again: it's not easy to scale Zabbix horizontally; IMO it's impossible. But do you really need it? How many NVPS do you need to achieve?
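To make the config-cache argument above concrete, here is a toy model of it. This is not Zabbix code and does not reproduce Zabbix internals; it only illustrates the described behaviour, assuming each proxy accepts values solely for hosts it knows about and that a balancer spreads incoming values round-robin. The class, the host name `app-01` and the sample values are all made up for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyProxy:
    name: str
    known_hosts: set = field(default_factory=set)  # stand-in for the config cache
    accepted: list = field(default_factory=list)

    def receive(self, host: str, key: str, value: str) -> bool:
        """Accept the value only if the sending host is in this proxy's cache."""
        if host not in self.known_hosts:
            return False  # unknown host: the value is dropped
        self.accepted.append((host, key, value))
        return True


def round_robin(proxies, samples):
    """Send each sample to the next proxy in turn, as a naive balancer would."""
    results = []
    for i, (host, key, value) in enumerate(samples):
        proxy = proxies[i % len(proxies)]
        results.append((proxy.name, proxy.receive(host, key, value)))
    return results


def demo():
    # Only prx-01 has been told about app-01.
    proxies = [ToyProxy("prx-01", {"app-01"}), ToyProxy("prx-02"), ToyProxy("prx-03")]
    samples = [("app-01", "system.cpu.load", "0.42")] * 3
    return round_robin(proxies, samples)
```

In this model `demo()` returns `[("prx-01", True), ("prx-02", False), ("prx-03", False)]`: two out of three values are lost simply because they landed on a proxy that does not know the host.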
        • five0va
          Junior Member
          • Mar 2015
          • 26

          #5
          It's not really an issue of pushing Zabbix to its limits, but of proving that it can be done. The failover between proxies does in fact work: proxy 3 goes down and proxy 1 or 2 will pick up the load just fine. I must confess I still need to play around with how I have Keepalived set up, but it is in fact working! On the server side, I did something interesting: I identified the "proxy" (I'll use it with quotes instead of "cluster of proxies") by calling it apsd-zbx-prx, instead of calling out each one individually. I noticed that I also needed to set Source IP in the proxy config to the VRRP address to get the proper response from the server. Really, the only thing NOT working is auto-registration. If we can find out how to achieve this, then I'll have a proper load balance of the proxies up and running in short order, fully documented and everything... I'll even be publishing Docker images as time permits, as this is our way forward on the proxy side.
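Since this approach depends on every proxy presenting the same name and source address, a quick consistency check on the pushed configs can catch drift. The sketch below is only an illustration: it parses `KEY=VALUE` text in the style of `zabbix_proxy.conf` and compares `Hostname` and `SourceIP` (both standard proxy parameters) across nodes. The helper names are made up, and any addresses you feed it are your own.

```python
def parse_conf(text: str) -> dict:
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def identity_mismatches(confs: dict, keys=("Hostname", "SourceIP")) -> dict:
    """Return {key: {node: value}} for every key whose value differs across nodes.

    confs maps a node name (e.g. "prx-01") to the text of its proxy config.
    A missing key shows up as None for that node.
    """
    mismatches = {}
    for key in keys:
        values = {node: parse_conf(text).get(key) for node, text in confs.items()}
        if len(set(values.values())) > 1:
            mismatches[key] = values
    return mismatches
```

An empty result means all nodes agree on the checked keys. A non-empty one names the key and shows each node's value, for example a proxy whose `SourceIP` still points at its own address instead of the VRRP address.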

          • jan.garaj
            Senior Member
            Zabbix Certified Specialist
            • Jan 2010
            • 506

            #6
            five0va, any update / success?
            • five0va
              Junior Member
              • Mar 2015
              • 26

              #7
              We were able to get all this working... still no auto-registration, though. I have a post in "Zabbix for Large Environments" that goes over how we set this up. It also has a link to our configs.
