Run Zabbix in HA cluster

  • olegus
    Member
    • Dec 2023
    • 68

    #1

    Run Zabbix in HA cluster

    I currently run Zabbix server in Docker, using Docker Compose with server, MySQL, web and agent services. It all runs on one Linux VM.
    If I understand correctly, in order to run Zabbix in an HA cluster I'd need:
    - at least two VMs with Zabbix server installation
    - one VM for a DB server.
    - one VM for a frontend (or combined with DB server VM) ?
    - change docker-compose to install only server part on HA nodes and only MySQL/Web on a DB/Web node
    - setup DB connections on server and Web to point to the separate DB server

    Does it sound about right?

    Also
    Docs here - https://www.zabbix.com/documentation...epts/server/ha suggest that all I need to do is just to add 2 more parameters
    HANodeName (unique node name) and NodeAddress (FQDN of the VM where the instance is installed), but I wonder how servers will know where their peers are located if there are no references in the config? Do they just discover each other on the same network and create a cluster that includes all discovered instances? Does it mean that I cannot have more than one cluster in the same network?


  • MRedbourne
    Senior Member
    • Feb 2023
    • 103

    #2
    Hey Mate,

    Hopefully this helps you. I'm not familiar with Docker, so I can't comment on that part. But hopefully this guidance is enough for you to extrapolate what to do with Docker.

    If I get it correctly, in order to run Zabbix in HA clusters I'd need:
    1. at least two VMs with Zabbix server installation
    2. one VM for a DB server.
    3. one VM for a frontend (or combined with DB server VM) ?
    4. change docker-compose to install only server part on HA nodes and only MySQL/Web on a DB/Web node
    5. setup DB connections on server and Web to point to the separate DB server
    1) Is universally true for any HA mode of any vendor/application.
    2) Correct. However, the DB can be combined with components as desired.
    3) Correct. See point #2 about combining the front end with other 'components' as desired.
    4) I don't know enough about docker to answer that.
    5) Yes. Bear in mind that your database in an HA setup needs to be a networked database (e.g. MySQL, PostgreSQL, MariaDB). It cannot be, for example, SQLite.

    There are some things to know about Zabbix.

    1) The most strenuous component is the database.
    2) You can separate all components of Zabbix, combine all components, or combine any combination of components.

    Our fully-HA setup is:
    • Database: Azure DB for MySQL (HA-enabled, Geo-redundant)
    • 'Active' VM: RHEL9 (64 bit) running Apache/PHP and Zabbix Server
    • 'Passive' VM: RHEL9 (64 bit) running Apache/PHP and Zabbix Server.
    With our 2 VMs, the Zabbix Servers are Active/Passive while the web servers are Active/Active. This removes the DB load from the server for us and provides us with *true* high availability. The key part here though is keeping the DB away from everything else. If you're not familiar with setting up redundant DBs (and you need HA functionality), I'd highly suggest looking at Azure or AWS for that functionality.

    Docs here - https://www.zabbix.com/documentation...epts/server/ha suggest that all I need to do is just to add 2 more parameters
    Yes, from memory you only need to change the two parameters in the documentation to enable HA.
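As a rough illustration of those two parameters, here is a minimal Python sketch that prints the lines each server node would carry. The node names and the config path are hypothetical, the hostnames are just examples, and the `address:port` form with port 10051 is an assumption to verify against the documentation for your Zabbix version.

```python
def ha_config_lines(node_name: str, node_address: str, port: int = 10051) -> list[str]:
    """Return the two zabbix_server.conf lines discussed above for one node."""
    if not node_name:
        # Without a node name there is nothing to identify the node in a cluster.
        raise ValueError("HANodeName must be non-empty")
    return [
        f"HANodeName={node_name}",                # must be unique per node
        f"NodeAddress={node_address}:{port}",     # where this node can be reached
    ]


# Hypothetical node names mapped to example hostnames.
nodes = {
    "zbx-node-01": "zabbix01.example.com",
    "zbx-node-02": "zabbix02.example.com",
}

for name, address in nodes.items():
    print(f"# on {address}, in /etc/zabbix/zabbix_server.conf (assumed path)")
    print("\n".join(ha_config_lines(name, address)))
```

Everything else in the server config, in particular the database connection settings, would be identical on both nodes, since they share one database.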

    but I wonder how servers will know where their peers are located if there are no references in config? Do they just discover each other on the same network create a cluster that includes all discovered instances?
    For us, name resolution is handled by DNS. Our VMs point their DNS requests to Active Directory Domain Services (Domain Controllers), which hold records for both servers. I'm not a Zabbix expert. I suspect (I'll go looking tomorrow) that both servers log "last_access" in the database, which is how they track which node is active and when each node was last seen.

    Does it mean that I cannot have more than one cluster in the same network?
    It shouldn't. They'd just need to be connected to a different DB.

    Edit: Fixed a small BB Code typo.

    • Semiadmin
      Senior Member
      • Oct 2014
      • 1625

      #3
      Originally posted by MRedbourne
      I suspect (I'll go looking tomorrow) that both servers log "last_access" in the database, which is how both servers track which node is active, and when the last time both were seen.
      Exactly. Each node writes information about itself into the ha_node table of the Zabbix database. Each node, whether active or standby, periodically reads data about its peers from this table.
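To make the idea concrete, here is a toy Python model of peers coordinating through a shared table. It is only a sketch of the general pattern, not Zabbix's actual implementation: the column set, the status names, the 60-second threshold and the takeover rule are all assumptions, and an in-memory SQLite database stands in for the shared MySQL/PostgreSQL database.

```python
import sqlite3

FAILOVER_DELAY = 60  # seconds without a heartbeat before a peer is considered gone (assumed)


def make_db() -> sqlite3.Connection:
    """Create the stand-in shared database with a simplified ha_node table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE ha_node ("
        "name TEXT PRIMARY KEY, address TEXT, lastaccess INTEGER, status TEXT)"
    )
    return db


def heartbeat(db: sqlite3.Connection, name: str, address: str, now: int) -> str:
    """Record that `name` is alive at time `now` and return its resulting role."""
    row = db.execute("SELECT 1 FROM ha_node WHERE name = ?", (name,)).fetchone()
    if row is None:
        # First contact: the node registers itself; no peer list is configured anywhere.
        db.execute(
            "INSERT INTO ha_node VALUES (?, ?, ?, 'standby')", (name, address, now)
        )
    else:
        db.execute(
            "UPDATE ha_node SET lastaccess = ?, address = ? WHERE name = ?",
            (now, address, name),
        )

    # Any active peer that has not checked in recently is marked unavailable.
    db.execute(
        "UPDATE ha_node SET status = 'unavailable' "
        "WHERE status = 'active' AND name != ? AND lastaccess < ?",
        (name, now - FAILOVER_DELAY),
    )

    # If no active node is left (or this node already is the active one), be active.
    active = db.execute("SELECT name FROM ha_node WHERE status = 'active'").fetchone()
    new_status = "active" if active is None or active[0] == name else "standby"
    db.execute("UPDATE ha_node SET status = ? WHERE name = ?", (new_status, name))
    return new_status


db = make_db()
print(heartbeat(db, "node1", "zabbix01.example.com", now=0))    # first node up
print(heartbeat(db, "node2", "zabbix02.example.com", now=5))    # second node joins
print(heartbeat(db, "node2", "zabbix02.example.com", now=100))  # node1 has been silent
```

The point of the model is that the database is the only rendezvous: two clusters on the same network stay separate simply because they use different databases. A real implementation also has to handle concurrent updates with proper locking, which this sketch ignores.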

      • olegus
        Member
        • Dec 2023
        • 68

        #4
        Thanks!
        I think servers, DB and frontend have to reside on separate nodes.
        If the DB is located on the Zabbix server node and this node is down, switching to the passive server won't help, as the DB is down too.
        And if we don't have a load balancer in front of the Zabbix frontend, it will be a problem to switch to another frontend instance, as the IP would be different.

        • MRedbourne
          Senior Member
          • Feb 2023
          • 103

          #5
          Originally posted by olegus
          I think servers, DB and frontend have to reside on separate nodes. If the DB is located on the Zabbix server node and this node is down, switching to the passive server won't help, as the DB is down too.
          Yes, of course. It's the reason why our DB runs on a completely separate node in Azure. We use a PaaS/DBaaS though for it.

          Originally posted by olegus
          And if we don't have a load balancer in front of the Zabbix frontend, it will be a problem to switch to another frontend instance, as the IP would be different.
          Part of that is true. We don't load balance our web front ends at all, they both operate at the same time, without issue. We haven't had to make any modifications to it either. You'll just have two access URLs you can hit instead of one. Internally our staff can hit zabbix01.example.com and zabbix02.example.com. Both work, both respond. Both serve SAML/SSO requests for authentication in Entra ID (Azure Active Directory). They just use different URLs. In our case, it removed the need to have another 3rd party load balancer, saving us money and complexity. Whether that is worth it to you (having a single access point vs. cost savings) is a decision to be made by you/your org.
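For scripts or bookmarks that should not care which frontend is up, the "two URLs, no load balancer" approach can be handled on the client side. The following Python sketch tries each frontend in order and returns the first one that answers; the `https://` scheme, the timeout and the helper names are assumptions made for illustration.

```python
import urllib.request
from typing import Callable, Iterable, Optional

# Example frontends; scheme and paths are assumed.
FRONTENDS = ["https://zabbix01.example.com", "https://zabbix02.example.com"]


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def first_reachable(
    urls: Iterable[str], probe: Callable[[str], bool] = http_ok
) -> Optional[str]:
    """Return the first URL for which `probe` succeeds, or None if all fail."""
    for url in urls:
        if probe(url):
            return url
    return None


# Demonstration with a fake probe, so no network access is needed:
down = {"https://zabbix01.example.com"}
print(first_reachable(FRONTENDS, probe=lambda u: u not in down))
```

This only covers clients you control. People using a browser still have to know both URLs, which is the trade-off against a load balancer or a shared virtual IP.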

          • olegus
            Member
            • Dec 2023
            • 68

            #6
            Originally posted by MRedbourne
            Yes, of course. It's the reason why our DB runs on a completely separate node in Azure. We use a PaaS/DBaaS though for it.


            Part of that is true. We don't load balance our web front ends at all, they both operate at the same time, without issue. We haven't had to make any modifications to it either. You'll just have two access URLs you can hit instead of one. Internally our staff can hit zabbix01.example.com and zabbix02.example.com. Both work, both respond. Both serve SAML/SSO requests for authentication in Entra ID (Azure Active Directory). They just use different URLs. In our case, it removed the need to have another 3rd party load balancer, saving us money and complexity. Whether that is worth it to you (having a single access point vs. cost savings) is a decision to be made by you/your org.
            So you run the frontends on the server machines? And if one VM is down you simply hit the other URL? Yeah, that should probably work OK. Or run the frontend in its own HA mode (like in k8s).

            Side question: how do you guys prefer to install Zabbix: Docker CLI, Compose or packages? I usually prefer the docker-compose way, but the official Zabbix compose files are so overengineered that I am going either to simplify them to my own taste (which is bad, as I then have to maintain them), or just to use the CLI, or even install from packages.

            • MRedbourne
              Senior Member
              • Feb 2023
              • 103

              #7
              Originally posted by olegus
              So you run the frontends on the server machines? And if one VM is down you simply hit the other URL? Yeah, that should probably work OK. Or run the frontend in its own HA mode (like in k8s).
              Yes. Whether that's the proper way of doing things, who knows. But our environment is relatively small. The overhead of the web server is small enough that it doesn't noticeably impact performance. At some point we will probably introduce a proper load balancer for the web front ends; it just so happened that we didn't need one when we initially deployed. Getting approval for a load balancer at the time wouldn't have been worth the headache, on top of the additional cost.

              Originally posted by olegus
              Side question: how do you guys prefer to install Zabbix: Docker CLI, Compose or packages? I usually prefer the docker-compose way, but the official Zabbix compose files are so overengineered that I am going either to simplify them to my own taste (which is bad, as I then have to maintain them), or just to use the CLI, or even install from packages.
              We went with packages on full (RHEL9) VMs. Zabbix configuration files are a handful sometimes. I slimmed down the agent configs and removed a lot of the comments and whatnot, keeping only what I actually needed. I also split my configs by role:
              • /etc/zabbix/zabbix_agent2.conf <== Primary Config
              • /etc/zabbix/zabbix_agent2.d/00-UserParams.conf
              • /etc/zabbix/zabbix_agent2.d/01-SystemRun.conf
              • /etc/zabbix/zabbix_agent2.d/02-Encryption.conf
              • /etc/zabbix/zabbix_agent2.d/03-Performance.conf

              • Brambo
                Senior Member
                • Jul 2023
                • 245

                #8
                olegus, I see there is an HA setup webinar on 25 January; it may be a good one to follow.
