Zabbix HA - Active / Active ?

  • mschlegel
    Member
    • Oct 2008
    • 40

    #1

    Zabbix HA - Active / Active ?

    I am building a large-scale Zabbix deployment, including distributed monitoring of remote sites. High availability is a requirement for the central node (node 1). Node 1 will also monitor a large number of local resources, which limits the usefulness of an active/passive high availability solution.

    From early trials, I am running into the following problems:
    1) Database primary keys: Zabbix does its own form of auto increment for the database tables, which rules out technologies like multi-master replication for database high availability. While MySQL does have a cluster option, multi-master replication was the preferred database solution due to reduced management complexity. With auto-increment primary keys handled by the database, each zabbix_server in the HA cluster would connect to its own local database and changes would be replicated to the other servers, providing a mechanism to scale the central server further as needed.

    2) Second zabbix_server startup: Initial observations suggest that a second zabbix_server process using the same database refuses to start if no items are due to be checked within some specific time frame. Once the second server launched successfully, I did not observe any operational problems other than the database problems mentioned earlier. If this is caused by not having items ready to be checked, it would likely resolve itself as additional checks are added during the full deployment.

    3) Unknown - Multiple servers running active host checks? - If multiple zabbix_server processes are running, do active checks get run from all of them?
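
To make problem 1 concrete, here is a minimal sketch of why application-managed IDs conflict under multi-master replication. It is a simplified model, not Zabbix's actual ID allocation code: the function names are hypothetical, and the "next ID is max + 1" rule is an assumption used only for illustration. The interleaved variant models what MySQL's `auto_increment_increment` and `auto_increment_offset` settings do when the database assigns the keys.

```python
from itertools import islice


def app_managed_ids(local_max_id):
    """Hypothetical model: the application picks max(id) + 1 from its local copy."""
    next_id = local_max_id + 1
    while True:
        yield next_id
        next_id += 1


def interleaved_ids(offset, increment):
    """Model of database-side auto increment with a per-node offset."""
    next_id = offset
    while True:
        yield next_id
        next_id += increment


# Two masters start from the same replicated state (highest id = 100)
# and each inserts three rows before replication catches up.
node_a = list(islice(app_managed_ids(100), 3))
node_b = list(islice(app_managed_ids(100), 3))
collisions = set(node_a) & set(node_b)

# The same two masters with database-assigned keys: increment 2, offsets 1 and 2.
db_a = list(islice(interleaved_ids(1, 2), 3))
db_b = list(islice(interleaved_ids(2, 2), 3))
db_collisions = set(db_a) & set(db_b)

print("application-managed:", node_a, node_b, "collisions:", sorted(collisions))
print("database-assigned:  ", db_a, db_b, "collisions:", sorted(db_collisions))
```

In this model both masters hand out the same keys when the application computes them, while the offset scheme keeps each master in its own disjoint sequence.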

    The ideal deployment in this case would end up as a server farm. Active checks would effectively be handled by a random host in the server farm. Connections to the zabbix server from agents, proxies, or subordinate servers would be into a load balanced server farm address. An extension of this design that should be equally doable would be a server farm of zabbix_proxy servers.

    Is anyone else pursuing a similar zabbix deployment?
  • nelsonab
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional
    • Sep 2006
    • 1233

    #2
    Right now the only way to get HA to work is Active/Passive. I have made HA work, and I've been promising to post the docs on how I did it Real Soon Now, but I've got a bug I'm trying to understand how to fix first.

    Ok... I did it... I created a Wiki Page for HA

    http://www.zabbix.com/wiki/doku.php?id=contrib:highavailability

    The two biggest problems with HA are client -> server communication and the database. The client can be configured to talk to multiple servers, but what happens with active checks when both servers are using the same database? As for the database: yes, the first node will essentially check all items, as it is the first to see them. In larger, more active environments, however, a mild race condition is possible. In such a case Host A might ping a client for some data while Host B pings the same client for the same data point. Two data points will then be placed in the database a second or less apart.
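
The race described above can be modelled with a small sketch. This is a toy simulation, not Zabbix code: `poll_cycle` and the `claim_first` flag are hypothetical names, and the "claim" step stands in for some form of coordination that zabbix_server does not provide out of the box, according to this thread.

```python
def poll_cycle(servers, item_due, claim_first):
    """Return the rows written for one due item during a single polling cycle."""
    rows = []
    claimed_by = None
    # Step 1: every server reads the shared database and sees the item as due.
    sees_due = [server for server in servers if item_due]
    # Step 2: each server that saw the item as due polls the client and stores a value.
    for server in sees_due:
        if claim_first:
            # Hypothetical coordination: only the first server to claim the item polls it.
            if claimed_by is not None:
                continue
            claimed_by = server
        rows.append((server, "agent.ping", 1))
    return rows


uncoordinated = poll_cycle(["Host A", "Host B"], item_due=True, claim_first=False)
coordinated = poll_cycle(["Host A", "Host B"], item_due=True, claim_first=True)

print(len(uncoordinated), "rows without coordination:", uncoordinated)
print(len(coordinated), "row with a claim step:", coordinated)
```

Without the claim step both hosts store a value for the same data point, which is the duplicate described above.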
    RHCE, author of zbxapi
    Ansible, the missing piece (Zabconf 2017): https://www.youtube.com/watch?v=R5T9NidjjDE
    Zabbix and SNMP on Linux (Zabconf 2015): https://www.youtube.com/watch?v=98PEHpLFVHM

    • bbrendon
      Senior Member
      • Sep 2005
      • 870

      #3
      Good to hear you're making progress on this.
      Unofficial Zabbix Expert
      Blog, Corporate Site

      • vlam
        Senior Member
        Zabbix Certified Specialist
        • Jun 2009
        • 166

        #4
        Has anyone looked into this again? One of my key requirements from a solutions perspective is 99%+ availability of the solution.

        Thanks
        4 Zabbix Frontend Servers (Load balanced)
        2 Zabbix App Servers (HA)
        2 Zabbix Database Servers (HA)
        18 Zabbix Proxy Servers (HA)
        3897 Deployed Zabbix Agents
        6161 Values per second
        X-Layer Integration
        Jaspersoft report Servers (HA)

        • bagni
          Senior Member
          Zabbix Certified Specialist
          • Mar 2012
          • 164

          #5
          Hi,
          I've implemented an Active/Passive configuration with MySQL Active/Active as reported in this post

          and for now it works perfectly.

          There are two main problems for a real Active/Active setup:
          - auto-increment is not handled by the DB, so forget wsrep/Galera/Percona cluster systems, because you cannot scale on writes
          - zabbix_server has no clustering service out of the box: you cannot split the work queue across different Zabbix servers, so every single instance of Zabbix is a world apart.

          • Colttt
            Senior Member
            Zabbix Certified Specialist
            • Mar 2009
            • 878

            #6
            @vlam: 99% is very easy; it allows the server to be down for more than 3 days per year.
            see here: https://en.wikipedia.org/wiki/High_availability
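
The arithmetic behind that figure, as a quick sketch. The function name is made up for illustration, and a 365-day year is assumed.

```python
def allowed_downtime_days(availability_percent, days_per_year=365):
    """Days per year a service may be down while still meeting the target."""
    return (1 - availability_percent / 100) * days_per_year


for target in (99.0, 99.9, 99.99):
    days = allowed_downtime_days(target)
    print(f"{target}% availability allows about {days * 24:.2f} hours ({days:.3f} days) of downtime per year")
```

For 99% this works out to roughly 3.65 days per year, which is where "more than 3 days" comes from.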

            You can run Zabbix in HA: search for corosync, pacemaker and PostgreSQL HA (maybe pg_proxy), PostgreSQL replication, or DRBD. I have a test environment with that setup; it works very well, and the downtime is around 10-15 seconds.
            Debian-User

            Sorry for my bad english
