zabbix HA model

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • ShivaS
    Member
    • Oct 2005
    • 51

    #1

    Hey, have you thought about making some kind of HA model for Zabbix?
    It's important when you manage hundreds of machines, and I am sure nobody wants the Zabbix server to be a single point of failure.
    RAID, redundant power, redundant switches and so on are nice, but that's not enough.

    Right now I am thinking about an HA configuration.

    Because my servers are spread all over the globe, I am implementing the following WAN and LAN solutions.


    WAN:
    - rsync the Zabbix web front end every x amount of time (just in case)
    - MySQL master-slave replication through an SSH tunnel
    - a crontab script on the slave server that periodically disables all actions and checks the availability of the master server

    So once the master server is unavailable (no answer to ping, or to telnet on the Zabbix server port, whatever),
    the script switches ON all the actions on the slave MySQL.
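
A minimal sketch of the check such a cron script could run on the slave, assuming the master is probed over TCP on port 10051 (the usual Zabbix server port) and that a takeover should need several failed checks in a row. The function names and the three-failure threshold are assumptions for illustration; the step that actually enables or disables actions in the slave database is deliberately left out.

```python
import socket

ZABBIX_SERVER_PORT = 10051  # usual Zabbix server port; adjust if yours differs


def master_reachable(host, port=ZABBIX_SERVER_PORT, timeout=3.0):
    """Return True if a TCP connection to the master can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def desired_action_state(recent_checks, failures_needed=3):
    """Decide whether the slave should have its actions enabled.

    recent_checks is a list of booleans, oldest first (True = master answered).
    Actions are enabled only after `failures_needed` consecutive failures,
    so one lost packet on a WAN link does not trigger a takeover.
    """
    if len(recent_checks) < failures_needed:
        return "disabled"
    last = recent_checks[-failures_needed:]
    return "disabled" if any(last) else "enabled"
```

Each cron run would append the result of `master_reachable(...)` to a small history file and apply whatever `desired_action_state(...)` returns to the slave database.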

    The only disadvantage of this is the master IP that all agents report to. I am probably going to solve this with LVS ip-to-ip.

    For LAN I would suggest the following:
    - rsync the web site
    - master-slave model for MySQL, with the IP managed by UCARP
    (ucarp will switch between the MySQL servers)
    - heartbeat to switch the Zabbix server IP

    The advantage of ucarp over LVS is that it supports failover in one direction only.
    So if the master goes down and is then restored, traffic won't switch back to the master automatically. That should be done manually (after you sync the data).

    My solutions are probably not perfect, but this is what came to mind when I started to think about Zabbix HA over LAN and WAN.

    Your suggestions are welcome.
    It would also be nice to know whether the Zabbix team is planning to introduce an HA model at some point.

    thanks
    Last edited by ShivaS; 27-09-2006, 15:34.
  • Alexei
    Founder, CEO
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Sep 2004
    • 5654

    #2
    Well, actually I do not think that ZABBIX has to incorporate a HA model. Why?

    ZABBIX Server software consists of three independent components (DB, Server process, Front-end), which I believe already fit nicely into any HA model. Support for binding the server to a specific IP address is the only missing feature I can think of. This would be extremely useful for HA setups using a virtual IP address.
    Alexei Vladishev
    Creator of Zabbix, Product manager
    New York | Tokyo | Riga

    • ShivaS
      Member
      • Oct 2005
      • 51

      #3
      Well, you are right. I probably took it too far.
      But you will agree that for a full HA model you need to work with 2 MySQL servers and 2 Zabbix servers, so you need to switch the IP in case of master failure.
      The frontend is not important, as it can be brought up later. The most important thing is to keep alerts coming and monitoring working.
      So LVS could be used for the server, and for SQL ucarp (formerly carp) is better.
      It would be interesting to know whether 2 servers can work together with 1 MySQL without conflicting. Otherwise the servers also need to be put on heartbeat or something similar that supports an HA model only, not LB.
      Actually, everything can be done with the ultramonkey package, for example (lvs/ldirector/heartbeat). The only problem, which I mentioned earlier, is what happens when the MySQL master comes back and everything jumps back to it before the master has synced what it missed while it was down. As far as I know this problem can be solved by carp only; I am not familiar with other tools, though.

      In short, I think you could start with a basic heartbeat implementation.


      • Alexei
        Founder, CEO
        Zabbix Certified Trainer
        Zabbix Certified Specialist
        Zabbix Certified Professional
        • Sep 2004
        • 5654

        #4
        Originally posted by ShivaS
        but agree with me for full HA model you need work with 2 mysql servers and 2 zabbix servers. So you need switch IP in case of master failure.
        Switching from one node to another and changing the IP address is the job of HA software, not ZABBIX. I agree that some HA features could be built into ZABBIX, but I do not see much sense in it, as HA can be achieved simply by:

        1. killing ZABBIX on one node
        2. transferring virtual IP to another node
        3. starting ZABBIX on the second node

        Downtime will be very small, it can be less than 10 seconds if implemented nicely.
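
The three steps above could be sketched as an ordered command plan. This is only an illustration: the `pkill` and `ip addr` invocations are standard Linux commands, but the virtual IP, the interface name and starting the server by its bare binary name are assumptions, and in practice HA software such as heartbeat would run the equivalent steps.

```python
def failover_plan(vip_cidr, iface, server_bin="zabbix_server"):
    """Return ordered (node, argv) steps for moving ZABBIX from the old node to the new one."""
    return [
        # 1. kill ZABBIX on the old node
        ("old", ["pkill", "-x", server_bin]),
        # 2. transfer the virtual IP: drop it on the old node, add it on the new one
        ("old", ["ip", "addr", "del", vip_cidr, "dev", iface]),
        ("new", ["ip", "addr", "add", vip_cidr, "dev", iface]),
        # 3. start ZABBIX on the new node
        ("new", [server_bin]),
    ]


def describe(plan):
    """Render the plan as readable lines, e.g. for a dry run."""
    return [f"{node}: {' '.join(argv)}" for node, argv in plan]


if __name__ == "__main__":
    # 192.0.2.10/24 and eth0 are placeholder values.
    for line in describe(failover_plan("192.0.2.10/24", "eth0")):
        print(line)
```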

        • ShivaS
          Member
          • Oct 2005
          • 51

          #5
          ok ok but i will anyway use some automation processes ;-)
          Also, your solution is good for LAN environments, but over a WAN you cannot easily switch your IP from one network to another and then "tell" the agents to work with the new IP.


          • Alexei
            Founder, CEO
            Zabbix Certified Trainer
            Zabbix Certified Specialist
            Zabbix Certified Professional
            • Sep 2004
            • 5654

            #6
            Originally posted by ShivaS
            ok ok but i will anyway use some automation processes ;-)
            Also your solution is good for LAN environments, but WAN you cannot easily switch your IP from one network to another and then "tell" agents to work with new IP.
            ZABBIX agents may be configured in advance with the IPs of all ZABBIX nodes, provided only passive checks are used. In this case, switching the ZABBIX server from one node to another won't affect the agents.
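
As a small illustration of the point above: the agent configuration file accepts a comma-separated list of server addresses in its `Server=` parameter. The helper below is a hypothetical sketch that builds that line from a list of node IPs; the addresses in the usage note are placeholders.

```python
import ipaddress


def agent_server_line(node_ips):
    """Build the Server= line listing every ZABBIX node allowed to poll the agent."""
    # ip_address() raises ValueError on anything that is not a valid IP.
    cleaned = [str(ipaddress.ip_address(ip.strip())) for ip in node_ips]
    if not cleaned:
        raise ValueError("at least one ZABBIX node IP is required")
    return "Server=" + ",".join(cleaned)
```

Calling `agent_server_line(["192.0.2.10", "192.0.2.11"])` gives the line to place in the agent configuration on every monitored host, so either node can run passive checks after a switch.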

            • ShivaS
              Member
              • Oct 2005
              • 51

              #7
              this one i didn't know
              thanks!
              so only mysql left


              • LEM
                Senior Member
                Zabbix Certified Specialist
                • Sep 2004
                • 112

                #8
                ZABBIX HA & ZABBIX WAN design

                Hi all,

                I fully agree with Alexei: for HA support, it is sufficient to look at HA for each of the "zabbix components" (web, MySQL, zabbix_server):
                . LVS for web, possibly with some sync for the web pages if they are sometimes modified (rsync for more than 2 nodes, or drbd for a 2-node-only solution),
                . MySQL with drbd on a 2-node solution with heartbeat, and let's rock. For a more scalable solution, the MySQL clustering engine is worth considering,
                . a simple heartbeat for zabbix_agent, and let's rock too.

                For WAN usage of Zabbix, I hope some of the ongoing work on 'distributed monitoring' could help those who are used to monitoring resources 'across' a wide network. See this post for more info.


                Cheers,
                --
                LEM


                • just2blue4u
                  Senior Member
                  • Apr 2006
                  • 347

                  #9
                  Model of HA-Zabbix Server System

                  HA-Zabbix is a very interesting topic, so I made the attached model of a redundant Zabbix server system, containing 2 different Zabbix servers ("imv" and "pbs") on 2 nodes ("A" and "B"). They share all critical data using DRBD and are clustered behind virtual IPs by Heartbeat.

                  Solid lines represent active links.
                  The dotted ones are "hot-standby". They get activated by Heartbeat if the solid ones fail.
                  As you can see, Node A (Server A) provides all services (2x Zabbix, DRBD control) by default.

                  This model was made with the software "Dia". The Dia file is attached (rename it to .dia).
                  Feedback is always welcome!

                  I'll try to set this up in a testing environment in the next few days...
                  Big ZABBIX is watching you!
                  (... and my 48 hosts, 4513 items, 1280 triggers via zabbix v1.6 on CentOS 5.0)


                  • marc
                    Senior Member
                    • Oct 2004
                    • 146

                    #10
                    I am very interested in HA/clustering too.

                    Current state:
                    I actually have 2 identical setups (without media/email configuration).
                    Each server is configured to monitor one location (no clustering yet).

                    1st) Europe (all Africa hosts are disabled under configuration/hosts; all alarms are configured twice: one copy for all my hosts (disabled), the other copy is enabled and fires on Host group=Europe)

                    2nd) Africa (all Europe hosts are disabled under configuration/hosts; all alarms are configured twice: one copy for all my hosts (disabled), the other copy is enabled and fires on Host group=Africa)

                    Both servers monitor the availability of their counterpart.

                    If a server is gone, a trigger on the counterpart becomes TRUE and executes a remote command on localhost. The remote command, a little script, activates all hosts from the other group (Host Group = Africa/Europe) and updates the alarm status from disabled to enabled.

                    With Zabbix clustering I have to configure everything twice. (I used to be able to run mysqldump -h localhost zabbix | mysql -u.. -hnode2 -Dzabbix from one host to the other and then change the hostname in the media screen by hand, to keep the configuration clean and identical on both hosts. With nodeid records in the database, this approach doesn't work anymore.)

                    Some inspiration for the dev team:
                    It would be a great benefit to get this functionality through a "preferred master" key on the host page and some kind of logic to define master/slave sites.
                    Maybe this kind of abstraction is what is missing at the moment.

                    E.g. the option to define sites (hosts/host groups/users/groups...) and attach them to one or more Zabbix nodes (in a cluster environment). All nodes have the same configuration because they are linked to an abstraction layer; let's call it a "site". One node is defined as "preferred master" and gets all data from the clients. The 2nd node's main task is watching the availability of the 1st node for that "site". Optionally a node could monitor one, several or all of the "site's" hosts.

                    If the "preferred master" is gone, the 2nd node automatically starts polling and alarming, and stores time information. When the master is back online, the 2nd node selects history, trends and other updates from its own tables and starts replicating them to the "preferred master". Once this is finished, the master sends a "welcome back" and starts monitoring. The slave waits for about 2 minutes and then disables its own monitoring/alarming, unless it is optionally kept enabled.
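
The proposed behaviour of the 2nd node can be read as a small state machine. The sketch below is only one interpretation of the proposal, not an existing Zabbix feature: the state names, the `next_state` function and the handling of a master that disappears again during replication are assumptions.

```python
from enum import Enum


class State(Enum):
    WATCHING = "watching"        # master healthy, 2nd node only checks it
    ACTIVE = "active"            # master gone, 2nd node polls and alarms
    REPLICATING = "replicating"  # master back, 2nd node ships history/trends
    COOLDOWN = "cooldown"        # replication done, wait before standing down


COOLDOWN_SECONDS = 120  # "about 2 minutes" in the proposal


def next_state(state, master_up, replication_done=False, cooldown_elapsed=0):
    """Return the 2nd node's next state given what it currently observes."""
    if not master_up:
        # Whatever the node was doing, a missing master means it must take over.
        return State.ACTIVE
    if state is State.WATCHING:
        return State.WATCHING
    if state is State.ACTIVE:
        return State.REPLICATING
    if state is State.REPLICATING:
        return State.COOLDOWN if replication_done else State.REPLICATING
    if state is State.COOLDOWN:
        if cooldown_elapsed >= COOLDOWN_SECONDS:
            return State.WATCHING
        return State.COOLDOWN
    raise ValueError(f"unknown state: {state!r}")
```

Keeping the transition logic in one pure function makes the takeover and hand-back rules easy to test without a running cluster.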

                    It would also be helpful to have exactly the same configuration pages on all nodes in Zabbix clustering, or at least an "enforce on" button. I want the same users/hosts/groups and more all over the cluster, not local configs with a lot of redundant information to keep up to date.

                    Thanks for your great work, guys, and let me know your thoughts.
                    Last edited by marc; 08-09-2007, 01:47.

