Question about failover scenario

  • Mick7
    Junior Member
    • Apr 2010
    • 6

    #1

    Hi all,

    I'm looking for a working failover scenario for my main Zabbix server. I have already found one or two useful threads about it, but I'm still not sure whether it will work. I also saw that some people use clustering. That's not possible for me for technical reasons.

    Scenario is:
    Server A (main Zabbix server)
    Server B (Zabbix installation, but inactive)
    The database would be MySQL with master-slave replication from Server A to Server B.
    A third Zabbix instance (call it "big brother") monitors Server A. If Server A goes down, "big brother" activates Server B.
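
A rough sketch of the "big brother" check described above, in Python. It assumes Server A's Zabbix server listens on the default trapper port 10051 and treats a failed TCP connect as "down". The class name, the threshold of three consecutive failures, and the activation step (left out here) are illustrative assumptions, not anything Zabbix ships.

```python
import socket

ZABBIX_SERVER_PORT = 10051  # default Zabbix server trapper port


def server_reachable(host, port=ZABBIX_SERVER_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class FailoverWatchdog:
    """Hypothetical helper: counts consecutive failed checks, fires failover once."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0
        self.failed_over = False

    def record(self, reachable):
        """Feed one check result; return True only when failover should fire."""
        if self.failed_over:
            return False  # already failed over, never fire twice
        if reachable:
            self.failures = 0  # any success resets the counter
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.failed_over = True
            return True
        return False
```

In a loop you would call `watchdog.record(server_reachable(...))` every few seconds and run whatever starts Server B the one time it returns True. Requiring several consecutive failures avoids failing over on a single dropped connection.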

    So far, so good. In general this should be possible, but some questions came up.

    1) Can Server B work with the replicated database without any changes?
    2) What happens if Server A comes back up unexpectedly while Server B is acting as the new Zabbix server?
    3) The two Zabbix servers will be hosted in different IP ranges, so a virtual IP will not be possible. Is there a way for the agents to notice the new server IP?
    Maybe an automatic change to the agent's config through a remote command in an action?
    Or maybe two config files, with a restart of the service?

    Any help or suggestions would be really appreciated.
  • Mick7
    Junior Member
    • Apr 2010
    • 6

    #2
    No one? Not even a hint? I can't believe that I'm the only one who's looking for a failover scenario.

    • nelsonab
      Senior Member
      Zabbix Certified Specialist, Zabbix Certified Professional
      • Sep 2006
      • 1233

      #3
      I'm willing to guess you're referring to my Itchy and Scratchy two-node "cluster". Have a look at that again; while it may be called a cluster, it's really just an Active/Passive setup. Linux HA takes care of failover between the nodes, so there is no real need for a third monitoring node. You can, however, run a third Linux HA node to act as the arbitrator for split-brain issues.

      The reason I went with DRBD for the MySQL replication was that it was easier to set up in the Linux HA interface than MySQL master/slave.

      To answer your questions more specifically:

      1) Can Server B work with the replicated database without any changes?
      Only one "node" (HA node, not Zabbix DM node) can be writing to the DB at a time. Linux HA can take care of this for you.
      2) What happens if Server A comes back up unexpectedly while Server B is acting as the new Zabbix server?
      Cats and dogs coexist in harmony, Republicans and Democrats celebrate in the streets together... really awful things...
      This is a situation you must guard against. The DB needs to be consistent for Zabbix to work right.
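
One way to guard against that, sketched in Python under stated assumptions: before Server A starts its Zabbix server again, it checks whether Server B is already serving on the default port 10051, and whether a failover marker exists. The marker is a hypothetical flag that whatever activates Server B would have to set; the function names are made up for illustration. A plain TCP check cannot tell "peer is down" from "network is split", which is why an arbitrator node is still worth having.

```python
import socket


def peer_is_serving(host, port=10051, timeout=3.0):
    """Return True if the peer accepts TCP connections on the Zabbix server port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def startup_decision(peer_serving, failover_marker_present):
    """Decide whether the returning node may start its Zabbix server.

    peer_serving: result of peer_is_serving() against the other node.
    failover_marker_present: hypothetical flag set when the standby was
    activated; it means the local database copy may be stale.
    """
    if peer_serving:
        return "stay_down"  # the other node is active, never run two writers
    if failover_marker_present:
        return "stay_down"  # a failover happened, the DB must be resynced first
    return "start"
```

The returning node only starts when both checks are clean; in every other case an operator (or the HA software) has to resync the database and clear the marker first.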
      3) The two Zabbix servers will be hosted in different IP ranges, so a virtual IP will not be possible. Is there a way for the agents to notice the new server IP?
      Maybe an automatic change to the agent's config through a remote command in an action?
      Or maybe two config files, with a restart of the service?
      There are some ways to do IP failover in a geographically diverse manner. I don't recall what they are, but I read some articles about it when I was building my HA configuration. You can also look at using something like DNS to point the agents at the right Zabbix server. In that case you'll have to make sure your timeouts are set pretty low; this might cause some issues with the Zabbix agent internals, so you will want to experiment.
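
If you experiment with the DNS approach, a small Python check like the following can show how long it takes for a changed record to become visible on a given host. It is only a sketch: the function names are made up, it looks at IPv4 only, and it tells you what the local resolver returns, not what a running agent has cached.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the name currently resolves to."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def failover_visible(hostname, standby_ip, resolver=resolve_ipv4):
    """Return True once the name resolves to the standby server's address."""
    return standby_ip in resolver(hostname)
```

Polling `failover_visible(...)` after switching the record gives a rough measure of the delay you would have to tolerate before agents can reach the new server.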

      Hopefully this gives you something to think about.
      RHCE, author of zbxapi
      Ansible, the missing piece (Zabconf 2017): https://www.youtube.com/watch?v=R5T9NidjjDE
      Zabbix and SNMP on Linux (Zabconf 2015): https://www.youtube.com/watch?v=98PEHpLFVHM

      • Mick7
        Junior Member
        • Apr 2010
        • 6

        #4
        Thank you very much for your answer, nelsonab.
        Yes, you gave me something more to think about, but it seems there is no really workable solution besides real clustering.
        In the meantime we decided to use only one Zabbix instance (with DB replication as a backup), outsourced to an external HA provider.
        Well, I wish I could control it myself, but I think it's the best solution at the moment, as long as Zabbix doesn't support an HA strategy of its own.

        ...and regarding your point:
        "There are some ways to do IP failover in a geographically diverse manner"

        That's right, of course. I thought about it too.
        But IMO you need an extra server or client that delegates the shared IP, so it's not a real solution. If the Zabbix server goes down, you have a problem.
        If the server responsible for the shared IP goes down, you have exactly the same problem: no monitoring anymore. So you have just added one more possible point of failure.

        However...
        Again, thank you very much for trying to help.
        Last edited by Mick7; 21-04-2010, 21:58.
