Questions on Triggers for Systems that are Highly Available

  • csmall
    Member
    • Jun 2020
    • 70

    #1

    What would be the best approach to make sure a trigger fires only if a particular number of hosts or items are affected?

    In other words, if you have a system that is in an HA pair and one goes down, the other takes over and is still servicing ...

    Let's say the trigger is icmpping to keep it simple.

    HOST_A and HOST_B are in an HA pair, and HOST_A is the active host. HOST_A stops responding to icmpping, but HOST_B is up and running and handling everything that HOST_A was doing. I don't want to fire a trigger that might come across as an emergency to someone responsible for the system's availability.

    Is it possible to group the two hosts/triggers in some way that addresses this?

    The goal is to keep someone from getting an emergency phone call because the trigger fired on HOST_A when, in reality, the end users are perfectly fine: HOST_B is still running, and HOST_A can be addressed without much pressure at a more convenient time.
  • tim.mooney
    Senior Member
    • Dec 2012
    • 1427

    #2
    It's complicated to do that right now, though possible.
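
One way the "possible today" route is often sketched is a single trigger whose expression counts how many hosts of the pair have stopped answering. Below is a minimal sketch, assuming hosts named HOST_A and HOST_B with a plain `icmpping` item, as in the question. The helper name `hosts_down_expression`, the `min_down` parameter, and the 5m window are illustrative choices, not part of Zabbix. The helper only builds the expression string for pasting into a trigger; it does not talk to a Zabbix server.

```python
def hosts_down_expression(hosts, min_down=None, key="icmpping",
                          window="5m", new_syntax=True):
    """Build a trigger expression that is true when at least `min_down`
    of `hosts` answered no ping during `window` (default: all of them)."""
    if min_down is None:
        min_down = len(hosts)
    if not 1 <= min_down <= len(hosts):
        raise ValueError("min_down must be between 1 and len(hosts)")
    if new_syntax:
        # Expression syntax introduced in Zabbix 5.4: func(/host/key,param)
        checks = [f"(max(/{h}/{key},{window})=0)" for h in hosts]
    else:
        # Older expression syntax: {host:key.func(param)}
        checks = [f"({{{h}:{key}.max({window})}}=0)" for h in hosts]
    # Each comparison evaluates to 1 or 0, so the sum counts unreachable hosts.
    return " + ".join(checks) + f">={min_down}"


# Hypothetical usage for the HA pair from the question.
print(hosts_down_expression(["HOST_A", "HOST_B"]))
print(hosts_down_expression(["HOST_A", "HOST_B"], new_syntax=False))
```

With an expression like this on the high-severity trigger, the ordinary per-host icmpping triggers could stay at a lower severity that does not page anyone, so a single node failure is still visible but not treated as an emergency.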

    Zabbix 6.0 (not released yet; it's at beta3 right now) has greatly improved support for exactly the kind of HA service-level monitoring you're asking about.

    My recommendation would be to read up on the new features in 6.0, and upgrade to that once it seems like it has stabilized (i.e. not 6.0.0, but maybe 6.0.4 or 6.0.5?).

    • cyber
      Senior Member
      Zabbix Certified SpecialistZabbix Certified Professional
      • Dec 2006
      • 4807

      #3
      If you have virtual IPs that move from host to host together with the service, you can add a separate host in Zabbix that uses the VIP as its agent interface and query things through that IP. You do need to use passive items, but that is usually fine. One of the checks can be a hostname check: if the hostname changes, you can assume the service has switched hosts. As long as the other checks are fine, you can assume the service is fine.
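
The decision logic behind the VIP approach can be sketched in a few lines. This is only an illustration of the reasoning, not Zabbix configuration or API code: the function name `classify_service` and the level names "ok", "info", and "emergency" are made up for the example.

```python
def classify_service(vip_checks_ok, previous_hostname, current_hostname):
    """Map the results of checks made through the VIP to an alert level
    for the HA service as a whole (level names are hypothetical)."""
    if not all(vip_checks_ok):
        # A check through the VIP failed: the service itself is affected.
        return "emergency"
    if previous_hostname != current_hostname:
        # The VIP is now answered by another node: a failover happened,
        # but the service is still responding.
        return "info"
    return "ok"


# Hypothetical example: all checks pass, but HOST_B now answers on the VIP.
print(classify_service([True, True], "HOST_A", "HOST_B"))
```

The point of the split is that a hostname change alone only tells you a failover happened, which can wait for working hours, while a failed check through the VIP means end users are actually affected.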
