Trigger if all hosts in a group go down (Active agent)

  • Chertes
    Junior Member
    • Apr 2022
    • 2

    #1

    Trigger if all hosts in a group go down (Active agent)

    Hello guys,

    I need to monitor various hosts. I cannot use passive checks because the hosts have dynamic IPs and I have no control over the routers, so I am using active checks instead. Everything works well, but I need a trigger that fires only if all the hosts in the group go down. A single computer may go offline at random, and alerting on that would just create false alarms; all of them going down at once might indicate connectivity issues. Is there any way to achieve this with the active item agent.ping?
  • max.
    Member
    Zabbix Certified Specialist
    • Apr 2022
    • 40

    #2
    Hello,

    You can create a trigger with multiple conditions,
    one for each host's agent.ping item.

    Regards


    • Chertes
      Junior Member
      • Apr 2022
      • 2

      #3
      Thanks max!

      What I ended up doing was creating another trigger on one of the hosts in the group and adding the other triggers like this:

      nodata(/host1/agent.ping,{$AGENT.NODATA_TIMEOUT})=1 and nodata(/host2/agent.ping,{$AGENT.NODATA_TIMEOUT})=1 ..etc

      So when both hosts stop sending data, the trigger fires as "Possible internet connection lost". Together with the template's default agent.ping trigger, I know whether a single host is down or both are.
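For groups with more than a couple of hosts, the expression above can be generated rather than typed by hand. The following is a minimal sketch that only builds the expression string; the function name `all_hosts_silent_expression` and the host names are hypothetical, and pasting or pushing the result into Zabbix is left out.

```python
def all_hosts_silent_expression(hosts, key="agent.ping",
                                timeout="{$AGENT.NODATA_TIMEOUT}"):
    """Join one nodata() condition per host with 'and' (hypothetical helper)."""
    if not hosts:
        raise ValueError("at least one host is required")
    return " and ".join(
        f"nodata(/{host}/{key},{timeout})=1" for host in hosts
    )


# Example host names taken from the expression in the post above.
expression = all_hosts_silent_expression(["host1", "host2"])
print(expression)
```

The trigger built this way still has to be regenerated whenever hosts are added to or removed from the group.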


      • EdgarGR
        Junior Member
        • Nov 2022
        • 3

        #4
        I'm facing the same issue, but I guess your workaround won't be possible for me, since I have 20+ groups and they need to be handled dynamically by the template...
        I'm testing with foreach, but it doesn't seem to be working properly... any ideas?

        My idea was to create a calculated item that uses foreach and returns "0" if all the hosts in the host group are down. I would then get one alert saying that the connection to group X is down, instead of 200 alerts, one per host, as happens now.

        I tried these item formulas, without success:

        count(last_foreach(/*/agent.ping?[group="HOSTGROUP-A"]))

        nodata(last_foreach(/*/agent.ping?[group="HOSTGROUP-A"]))

        I thought this would work, but if I stop the Zabbix agent on all the hosts in this host group, it still returns X instead of 0.

        I think it just counts the agent.ping items assigned to the hosts, not the actual output of those items. Maybe that's because agent.ping returns 1, or "error" if no connection is made.

        I also tried "nodata", but it does not seem to be supported together with the _foreach functions.

        I tried count and sum. Other operators give errors.
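The behaviour described in this post can be illustrated with a toy model. This is a sketch under the assumption, stated later in the thread, that agent.ping only ever stores 1 and that a "last"-style lookup returns the most recent stored value no matter how old it is. The history data and the helper `last_values` are made up for illustration and are not Zabbix code.

```python
# Toy history: host -> list of (unix_time, value) samples for agent.ping.
# All agents are assumed to have been stopped shortly after t=1020.
history = {
    "host-a": [(940, 1), (1000, 1)],
    "host-b": [(950, 1), (1010, 1)],
    "host-c": [(960, 1), (1020, 1)],
}
now = 2000


def last_values(history):
    """Most recent stored value per item, regardless of its age (assumed)."""
    return [samples[-1][1] for samples in history.values() if samples]


# Counting the last values counts hosts that ever reported, not hosts alive now.
stale_count = len(last_values(history))
# Age in seconds of the freshest sample across all hosts.
youngest_age = min(now - samples[-1][0] for samples in history.values())

print(stale_count)   # 3, even though no host has reported recently
print(youngest_age)  # 980
```

In this model the count stays at the number of hosts after every agent has stopped, which matches the "still returns X" symptom.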



        • max.
          max. commented
          Hello, agent.ping shows 1 or 0. You can assume that if the maximum value is 0 then every host is down.
          Have you tried the "max" function? Check for syntax problems.
      • tim.mooney
        Senior Member
        • Dec 2012
        • 1427

        #5
        Originally posted by EdgarGR
        I'm facing the same issue, but I guess your workaround won't be possible for me, since I have 20+ groups and they need to be handled dynamically by the template...
        Ok, so all the hosts in a group may have dynamic IPs, but presumably your network infrastructure (core switches, routers, etc.) doesn't have dynamic IPs, correct? Why not monitor your network topology, and use the network path as a dependency for the hosts?



        • EdgarGR
          EdgarGR commented
          Hi Tim,
          Thanks for the feedback.

          The hosts I want to monitor have static IPs. These servers are in remote locations and I can't monitor the network because I'm not the owner.

          I'll explain a bit further:
          Zabbix Server -- VPN IPSEC -- Hosts

          I want to monitor the IPsec VPN status (without access to the firewall); that's the reason I want to implement a trigger that alerts when all the hosts in a host group are down.
          The solution Chertes found could work, but it is not efficient or worth the effort because it requires a lot of manual work (we are constantly adding new hosts and host groups). That's why I'm looking for a more automated, standardized solution.

          The foreach approach seems to be the way forward, but I haven't had any success with it so far.
      • cyber
        Senior Member
        Zabbix Certified SpecialistZabbix Certified Professional
        • Dec 2006
        • 4811

        #6
        #4.1
        max. commented
        Yesterday, 23:03
        Hello, agent.ping shows 1 or 0. You can assume that if the maximum value is 0 then every host is down.
        Have you tried the "max" function? Check for syntax problems.
        Wrong. agent.ping returns 1 if it succeeds and does not return anything if it fails. That's also why the "last" functions will not work: the last value is always 1, no matter how long ago it was received.


        • max.
          Member
          Zabbix Certified Specialist
          • Apr 2022
          • 40

          #7
          Hello!

          cyber is right: if agent.ping does not succeed, it does not return anything.
          So instead we could use the count_foreach function to check whether any value has been returned by the hosts in the last x minutes.
          It's a little messy but it should work.

          max(count_foreach(/*/agent.ping?[group="TESTGROUP"],5m))

          Then just create a trigger on the virtual host that checks whether the value equals 0.
          Let me know if it works for you.
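The logic of that formula can be sketched outside Zabbix as follows. This is a toy model, not the server's implementation: `count_in_window` and `max_count_foreach` are hypothetical helpers, the 300-second window stands in for `5m`, and what Zabbix returns when the group filter matches no items is not covered here.

```python
def count_in_window(samples, now, window):
    """Number of samples with a timestamp in the interval (now - window, now]."""
    return sum(1 for ts, _ in samples if now - window < ts <= now)


def max_count_foreach(history, now, window=300):
    """Largest per-host sample count in the window; 0 means no host reported."""
    counts = [count_in_window(s, now, window) for s in history.values()]
    return max(counts) if counts else 0


now = 2000
# Every agent stopped long before the window started.
all_down = {
    "host-a": [(940, 1), (1000, 1)],
    "host-b": [(950, 1), (1010, 1)],
}
# Same group, but host-b is still sending agent.ping values.
one_up = {
    "host-a": [(940, 1), (1000, 1)],
    "host-b": [(1900, 1), (1960, 1)],
}

print(max_count_foreach(all_down, now))  # 0 -> the "equals 0" trigger would fire
print(max_count_foreach(one_up, now))    # 2 -> at least one host is reporting
```

Because the count is restricted to a time window, old values no longer mask an outage, which is the difference from the "last"-based attempts earlier in the thread.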
