Moderately sized network here, with a variety of subnets and multiple sites.
Situation currently:
If a router or switch goes down, emails and SMS messages are sent for that device and also for every device behind it that Zabbix can no longer see.
Status:
Obviously, if the router or switch is down, I cannot see or get reports from the devices beyond it.
Solution needed:
How do I set up a parent/child arrangement so that, when this happens, I am notified only about the network device that is down and is therefore blocking traffic to the other devices? That alert should obviously be high priority, but it does not need multiple alerts alongside it.
Example:
A router is on site at a client. There are 18 devices within that client's subnet that are also monitored via Zabbix. The VPN tunnel through the router went down because the router had an issue. The details are not necessary; just know that it caused some headaches and hustling to get the client back to business.
The problem was that the developers, sysadmins, tech support, and upper management received a whole slew of email alerts and SMS messages for each and every one of those other devices too, since Zabbix considered them 'down' as well.
This caused a certain degree of consternation as my boss tried to explain that those other devices were still up and running at the client; we simply couldn't see them.
When providing SLAs and guaranteeing a certain level of support, it can be very important to know exactly what is running, what isn't, and why.
In a case like this one, it would be much nicer if, when the network device goes down, Zabbix made it obvious that the 'children' behind that device simply cannot be monitored, rather than reporting each of them as down.
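To make the desired behaviour concrete, here is a minimal Python sketch of the parent/child suppression logic. It is only an illustration, not Zabbix configuration and not a Zabbix API call: the function name `root_causes`, the `parent` map, and the host names are all hypothetical. Given which hosts are unreachable and which device each host sits behind, it returns only the hosts that should trigger a notification.

```python
from typing import Dict, List, Optional, Set


def root_causes(parent: Dict[str, Optional[str]], unreachable: Set[str]) -> List[str]:
    """Return the unreachable hosts that have no unreachable ancestor.

    `parent` maps each host to the network device it sits behind
    (None for a top-level device). Both names are hypothetical.
    """
    causes = []
    for host in sorted(unreachable):
        ancestor = parent.get(host)
        seen = {host}  # guards against accidental loops in the map
        suppressed = False
        while ancestor is not None and ancestor not in seen:
            if ancestor in unreachable:
                # An upstream device is down, so this host's state is unknown,
                # not necessarily down: do not alert on it.
                suppressed = True
                break
            seen.add(ancestor)
            ancestor = parent.get(ancestor)
        if not suppressed:
            causes.append(host)
    return causes


# Hypothetical topology matching the example: one router, 18 devices behind it.
parent: Dict[str, Optional[str]] = {"client-router": None}
parent.update({f"client-host-{i:02d}": "client-router" for i in range(1, 19)})

# The router fails, so all 19 hosts look unreachable from the monitoring server.
alerts = root_causes(parent, set(parent))
print(alerts)  # ['client-router']
```

With this logic, the router outage produces one alert instead of 19, while a single device failing behind a healthy router is still reported on its own.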