**kloczek** · 05-03-2019, 12:18

Too many notifications are almost always a symptom of missing dependencies between triggers.

3 Trigger dependencies

https://www.zabbix.com/documentation/current/manual/config/triggers/dependencies

**syntax53** · 05-03-2019, 14:23

Originally posted by kloczek

Too many notifications are almost always a symptom of missing dependencies between triggers.

3 Trigger dependencies

https://www.zabbix.com/documentation/current/manual/config/triggers/dependencies

I mentioned in my post that I already have dependencies where I can. There is no dependency I can add that will stop these from happening. Consider the picture below... the switches depend on the MDFs. The MDFs depend on the router. None of these ever go down, though. There is no dependency I can add that will stop notifications when all of the switches go down.

**kloczek** · 05-03-2019, 16:20

Originally posted by syntax53

I mentioned in my post that I already have dependencies where I can. There is no dependency I can add that will stop these from happening. Consider the picture below... the switches depend on the MDFs. The MDFs depend on the router. None of these ever go down, though. There is no dependency I can add that will stop notifications when all of the switches go down.

Currently Zabbix is GoodEnough(tm), if not perfect, at defining and handling host dependencies.
Anything outside that area requires defining metrics on top of multiple per-host triggers.
Sometimes it is necessary to have something like this.
A typical case is a horizontally scaled farm of hosts, or a few switches providing redundant paths using (fast) spanning tree protocol.
In such cases you may want to know that some number of backbone switches are already dead, but as long as some number of alternative routes remain it should not be a problem.
With that, it would be possible to hook onto such a master trigger so that, as long as the number of paths stays between N and N+M, the trigger for the critical problem does not fire, and even some host (switch) critical alarms could be hidden.
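
The "between N and N+M paths" idea can be sketched in plain Python. This is only an illustration of the decision logic, not Zabbix trigger syntax; the function name `evaluate_redundancy`, the `min_paths` parameter (the N above) and the switch names are all made up for the example.

```python
def evaluate_redundancy(switch_up, min_paths):
    """Return (fire_critical, suppress_per_switch) for a redundant switch group.

    switch_up: dict mapping a switch name to True (reachable) or False (down).
    min_paths: hypothetical N, the fewest live paths still considered safe.
    """
    alive = sum(1 for up in switch_up.values() if up)
    # The group-level critical alarm fires only when fewer than N paths remain.
    fire_critical = alive < min_paths
    # While enough paths remain, the individual switch alarms can be hidden.
    suppress_per_switch = not fire_critical
    return fire_critical, suppress_per_switch


# Hypothetical status: one of three switches is down, two paths are required.
status = {"sw-a": True, "sw-b": False, "sw-c": True}
print(evaluate_redundancy(status, min_paths=2))  # prints (False, True)
```

With two of three switches still up and N = 2, the group alarm stays quiet and the single switch alarm could be hidden; losing a second switch flips both flags.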

Nevertheless, in theory you can define a dummy host metric that depends on the per-switch availability metrics joined with the "and" operator, so that an alarm is displayed only when "switch_A_is_down and switch_B_is_down and switch_C_ ...". Then, in theory, it should be possible to create a dependency that hides some per-switch alarms. That is only theory, because so far it is not possible to create inter-host trigger dependencies.
Sometimes it is good .. sometimes not.

I would suggest using some queuing software, because it may be much more effective and deterministic.
Messages already in the queue can be managed as well, for example by reordering them to deliver those with the highest severity first.
In the worst case such a queue could easily be blocked on input, or redirected to null and flushed manually, if the number of messages still waiting to be delivered gets too high.
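
To make the queue idea concrete, here is a minimal in-memory sketch using Python's standard `heapq` module. It is not a replacement for real queuing software, and the class name `AlertQueue`, the `max_size` limit and the `blocked` flag are assumptions made for the example. Severity is assumed to be an integer where a higher number means more severe.

```python
import heapq
import itertools


class AlertQueue:
    """Deliver the highest-severity alerts first; FIFO within one severity."""

    def __init__(self, max_size=1000):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order
        self.max_size = max_size
        self.blocked = False  # set to True to block the queue on input

    def push(self, severity, message):
        """Queue a message; return False if input is blocked or the queue is full."""
        if self.blocked or len(self._heap) >= self.max_size:
            return False
        # heapq is a min-heap, so negate severity to pop the highest first.
        heapq.heappush(self._heap, (-severity, next(self._counter), message))
        return True

    def pop(self):
        """Return the next message to deliver, or None when the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def flush(self):
        """Drop everything still queued and return how many were dropped."""
        dropped = len(self._heap)
        self._heap.clear()
        return dropped
```

Pushing a severity 2 message followed by a severity 5 message and then calling `pop()` returns the severity 5 message first. Setting `blocked = True` and calling `flush()` corresponds to blocking the input and discarding the backlog by hand.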

**syntax53** · 05-03-2019, 16:23

You do realize that I posted this in the "cookbook" section and I am simply sharing my solution to this problem...

**tim.mooney** · 02-01-2024, 07:25

Hey syntax53, thank you for sharing your solution for this problem.

My site uses dependencies wherever we can, but like you we have an annoying situation where there's literally no dependency we can use to prevent alert storms for a specific group of systems.

I plan to use your script to help throttle alerts for that group. I have a couple questions about it.

Are you still using it? Have you made any changes since the version you posted here?
I'm stronger in other scripting languages than I currently am in Python, so I'm wondering if you've needed to make any changes to the script to work with recent versions of Python3?

I think I'm actually going to modify the script slightly, so that rather than running as a one-shot frequently from cron, the main body is a permanent loop with a 30 second sleep at the end. I'll write a systemd service file for it, so that systemd starts it and is responsible for restarting it if it ever exits, but it's basically running continuously (but sleeping for most of the time).
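
Roughly, the loop I have in mind looks like the sketch below. It does not show the posted script itself: `run_forever` and its parameters are names I made up, and `task` stands in for whatever the script's main body does on each pass. The `sleep` and `max_iterations` parameters exist only so the loop can be exercised without actually waiting.

```python
import time


def run_forever(task, interval=30, sleep=time.sleep, max_iterations=None):
    """Call task() repeatedly, sleeping `interval` seconds after each pass.

    Returns the number of passes made (only reachable if max_iterations is set).
    """
    done = 0
    while max_iterations is None or done < max_iterations:
        try:
            task()
        except Exception as exc:
            # Assumption: one failed pass should not end the loop.
            # Remove this handler to let systemd restart the service instead.
            print(f"pass failed: {exc}")
        sleep(interval)
        done += 1
    return done
```

In the service, the call would be something like `run_forever(main_body)` with no `max_iterations`, where `main_body` is a placeholder for the script's existing logic wrapped in a function.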

Thanks again for sharing. Not only does it likely address the issue we're seeing, but since it's Python it's a good example for me to help improve my Python scripting.

**syntax53** · 02-01-2024, 16:03

Hello Tim. Yes, we still use it. There have been some minor tweaks since last posting. I have updated the pastebin.

Regards,
Matt

Throttling / Limiting number of notification alerts using Python + API
