**mcintoshj** · 15-11-2013, 17:11

Maintenance status not in "maintenance"
Trigger value = "PROBLEM"
Trigger severity >= "Average"
Trigger name like "RANDOM_STRING"
Host = "server_in_cluster"

I have an action configured with the conditions above. It has no recovery message (we generally don't care about recoveries) and a single operation: sending an email.
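To make the conditions concrete, here is a rough Python sketch of how I read them, assuming all five are combined with AND. The `Event` class, the `action_matches` function, and the sample trigger name are hypothetical and only for illustration; none of this is Zabbix code.

```python
from dataclasses import dataclass

# Zabbix trigger severities in ascending order.
SEVERITIES = ["Not classified", "Information", "Warning",
              "Average", "High", "Disaster"]


@dataclass
class Event:
    """Hypothetical stand-in for the fields the action conditions look at."""
    in_maintenance: bool
    trigger_value: str   # "PROBLEM" or "OK"
    severity: str        # one of SEVERITIES
    trigger_name: str
    host: str


def action_matches(event: Event) -> bool:
    """Return True only if every condition holds (AND is an assumption)."""
    return (
        not event.in_maintenance
        and event.trigger_value == "PROBLEM"
        and SEVERITIES.index(event.severity) >= SEVERITIES.index("Average")
        and "RANDOM_STRING" in event.trigger_name
        and event.host == "server_in_cluster"
    )


problem = Event(False, "PROBLEM", "High",
                "RANDOM_STRING queue has no consumers", "server_in_cluster")
ok = Event(False, "OK", "High",
           "RANDOM_STRING queue has no consumers", "server_in_cluster")

print(action_matches(problem))  # True
print(action_matches(ok))       # False
```

The sketch only models the conditions themselves. Whether Zabbix checks recovery messages against the same conditions is part of what I'm trying to find out.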

I've noticed two things. First, I never received recovery emails, even when I originally had the action set up with a recovery message. Are recovery emails tied to the host or to the trigger? I'm hoping the trigger: on a host with lots of different triggers, some triggers go to some people and others go to other people, and a server can be in a good state while some of its triggers are in a bad state. Either way, there was no sign of those emails.

Second, I'm getting a TON of messages on a repeating basis. The event list shows:

Event list [previous 20]

| Time | Status | Duration | Age | Ack | Actions |
|---|---|---|---|---|---|
| 14 Nov 2013 11:58:15 | OK | 20h 46m 15s | 20h 46m 15s | No | |
| 14 Nov 2013 11:43:14 | PROBLEM | 15m 1s | 21h 1m 16s | Yes (1) | 13 |
| 13 Nov 2013 16:30:14 | OK | 19h 13m | 1d 16h 14m | Yes (1) | 40 |
| 13 Nov 2013 16:24:18 | PROBLEM | 5m 56s | 1d 16h 20m | Yes (1) | 2 |
| 13 Nov 2013 16:23:19 | OK | 59s | 1d 16h 21m | Yes (1) | 2 |
| 11 Nov 2013 11:31:44 | PROBLEM | 2d 4h 51m | 3d 21h 12m | Yes (1) | |

So the trigger is now "OK", but I'm still getting emails for the PROBLEM event from 14 Nov 2013 11:43:14 (the row with 13 in the Actions column) every operation interval. Is there something I'm missing here?

Thanks for any advice/help! This is driving me nuts right now! Here's the expression for the trigger:
{server_in_cluster:rabbitmq[exchange,queue_consumers,queue.name.unknown.errors].max(120)}=0 & {server_in_cluster:rabbitmq[exchange,queue_msgs,queue.name.unknown.errors].max(120)}#0
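
In case the syntax is unclear: in this expression `&` means AND and `#` means "not equal", and `max(120)` is the highest value seen in the last 120 seconds. A rough Python sketch of the logic follows. The function name and the sample lists are hypothetical, and it assumes each list holds at least one value from that window.

```python
def trigger_is_problem(consumer_samples, msg_samples):
    """Mirror the trigger expression over the last 120 s of values.

    consumer_samples: queue_consumers values from the window (non-empty)
    msg_samples:      queue_msgs values from the window (non-empty)
    """
    no_consumers = max(consumer_samples) == 0   # ...max(120)}=0
    has_messages = max(msg_samples) != 0        # ...max(120)}#0
    return no_consumers and has_messages        # joined with &


print(trigger_is_problem([0, 0], [3, 5]))  # True: messages queued, nobody consuming
print(trigger_is_problem([0, 2], [3, 5]))  # False: a consumer was seen in the window
print(trigger_is_problem([0, 0], [0, 0]))  # False: queue stayed empty
```

So the trigger should only be in PROBLEM while the queue has had messages and no consumers at all for the whole two-minute window.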

Actions still firing when trigger is ok status
