Environment: Zabbix server 2.0.4 on Ubuntu 12.04 LTS
I have a temperature alert on our core switch (it's a stacked switch and we're checking 4 modules). The trigger (reformatted for readability) looks like this:
( {TRIGGER.VALUE}=0
  &( {edg00s1ch:ciscoEnvMonTemperatureStatusValue.1006.last(0)}>52
    |{edg00s1ch:ciscoEnvMonTemperatureStatusValue.2006.last(0)}>50
    |{edg00s1ch:ciscoEnvMonTemperatureStatusValue.3006.last(0)}>44
    |{edg00s1ch:ciscoEnvMonTemperatureStatusValue.4006.last(0)}>41)
) |
( {TRIGGER.VALUE}=1
  &( {edg00s1ch:ciscoEnvMonTemperatureStatusValue.1006.count(#10,47,"lt")}>9
    &{edg00s1ch:ciscoEnvMonTemperatureStatusValue.2006.count(#10,45,"lt")}>9
    &{edg00s1ch:ciscoEnvMonTemperatureStatusValue.3006.count(#10,42,"lt")}>9
    &{edg00s1ch:ciscoEnvMonTemperatureStatusValue.4006.count(#10,37,"lt")}>9)
)
Update interval for the ciscoEnvMonTemperatureStatusValue variables is 120 seconds.
What I want to achieve is:
- to get one PROBLEM notification when the temperature of one of the modules exceeds the indicated value
- to get one OK notification when the temperatures of all modules have been back to normal for 10 polls
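To state the goal more precisely, here is a minimal Python sketch of the behaviour I am after. It is not Zabbix code, only a model of the intended hysteresis; the class name, the short module names c1 to c4, and the helper method are made up for illustration. The thresholds and the 10-poll window are the ones from the trigger above.

```python
from collections import deque

# Thresholds taken from the trigger; c1..c4 are illustrative names
# for the items ending in .1006, .2006, .3006 and .4006.
PROBLEM_ABOVE = {"c1": 52, "c2": 50, "c3": 44, "c4": 41}
RECOVER_BELOW = {"c1": 47, "c2": 45, "c3": 42, "c4": 37}
POLLS = 10


class TemperatureAlert:
    """Hypothetical model of the desired alert, not a Zabbix API."""

    def __init__(self):
        self.in_problem = False
        # Keep only the last POLLS readings per module.
        self.history = {name: deque(maxlen=POLLS) for name in PROBLEM_ABOVE}

    def poll(self, readings):
        """Feed one poll; return "PROBLEM", "OK", or None if nothing changes."""
        for name, value in readings.items():
            self.history[name].append(value)

        if not self.in_problem:
            # Fire once when any module exceeds its upper threshold.
            if any(readings[n] > PROBLEM_ABOVE[n] for n in readings):
                self.in_problem = True
                return "PROBLEM"
        else:
            # Clear once when every module stayed below its lower
            # threshold for the last POLLS readings.
            recovered = all(
                len(h) == POLLS and all(v < RECOVER_BELOW[n] for v in h)
                for n, h in self.history.items()
            )
            if recovered:
                self.in_problem = False
                return "OK"
        return None


alert = TemperatureAlert()
print(alert.poll({"c1": 48, "c2": 48, "c3": 45, "c4": 41}))  # PROBLEM
print(alert.poll({"c1": 48, "c2": 48, "c3": 44, "c4": 41}))  # None
```

In this model the second poll returns None: the alert stays in PROBLEM until all four modules have been below their lower thresholds for 10 consecutive polls.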
Unfortunately, my formula does not work the way I expected. It looks like the PROBLEM notification works more or less as planned, though I hadn't expected two PROBLEM alerts in the same minute (one for each item which exceeds the threshold).
But the OK notification doesn't work the way I expected. As soon as one of the temperatures falls below the threshold, the alert is cleared. (For example notifications, see below.) And then, of course, it fires again at the next check.
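To trace this against the example notifications, the expression can be evaluated by hand. The Python sketch below mirrors the formula as written; it assumes the usual Zabbix convention that a true expression means PROBLEM and a false one means OK. The function name and the c1 to c4 keys are illustrative only. The counts for 04:56 are not in the notifications, but since the latest value of every module is at or above its "lt" threshold, each count over the last 10 values can be at most 9, so 9 is used as an upper bound.

```python
def trigger_as_written(trigger_value, last, below_count):
    """Evaluate the trigger formula as posted (True = PROBLEM, False = OK).

    trigger_value: current trigger state, 0 = OK, 1 = PROBLEM
    last:          latest value per module, i.e. last(0)
    below_count:   result of count(#10, threshold, "lt") per module
    """
    first_branch = trigger_value == 0 and (
        last["c1"] > 52 or last["c2"] > 50
        or last["c3"] > 44 or last["c4"] > 41
    )
    second_branch = trigger_value == 1 and (
        below_count["c1"] > 9 and below_count["c2"] > 9
        and below_count["c3"] > 9 and below_count["c4"] > 9
    )
    return first_branch or second_branch


# 04:56, trigger currently in PROBLEM; counts are an upper bound (see above).
at_0456 = {"c1": 48, "c2": 48, "c3": 44, "c4": 41}
upper_bound = {"c1": 9, "c2": 9, "c3": 9, "c4": 9}
print(trigger_as_written(1, at_0456, upper_bound))  # False, i.e. OK

# 04:57, trigger back in OK; the counts do not matter in this branch.
at_0457 = {"c1": 49, "c2": 48, "c3": 45, "c4": 41}
print(trigger_as_written(0, at_0457, upper_bound))  # True, i.e. PROBLEM
```

Under these assumptions the model reproduces the sequence in the notifications: OK at 04:56, PROBLEM again at 04:57.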
Obviously, I'm going about this the wrong way. Can anyone point me in the right direction?
Kind regards,
Herta
==== example notifications
Zabbix sent a first PROBLEM notification at 04:55 with
Item values:
1. Temperatuur c1 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.1006) : 48
2. Temperatuur c2 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.2006) : 48
3. Temperatuur c3 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.3006) : 45
A second PROBLEM notification at 04:55 with
Item values:
1. Temperatuur c1 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.1006) : 48
2. Temperatuur c2 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.2006) : 48
3. Temperatuur c3 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.3006) : 45
4. Temperatuur c4 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.4006) : 41
An OK notification at 04:56 with
Item values:
1. Temperatuur c1 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.1006) : 48
2. Temperatuur c2 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.2006) : 48
3. Temperatuur c3 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.3006) : 44
A second OK notification at 04:56 with
Item values:
1. Temperatuur c1 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.1006) : 48
2. Temperatuur c2 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.2006) : 48
3. Temperatuur c3 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.3006) : 44
4. Temperatuur c4 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.4006) : 41
A PROBLEM notification at 04:57 with
Item values:
1. Temperatuur c1 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.1006) : 49
2. Temperatuur c2 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.2006) : 48
3. Temperatuur c3 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.3006) : 45
A second PROBLEM notification at 04:57 with
Item values:
1. Temperatuur c1 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.1006) : 49
2. Temperatuur c2 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.2006) : 48
3. Temperatuur c3 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.3006) : 45
4. Temperatuur c4 (edg00s1ch:ciscoEnvMonTemperatureStatusValue.4006) : 41