We have devices that report alarms by populating an OID table whose indexes look random; at least, they do not correspond to specific indexes or alarm types. So we get synchronized tables that show the alarm ID, description, device, etc. No alarms, no entries.
I can report the alarms quite easily; I do an LLD on the table, and create items for each alarm, and a trigger for each item (actually I did 4 triggers of different severity based on alarm reported severity). All works great.
But clearing the alarm state -- not so much.
What I THOUGHT I could do was preserve the LLD item for a while (an hour) and check for no data in the recovery expression (nodata() is supposed to be calculated even for unsupported items), but it's not working.
Here is a specific example (sorry, the device name has to be hidden for NDA reasons):
Trigger expression (I have one of these for each severity level, e.g. 1 = critical)
{Template SNMP XXXXXXX Device:severity[{#INDEX}].last(#1)}=1 and
{Template SNMP XXXXXXX Device:time[{#INDEX}].count(#1)}>=-1
Recovery expression (this one is "true" when no data arrives for 60 seconds; the items above are polled every 30 seconds and discovery runs every 60 seconds):
{Template SNMP XXXXXX Device:severity[{#INDEX}].nodata(60)}=1
Note that the second item in the trigger expression is not really used; it is present only so that it populates the list of item value macros for use in the outbound email (which works nicely).
If I look in the monitor/trigger, I see this error on the exclamation point:
Cannot evaluate expression: "Cannot evaluate function "XXXXXXX:time[49654].count(#1)": item is not supported."
Now in an hour it all goes away, but I never get a recovery action triggered and associated email. The item is not supported -- I get that. And the trigger expression is unsupported, I get that also, that's expected. But the recovery expression should be supported.
Is that correct? Should a trigger not evaluate the recovery expression even if the trigger expression is unsupported?
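To make the question concrete, here is a toy model in Python of the two behaviours I am asking about. This is only a sketch and not Zabbix code: the names `Unsupported`, `next_state` and `recovery_independent` are made up for illustration, and I am not claiming either branch is what the server actually does.

```python
from typing import Callable


class Unsupported(Exception):
    """Hypothetical marker: an expression referenced an unsupported item."""


def next_state(in_problem: bool,
               problem: Callable[[], bool],
               recovery: Callable[[], bool],
               recovery_independent: bool) -> bool:
    """Return True if the trigger is in PROBLEM state after one evaluation.

    recovery_independent=True models what I expected: the recovery
    expression is evaluated even if the problem expression cannot be.
    recovery_independent=False models what I seem to observe: an
    unevaluable problem expression blocks recovery entirely.
    """
    try:
        problem_true = problem()
        problem_evaluable = True
    except Unsupported:
        problem_true = False
        problem_evaluable = False

    if not in_problem:
        # A trigger can only fire if its problem expression can be evaluated.
        return problem_evaluable and problem_true

    if not problem_evaluable and not recovery_independent:
        # Recovery is never looked at, so the trigger stays in PROBLEM.
        return True

    try:
        return not recovery()
    except Unsupported:
        return True


def problem_expr() -> bool:
    # Stands in for count(#1) on an item that has become unsupported.
    raise Unsupported("item is not supported")


def recovery_expr() -> bool:
    # Stands in for nodata(60)=1 once the alarm row has disappeared.
    return True


expected = next_state(True, problem_expr, recovery_expr, recovery_independent=True)
observed = next_state(True, problem_expr, recovery_expr, recovery_independent=False)
print("expected still in problem:", expected)
print("observed still in problem:", observed)
```

In this model the first case recovers and the second stays stuck in PROBLEM, which matches the fact that I never get the recovery email.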