Hi, I have BGP monitoring set up as follows:
Problem Trigger
================
{snmptrap["Peer: {#PEER_IP} has transitioned from \w+ to \w+"].str(from Established)}=1
or
(not({bgpPeerOperationalStatus[{#PEER_IP}].last()}=6) and {bgpPeerAdminStatus[{#PEER_IP}].last()}=2)
Meaning: IF a trap states that BGP went down, OR if the last poll says the BGP state is NOT 6 (Established) and the Admin State is 2 (start, meaning it isn't manually shut down), THEN trigger.
Recovery condition
====================
Recovery: {snmptrap["Peer: {#PEER_IP} has transitioned from \w+ to \w+"].str(to Established)}=1
or
({bgpPeerOperationalStatus[{#PEER_IP}].last()}=6)
or
({bgpPeerAdminStatus[{#PEER_IP}].last()}=1)
Meaning: IF a trap says the peer is up, OR if the last poll says the BGP operational state is 6 (Established), OR if the Admin State is 1 (stop, meaning it is manually shut down), THEN recover.
The problem triggers are working fine and Zabbix alerts appropriately: if I shut down a session, the devices on both ends of the BGP session trigger as being down.
But the recovery fails. On the next polling cycle, Zabbix should see that one of the two BGP sessions is manually shut down (i.e. Admin State is 1). This should recover the problem on that device. But it doesn't.
I have run manual SNMP walks and checked the latest data in Zabbix itself, and it all matches. I have even tested the expression using the expression tester. But Zabbix just will not recover.
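To make the logic easier to reason about, here is a rough Python model of the two expressions evaluated for the scenario above. It is only a sketch: the peer IP, the trap text and the polled values are made-up examples, and str() is modeled as a plain substring match on the last trap value. One thing it shows is that the trap term of the problem expression stays true for as long as the last trap received is the "from Established" one. If I read the Zabbix documentation correctly (worth verifying), a recovery expression only resolves the problem once the problem expression itself evaluates to FALSE, which may be relevant here.

```python
ESTABLISHED = 6  # operational state "established"
ADMIN_STOP = 1   # admin status "stop" (manually shut down)
ADMIN_START = 2  # admin status "start"


def problem(last_trap: str, oper: int, admin: int) -> bool:
    """Model of the problem expression: trap term OR polled term."""
    trap_down = "from Established" in last_trap  # str(from Established)=1
    return trap_down or (oper != ESTABLISHED and admin == ADMIN_START)


def recovery(last_trap: str, oper: int, admin: int) -> bool:
    """Model of the recovery expression: trap term OR either polled term."""
    trap_up = "to Established" in last_trap  # str(to Established)=1
    return trap_up or oper == ESTABLISHED or admin == ADMIN_STOP


# Hypothetical values after manually shutting down the session on this device:
# the last trap is still the "down" trap, and the poll shows Idle (1) / stop (1).
last_trap = "Peer: 192.0.2.1 has transitioned from Established to Idle"
oper, admin = 1, ADMIN_STOP

print("problem expression :", problem(last_trap, oper, admin))
print("recovery expression:", recovery(last_trap, oper, admin))
```

In this model both expressions come out true at the same time, because the polled half of the problem expression is false but the trap half is not.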
Can anyone assist?