interface flapping trigger

  • Vermizz
    Member
    • Oct 2022
    • 33

    #1

    interface flapping trigger

    Hi,
    I have the following problem. I monitor Cisco switches using this template:
    https://www.zabbix.com/integrations/cisco_snmp
    If an interface goes down, a trigger fires and I receive a notification. This works fine, but I'd like to achieve the following behavior when an interface starts flapping:
    1. Get one notification after the interface goes down.
    2. Get the next notification only after 5 minutes, if the problem occurs again. During that 5-minute period, I don't want the trigger to fire while the interface is flapping.

    Current trigger:
    Problem: {$IFCONTROL:"{#IFNAME}"}=1 and (last(/Generic Cisco/net.if.status[ifOperStatus.{#SNMPINDEX}])=2 and (last(/Generic Cisco/net.if.status[ifOperStatus.{#SNMPINDEX}],#1)<>last(/Generic Cisco/net.if.status[ifOperStatus.{#SNMPINDEX}],#2))=1)
    Recovery: last(/Generic Cisco/net.if.status[ifOperStatus.{#SNMPINDEX}])<>2

    Example on the interface:
    Problem: {$IFCONTROL:"Gi0/4"}=1 and (last(/testsw/net.if.status[ifOperStatus.10104])=2 and (last(/testsw/net.if.status[ifOperStatus.10104],#1)<>last(/testsw/net.if.status[ifOperStatus.10104],#2))=1)
    Recovery: last(/testsw/net.if.status[ifOperStatus.10104])<>2

    How can I change this trigger?
  • Answer selected by Vermizz at 05-08-2023, 09:10: post #4 by ISiroshtan.

    • Vermizz
      Member
      • Oct 2022
      • 33

      #2
      Can anyone help me with this problem?


      • BraT3C
        Junior Member
        • Jul 2023
        • 2

        #3
        Hi,

        I think that you need to modify your recovery expression.

        Code:
        last(/Generic Cisco/net.if.status[ifOperStatus.{#SNMPINDEX}],5m)<>2
        That way, once the operating status has been different from 2 for more than 5 minutes, the trigger will clear.
        I am no professional though.


        • ISiroshtan
          Senior Member
          • Nov 2019
          • 324

          #4
          BraT3C While the direction of thought is correct, the suggested implementation is flawed. last(,5m) will not check all values over 5 minutes. It would check what the last value was 5 minutes ago (and even that behavior was changed in later versions of Zabbix: last() now only accepts a count, like #5, and does not accept a time period, like 5m). Please refer to the documentation on supported functions.

          Vermizz Though, it's a bit tricky to just give you an exact function, as any hysteresis implies ignoring some data on the assumption that it's still related to the existing problem. Hence it depends heavily on what exact behavior you want to detect or ignore.
          I assume your switch complies with the MIB and the interface can report only 3 states: 1 - UP, 2 - DOWN and 3 - testing. I would ignore the existence of the testing state for the sake of simplifying monitoring. As such, you can use a recovery expression like max(/host/key,5m)=1 (the maximum over the window equals 1 only if every value in it was 1). This way the alert will fire when the host interface changes state from UP to DOWN, as it does now. But it will also remain open until the interface has been constantly in the UP state for the last 5 minutes. So if there is any kind of flapping, the alert will not flap but will just remain fired until the flapping is gone and the interface has been in a stable UP state for 5 minutes.
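
For illustration, the difference between the two recovery expressions can be sketched in Python. This is a simplified simulation, not Zabbix code: the one-sample-per-minute polling interval, the sample series, and the names problem_expr, single_sample, stable_up and count_problem_events are assumptions made up for this example, and the evaluation model (a problem closes only when the problem expression is false and the recovery expression is true) is a rough approximation of how a trigger with a recovery expression behaves.

```python
UP, DOWN = 1, 2  # ifOperStatus values assumed in the post; 3 (testing) is ignored


def problem_expr(status, previous):
    # Mirrors the thread's problem expression: last()=2 and last(#1)<>last(#2)
    return status == DOWN and status != previous


def single_sample(recent):
    # Original recovery: last()<>2, looks only at the newest value
    return recent[-1] != DOWN


def stable_up(recent):
    # Suggested recovery: max(...,5m)=1, true only if every value in the window is UP
    return max(recent) == UP


def count_problem_events(samples, recovery, window=5):
    """Count how often the trigger opens for a list of per-minute status samples."""
    events = 0
    in_problem = False
    for i, status in enumerate(samples):
        previous = samples[i - 1] if i > 0 else UP
        if not in_problem:
            if problem_expr(status, previous):
                in_problem = True
                events += 1
        else:
            # Simplified model: close only when the problem expression is false
            # and the recovery expression holds for the last `window` samples.
            recent = samples[max(0, i - window + 1): i + 1]
            if not problem_expr(status, previous) and recovery(recent):
                in_problem = False
    return events


# Hypothetical series: three short outages, then a stable link
flapping = [UP, DOWN, UP, DOWN, UP, DOWN, UP, UP, UP, UP, UP, UP]

print(count_problem_events(flapping, single_sample))
print(count_problem_events(flapping, stable_up))
```

With this series the single-sample recovery opens 3 separate problems, while the 5-sample window keeps one problem open until the link has been UP for five consecutive samples, so only 1 problem is counted.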


          • Vermizz
            Member
            • Oct 2022
            • 33

            #5
            Hi,
            Thank you for the advice.
            I used your example max(/host/key,5m)=1 and it works as I want.
