Endless notification after edit of trigger in active escalation

  • Slash
    Member
    • May 2011
    • 64

    #1

    Hello everyone,

    I have a very annoying issue with Zabbix. I first configured several triggers to warn me when some values are higher than a specific threshold.

    I then noticed that my threshold was too low because I was getting alerts all the time, so I changed the threshold while the trigger was in the "PROBLEM" state.

    The alert immediately disappeared from the dashboard, as expected, but the notification is still being sent. Since I have an action configured with an escalation that sends an e-mail every 12 hours while an issue is still in the problem state, my team receives an e-mail like this every 12 hours:

    IO-drive average temperature last 15m > 60° C on srv-157: PROBLEM
    ioDrive internal temperature: 50.2 °C

    Additional information:
    SYSADMIN ALERT
    Host: srv-157
    DateTimeNOW: 2012.12.26-01:57:40
    EventDateTime: 2012.10.15-14:32:14
    EventAge: 71d 12h 25m
    ItemName: ioDrive internal temperature
    ItemLastvalue: 50.2 °C
    Severity: Average

    Escalation history:
    Problem started: 2012.10.15 14:32:14 Age: 71d 12h 25m
    (I truncated the end of the e-mail; it contains the list of all the e-mails previously sent.)

    As for the monitored item, its maximum value ever is 56.1 °C, so it is impossible for the threshold to be exceeded. Data is collected normally (the item is named "io.internal_temperature" and is collected via a UserParameter).

    Also, the trigger expression is: {srv-157:io.internal_temperature.avg(15m)}>60.

    I talked about it on IRC, and the answer was that "apparently it is not a good idea to edit expressions for triggers involved in active escalations". But here I had no choice...

    Is there any way I can solve this issue? I have 3 notifications stuck like that. I looked in the database, but I can't find what to update/change/delete to kill the notifications.

    EDIT: I forgot to mention the Zabbix version: 2.0.4. The issue has been present since 2.0.2; I have upgraded Zabbix from source twice since it began.
    Last edited by Slash; 28-12-2012, 11:42.
  • Slash
    Member
    • May 2011
    • 64

    #2
    I just tried this; hopefully it will solve the issue:

    Code:
    zabbix=> select * from escalations;
     escalationid | actionid | triggerid | eventid | r_eventid | nextcheck  | esc_step | status 
    --------------+----------+-----------+---------+-----------+------------+----------+--------
              280 |        8 |     13731 |   20052 |           | 1350036302 |        1 |      2
              283 |        8 |     13730 |   20053 |           | 1350036326 |        1 |      2
              288 |        5 |     13732 |   20054 |           | 1356556250 |      151 |      0
              287 |        7 |     13732 |   20054 |           | 1356556254 |      151 |      0
              309 |        5 |     13723 |   34081 |           | 1356569864 |      145 |      0
              308 |        7 |     13723 |   34081 |           | 1356569868 |      145 |      0
    (6 rows)
    
    zabbix=> delete from escalations where esc_step = '151';
    DELETE 2
    zabbix=> delete from escalations where esc_step = '145';
    DELETE 2
    zabbix=> select * from escalations;
     escalationid | actionid | triggerid | eventid | r_eventid | nextcheck  | esc_step | status 
    --------------+----------+-----------+---------+-----------+------------+----------+--------
              280 |        8 |     13731 |   20052 |           | 1350036302 |        1 |      2
              283 |        8 |     13730 |   20053 |           | 1350036326 |        1 |      2
    (2 rows)
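
    A slightly more cautious variant of the cleanup above, as a sketch only. It assumes the same PostgreSQL database and the escalations table layout shown in the output, and it targets the four stuck rows by their escalationid (288, 287, 309, 308) rather than by esc_step, since esc_step presumably advances with every escalation step and may no longer match by the time the statement runs. Wrapping the statements in a transaction lets you check the reported row count before committing.

    ```sql
    BEGIN;

    -- Inspect the candidate rows first; to_timestamp() turns the
    -- epoch value in nextcheck into a readable timestamp.
    SELECT escalationid, actionid, triggerid, eventid, esc_step, status,
           to_timestamp(nextcheck) AS nextcheck_time
      FROM escalations
     WHERE escalationid IN (288, 287, 309, 308);

    -- Remove only those rows; psql should report "DELETE 4".
    DELETE FROM escalations
     WHERE escalationid IN (288, 287, 309, 308);

    -- Run COMMIT if the count matches, ROLLBACK otherwise.
    COMMIT;
    ```

    Selecting by primary key leaves the two unrelated rows (escalationid 280 and 283) untouched even if other escalations happen to reach the same step number.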

    • Rawlings
      Junior Member
      • Sep 2012
      • 24

      #3
      What's the update frequency on your item?

      Use the average function with caution.

      • Slash
        Member
        • May 2011
        • 64

        #4
        The update frequency is 60 seconds on the items in question.

        • Slash
          Member
          • May 2011
          • 64

          #5
          After 2 days without an alert, I can confirm that removing the rows from the "escalations" table solved the issue.
