Long-event errors and multiple alerts

  • CeeEss
    Senior Member
    Zabbix Certified Specialist
    • Nov 2007
    • 103

    #1

    Long-event errors and multiple alerts

    Just finished "reading" over 1400 email alerts this morning. The issue is with 3 hosts that have a particular non-fatal hardware fault that we poll via SNMP at 30-minute intervals. Since these systems' problems may take days, if not weeks, to resolve, Zabbix continues to flood us with alerts (at less than 30-minute intervals), resulting in hundreds of messages per day. I can ignore them, but nobody else in the recipient group is too impressed. I would like to adjust alerts so that the error alert fires on first occurrence (where someone will raise a ticket), but then again only after 24 (or more) hours. Can't seem to find a way to do this - does anyone have an idea?

    tia!
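
To make the requested behaviour concrete, here is a minimal Python sketch of the policy described above: alert on the first occurrence, then stay quiet until 24 hours have passed. It is an illustration only, not Zabbix configuration or Zabbix code; `RepeatThrottle`, `count_alerts` and the key `"host1:hw-fault"` are hypothetical names made up for the example.

```python
from datetime import datetime, timedelta
from typing import Dict


class RepeatThrottle:
    """Allow one alert per problem key, then suppress repeats for `repeat_after`."""

    def __init__(self, repeat_after: timedelta = timedelta(hours=24)) -> None:
        self.repeat_after = repeat_after
        self._last_sent: Dict[str, datetime] = {}

    def should_alert(self, key: str, now: datetime) -> bool:
        last = self._last_sent.get(key)
        # Alert if this problem was never reported, or the quiet period is over.
        if last is None or now - last >= self.repeat_after:
            self._last_sent[key] = now
            return True
        return False

    def resolve(self, key: str) -> None:
        # Forget the problem so its next occurrence alerts immediately.
        self._last_sent.pop(key, None)


def count_alerts(hours: int, poll_minutes: int = 30) -> int:
    """Count alerts for a fault that stays active for `hours`, polled every 30 min."""
    throttle = RepeatThrottle()
    start = datetime(2010, 3, 1)  # arbitrary start time for the simulation
    sent = 0
    for i in range(hours * 60 // poll_minutes):
        now = start + timedelta(minutes=i * poll_minutes)
        if throttle.should_alert("host1:hw-fault", now):
            sent += 1
    return sent
```

Under these assumptions a fault that stays active for three days produces three alerts (one per 24 hours) instead of one per poll.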
  • danrog
    Senior Member
    • Sep 2009
    • 164

    #2
    You can force users to acknowledge the alarms and actually comment on the alert. You would simply have to change the Action Operation to include the condition, Event acknowledged = "Not Ack". Once the alert is ack'd, you will never get an email again (unless you have escalations enabled or the status flaps). You can also include a URL in the alert to make it "easier" for them to ack the alert.

    • CeeEss
      Senior Member
      Zabbix Certified Specialist
      • Nov 2007
      • 103

      #3
      Originally posted by danrog
      You can force users to acknowledge the alarms and actually comment on the alert. You would simply have to change the Action Operation to include the condition, Event acknowledged = "Not Ack". Once the alert is ack'd, you will never get an email again (unless you have escalations enabled or the status flaps). You can also include a URL in the alert to make it "easier" for them to ack the alert.
      Setting Event acknowledged = "Not Ack" had a pretty dramatic effect on the number of alerts (thank you!), but hasn't kept them from coming in every poll_interval. The rate they're coming in makes it impossible to acknowledge by hand. No sooner do you ack the trigger than the item gets polled again 30 mins later, and since the error condition still exists, the trigger fires all over again.

      • CeeEss
        Senior Member
        Zabbix Certified Specialist
        • Nov 2007
        • 103

        #4
        Originally posted by danrog
        You can force users to acknowledge the alarms and actually comment on the alert. <snip>.
        BTW: how do you 'force' users to do anything? We have things called laws in this country.

        • danrog
          Senior Member
          • Sep 2009
          • 164

          #5
          Originally posted by CeeEss
          BTW: how do you 'force' users to do anything? We have things called laws in this country.
          Heh, very true. I guess by 'force', I mean 'if they don't want senior management to get upset and force HR to fire them'....

          • danrog
            Senior Member
            • Sep 2009
            • 164

            #6
            Originally posted by CeeEss
            Setting Event acknowledged = "Not Ack" had a pretty dramatic effect on the number of alerts (thank you!), but hasn't kept them from coming in every poll_interval. The rate they're coming in makes it impossible to acknowledge by hand. No sooner do you ack the trigger than the item gets polled again 30 mins later, and since the error condition still exists, the trigger fires all over again.

            Interesting. We do exactly the same thing as far as hardware polling goes. Every 30 mins, Zabbix checks status via SNMP; since the polled value hasn't changed, our trigger hasn't changed status and stays ack'd, so no more emails are sent. You can use the bulk ack on the trigger page to make it easier if they come in all at once.

            What does your trigger look like as well as the Action Conditions?
            Last edited by danrog; 03-03-2010, 16:30.

            • CeeEss
              Senior Member
              Zabbix Certified Specialist
              • Nov 2007
              • 103

              #7
              Originally posted by CeeEss
              Thanks for your interest, Danrog. Should be a screencap attached somewhere in this thread.

              - cee
              Think I may have found it: I had the trigger set as "Normal + Multiple True events". That sounds pretty much like what I'm seeing. Fingers crossed ...
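
The difference between the two trigger modes can be sketched with a small Python simulation. This is a simplified model under stated assumptions, not Zabbix source code: it assumes a "Normal" trigger creates an event only when its state changes, while "Multiple True events" creates a new event on every evaluation that is still a problem. `Event`, `generate_events` and `emails_sent` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    poll: int                   # index of the poll that produced the event
    value: str                  # "PROBLEM" or "OK"
    acknowledged: bool = False  # new events always start unacknowledged


def generate_events(values: List[str], multiple: bool) -> List[Event]:
    """Turn a sequence of trigger evaluations into events (simplified model)."""
    events: List[Event] = []
    previous = "OK"
    for poll, value in enumerate(values):
        changed = value != previous
        # Assumption: in "multiple" mode every PROBLEM evaluation is a new event.
        repeated_problem = multiple and value == "PROBLEM"
        if changed or repeated_problem:
            events.append(Event(poll, value))
        previous = value
    return events


def emails_sent(events: List[Event]) -> int:
    # Action condition from the thread: only unacknowledged PROBLEM events notify.
    return sum(1 for e in events if e.value == "PROBLEM" and not e.acknowledged)


one_day = ["PROBLEM"] * 48  # 30-minute polls for 24 hours, fault never clears
normal = generate_events(one_day, multiple=False)
flood = generate_events(one_day, multiple=True)
```

In this model the normal mode yields a single event that stays acknowledged once someone acks it, whereas the multiple mode yields a fresh, unacknowledged event on every poll, which would explain why the "Not Ack" condition alone did not stop the emails.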
