I shut down Zabbix to stop the alerts! Help!

  • MrKen
    Senior Member
    • Oct 2008
    • 652

    #1

    I shut down Zabbix to stop the alerts! Help!

    Eight days ago, a flapping snmptrap caused the Zabbix Server to run out of disk space and crash. The server log had lots of SQL errors about not being able to find various temporary tables - the database was now corrupted, and Zabbix slowed to a snail's pace.

    I dumped the database, re-imported it, then ran mysqlcheck to check, repair, and optimize the tables.

    Yesterday at 2:00 pm I restarted Zabbix Server. It was still very sluggish, and the queue had over 4000 items (all types - agentd, snmp, simple checks), that's more than 50% of all my items. Then suddenly the alerts started - and haven't stopped. I disabled the offending Trigger, and even disabled *all* alerts, but to no avail!
    Interestingly, all the alerts are dated from 8 days ago.

    Last night I turned off my phone because the alerts kept coming. This morning my email has over 5000 new mails.

    Today, I shut down Zabbix. The alerts stopped (after a while).

    Tomorrow, what to do? I know that as soon as I restart Zabbix the alerts will restart too. Is there somewhere I can remove that queue of alerts/mail?

    Alternatively, could I use mysql to remove all data since an hour or so before the crash? Or maybe I need to DROP the database and start fresh (not a good idea)?

    Open to suggestions,

    MrKen
    Disclaimer: All of the above is pure speculation.
  • skogan
    Member
    • Nov 2007
    • 70

    #2
    Three Inconvenient Truths that a Zabbix User Usually Learns the Hard Way:

    1. The short item update interval dogma stinks.

    Solution: Increase the item update interval to something more sane than the (often) recommended "as frequent as possible". I'd recommend at least 5 minutes - that's 10 times the default of 30 seconds.

    2. Zabbix notification system is strikingly naive.

    Solution: Implement an external notification system with ability to schedule downtimes and set up notification delays. It's not an easy task, but completely vital if you want to have any sleep at night.
    BTW, the min() and max() functions, so loved by the Zabbix Propaganda Team, are NOT ENOUGH to let one sleep at night - with the current state of affairs it is virtually IMPOSSIBLE to effectively suppress flapping alarms.

    3. Soft state triggers are a necessity and must be made a priority in development.

    Solution: None at the moment - the current implementation of the agent protocol does not allow for a simple solution of this problem, especially in an active agent setup. Also, Zabbix developers' extreme uninterest in this functionality doesn't really help.
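
    To put rough numbers on point 1, here is a back-of-the-envelope sketch in Python. It assumes the load is simply the item count divided by the update interval; the item count of 3000 is a made-up figure for illustration, not one taken from this thread.

```python
def checks_per_second(item_count: int, interval_seconds: float) -> float:
    """Average number of item checks the server has to process each second."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return item_count / interval_seconds


# ITEMS is a hypothetical item count, chosen only to make the arithmetic easy.
ITEMS = 3000
for interval in (30, 300):
    rate = checks_per_second(ITEMS, interval)
    print(f"{interval:>3}s interval: {rate:.1f} checks/s")
```

    Under that assumption, going from a 30-second to a 5-minute interval cuts the number of checks per second by a factor of ten.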

    If you want, I can give you the code for an external notification system that I developed - you may be able to adapt it to your needs. It allows downtime scheduling through the IT services, working hours scheduling and delayed notifications - which effectively deal with unwanted flapping alarms. Send me a private message if you're interested.
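
    The delayed-notification idea can be sketched roughly as follows. This is not the code offered above (which was never posted in this thread), just a minimal illustration: the class name, method name, and the 300-second delay are all hypothetical. A problem only produces a notification once it has persisted for the whole delay, and a recovery in the meantime cancels it.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class DelayedNotifier:
    """Hypothetical helper: notify only when a problem outlives delay_seconds."""
    delay_seconds: float
    _first_seen: Dict[str, float] = field(default_factory=dict)
    _notified: Set[str] = field(default_factory=set)

    def observe(self, trigger: str, is_problem: bool, now: float) -> bool:
        """Record a trigger state; return True if a notification is due now."""
        if not is_problem:
            # A recovery cancels any pending notification for this trigger.
            self._first_seen.pop(trigger, None)
            self._notified.discard(trigger)
            return False
        first_seen = self._first_seen.setdefault(trigger, now)
        if trigger in self._notified:
            return False  # already notified for this problem episode
        if now - first_seen >= self.delay_seconds:
            self._notified.add(trigger)
            return True
        return False


notifier = DelayedNotifier(delay_seconds=300)
# A trigger flapping every minute never stays in problem state long enough.
flapping = [notifier.observe("snmptrap", state, t)
            for t, state in [(0, True), (60, False), (120, True), (180, False)]]
# A problem that persists for the full delay is reported exactly once.
sustained = [notifier.observe("disk", True, t) for t in (0, 300, 360)]
print(flapping, sustained)
```

    The trade-off is that every real alarm arrives late by the chosen delay, so the delay has to be shorter than the response time you can tolerate.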

    • doctorbal82
      doctorbal82 commented
      Skogan, I'd be very interested in the code you stated.

      Please PM me.

      I know this is an old post but maybe you're still around :-).

    • ITIC007
      ITIC007 commented
      can you please share the code! Interested!
  • MrKen
    Senior Member
    • Oct 2008
    • 652

    #3
    Yesterday I restarted the Zabbix server and sure enough the alerts started coming shortly after.

    It is now 8.50 am and I have 7315 email alerts received since yesterday's restart. Oh, and they're still coming!

    How to stop the cached alerts? Where is the Emergency Stop button?

    Surely there must be some way to stop this?

    • igor
      ZABBIX Support Specialist
      • Mar 2009
      • 40

      #4
      MrKen, in order to stop the alerts you need to:
      1. stop zabbix_server
      2. Execute the following SQL statements:

      update alerts set status=2,error='' where status=0 and alerttype=0;
      delete from escalations;

      3. Then in the Zabbix frontend you should disable all actions for which there is operation type="send message"
      4. start zabbix_server
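
      For anyone who wants to rehearse step 2 before touching a production database, here is a small Python sketch that runs the same two statements against a throwaway in-memory SQLite database. The toy schema is an assumption: it contains only the columns named in the statements above plus made-up id columns, and it is not the real Zabbix schema. The function name is hypothetical too.

```python
import sqlite3


def silence_pending_alerts(conn: sqlite3.Connection) -> int:
    """Mark unsent message alerts as failed and clear escalations.

    Returns the number of alert rows changed. Both statements run in one
    transaction, so an error leaves the tables untouched.
    """
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "UPDATE alerts SET status=2, error='' WHERE status=0 AND alerttype=0"
        )
        changed = cur.rowcount
        conn.execute("DELETE FROM escalations")
    return changed


# Toy stand-in tables (assumed layout, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE alerts (alertid INTEGER PRIMARY KEY, status INTEGER, "
    "alerttype INTEGER, error TEXT)"
)
conn.execute("CREATE TABLE escalations (escalationid INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO alerts (status, alerttype, error) VALUES (?, ?, ?)",
    [(0, 0, ""), (0, 0, ""), (1, 0, ""), (0, 1, "")],
)
conn.execute("INSERT INTO escalations DEFAULT VALUES")
conn.commit()

# Count what would be affected before changing anything.
pending = conn.execute(
    "SELECT COUNT(*) FROM alerts WHERE status=0 AND alerttype=0"
).fetchone()[0]
print("pending before:", pending)
print("rows changed:", silence_pending_alerts(conn))
```

      Counting the matching rows first gives an idea of how long the UPDATE may run on the real table; as in the steps above, zabbix_server should be stopped while the statements are executed.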

      • MrKen
        Senior Member
        • Oct 2008
        • 652

        #5
        Thanks igor,

        I'd like to say that it worked, but I'm not really sure yet. The SQL definitely yielded a result, look at this:

        mysql>
        mysql> update alerts set status=2,error='' where status=0 and alerttype=0;
        Query OK, 1594078 rows affected (4 min 30.37 sec)
        Rows matched: 1594078 Changed: 1594078 Warnings: 0

        That's a lot of matching rows!

        I was very interested to check the Administration --> Notifications tab to see how many alerts had been sent, but the page won't load (same for Administration --> Audit). I'm wondering whether my Alerts table is slightly too heavy.
        Alerts table = 1.9 GB
        History table = 5.2 GB

        Compare that to my Zabbix 1.4.2 box:
        Alerts table = 14 MB
        History table = 15 GB

        Now let's say that the most important thing is to get this server (1.6.5) back up and running, and I'm not interested in seeing the past alerts, would it be right/safe to do this:

        DELETE FROM alerts;

        RSVP
        Thanks again

        MrKen

        • igor
          ZABBIX Support Specialist
          • Mar 2009
          • 40

          #6
          Hi!

          Yes, if you do not want to see past alerts you can execute the SQL statement "DELETE FROM alerts;" to empty the alerts table.
          The SQL statement "update alerts set status=2,error='' where status=0 and alerttype=0;" changed status of all alerts from "not sent" to "failed" in order to stop the Zabbix alerts.

          • MrKen
            Senior Member
            • Oct 2008
            • 652

            #7
            Great stuff yuri!

            Deleted everything from alerts, then optimized the alerts table, it's now only 192K.

            Frontend all appears to be running smoothly again.

            Thanks again.

            MrKen
