Zabbix Alerts While Updating

  • idave
    Junior Member
    • Jul 2018
    • 4

    #1

    Zabbix Alerts While Updating

    Hello all!
    I am trying to figure out the best way to manage Zabbix alerts while Zabbix is updating.
    What happens is that we run automatic updates (via Docker + Watchtower) of our Zabbix install around 2am. I then set up a maintenance period for every day around that time, and I thought I'd be good to go.
    Unfortunately, when Zabbix updates itself, we get a flood of hundreds of alerts, all saying "Zabbix agent unreachable for more than 5 minutes".

    I am struggling to understand how this is meant to work, because:
    - If we set up the maintenance window "With data collection", then we get the alerts, as per the documentation. But still, it is the server that is going down, not the agents.
    - If we set up the maintenance window "Without data collection", then we still get the alerts, because when the maintenance window ends, and Zabbix starts collecting data again, it hasn't heard from the agents for more than 5 minutes.
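For readers who want to script the two variants described above, here is a minimal sketch of a request body for the Zabbix API method `maintenance.create`, with a daily period starting at 2am. The function name, the group ID, and the timestamps are made up for illustration. The field names follow my understanding of recent API versions (older versions use `groupids` instead of `groups`, and authentication is handled differently across versions), so check the API reference for your release before relying on it.

```python
import json


def daily_maintenance_payload(name, group_id, start_hour, duration_minutes,
                              active_since, active_till, collect_data=True):
    """Build (but do not send) a JSON-RPC body for maintenance.create.

    Hypothetical helper for illustration. Authentication is omitted on
    purpose because it differs between Zabbix versions.
    """
    return {
        "jsonrpc": "2.0",
        "method": "maintenance.create",
        "params": {
            "name": name,
            # Unix timestamps bounding when the maintenance definition is active.
            "active_since": active_since,
            "active_till": active_till,
            # 0 = with data collection, 1 = without data collection.
            "maintenance_type": 0 if collect_data else 1,
            # Assumed field name; older API versions expect "groupids".
            "groups": [{"groupid": group_id}],
            "timeperiods": [{
                "timeperiod_type": 2,            # 2 = daily
                "every": 1,                      # every day
                "start_time": start_hour * 3600,  # seconds after midnight
                "period": duration_minutes * 60,  # length in seconds
            }],
        },
        "id": 1,
    }


if __name__ == "__main__":
    body = daily_maintenance_payload(
        "Nightly Watchtower update", "2", start_hour=2, duration_minutes=30,
        active_since=1700000000, active_till=1800000000, collect_data=False)
    print(json.dumps(body, indent=2))
```

Switching `collect_data` between `True` and `False` gives the two maintenance types compared in the list above; neither changes the fact that it is the server, not the agents, that goes away.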

    I read somewhere on these forums that the solution is to change all triggers to add nodata(), but that would mean changing hundreds of triggers everywhere in Zabbix, and that would take *a lot* of effort.
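To make the "no data for more than 5 minutes" reasoning concrete, here is a toy model of such a check. It is only a simplification of the behaviour described in this post, not Zabbix's actual evaluation logic, and the function name and timestamps are invented for the example.

```python
from datetime import datetime, timedelta

# Threshold taken from the alert text: "unreachable for more than 5 minutes".
NODATA_THRESHOLD = timedelta(minutes=5)


def would_alert(last_value_at, evaluated_at, threshold=NODATA_THRESHOLD):
    """Toy check: fire if the newest stored value is older than the threshold."""
    return evaluated_at - last_value_at > threshold


# Assumed timeline: the server is down for 10 minutes during the update.
server_stopped = datetime(2019, 1, 1, 2, 0)
server_back = server_stopped + timedelta(minutes=10)

# The last value stored before shutdown, and the first check after restart.
last_value_before_stop = server_stopped - timedelta(seconds=30)
first_check = server_back + timedelta(seconds=30)

# The agent never went away, but the newest stored value is 11 minutes old.
print(would_alert(last_value_before_stop, first_check))

# Once a fresh value has been stored, the same check stays quiet.
fresh_value = server_back + timedelta(minutes=1)
print(would_alert(fresh_value, fresh_value + timedelta(seconds=30)))
```

Under this model the first check after the restart fires for every host at once, which matches the burst of alerts described above, and it clears as soon as each agent reports again.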

    Is it really the case that there isn't a better solution, or am I missing something here?
  • kloczek
    Senior Member
    • Jun 2006
    • 1771

    #2
    You did not say anything about how this maintenance is done.
    If it is done by shutting down each Docker instance one by one and then starting a new one with the updates, the alert about the Zabbix agent is caused by killing the system that is being monitored.
    If the upgrade is done by fiddling inside a running Docker instance, that is not the usual way of dealing with Docker instances, and even then the agent does not need to be shut down for longer than a few seconds.
    http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
    https://kloczek.wordpress.com/
    zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
    My zabbix templates https://github.com/kloczek/zabbix-templates


    • idave
      Junior Member
      • Jul 2018
      • 4

      #3
      Originally posted by kloczek
      You did not say anything about how this maintenance is done.
      If it is done by shutting down each Docker instance one by one and then starting a new one with the updates, the alert about the Zabbix agent is caused by killing the system that is being monitored.
      If the upgrade is done by fiddling inside a running Docker instance, that is not the usual way of dealing with Docker instances, and even then the agent does not need to be shut down for longer than a few seconds.
      I believe this is done by shutting down instances and starting new ones. But regardless of how it's done and the specifics of Docker, what's the recommended procedure to shut down Zabbix (possibly for more than 10 minutes) and restart it without seeing 1000 alerts? It is entirely possible that we are doing it wrong, but if I at least know what the recommended way to upgrade Zabbix is, I can work out whether Watchtower does it appropriately.

      Besides, this is complaining about agents being shut down, not the server. It should be possible to at least silence the alerts and put Zabbix in "upgrade mode" or something similar.


      • cybermcm
        Junior Member
        • Jan 2019
        • 13

        #4
        @idave: I just started using Zabbix, but I have the same problem: I run maintenance tasks at 5 a.m., set a proper maintenance window, and still get the error mails. Did you find a solution?


        • idave
          Junior Member
          • Jul 2018
          • 4

          #5
          In essence, no. I am surprised that nobody seems to have written an article about this. I suspect the only way forward is to analyze the logs when it happens and see if anything can be found there. But what I'll probably end up doing is just stopping auto updates instead.
