We had an issue over the weekend. We were performing maintenance in our production environment and had put our hosts into Maintenance Mode.
During the work, we acknowledged the alerts and let the maintenance period run out. Once it did, we started receiving page alerts for the hosts that were still alerting, even though those alerts had been acknowledged.
Even after we put the hosts back into maintenance, Zabbix continued to send emails to our paging system (I was tailing the maillog and could see Zabbix sending out messages).
After several zabbix-server restarts, we ended up disabling the hosts, which sent a final "Escalation cancelled: host disabled" page.
After some period of quiet, we re-enabled the hosts.
All our Actions have the condition Event acknowledged = "Not Ack", so only unacknowledged events should qualify to page us.
This was very frustrating for us. Any idea of what may have happened?