Greetings fellow Zabbix admins,
We're having a problem with our Zabbix installation that only occurs during large network outages, when many hosts have triggers in PROBLEM state. When a large number of triggers start firing (say, during an outage of an entire datacenter), actions eventually stop running and emails stop coming in. The dashboard still gets updated and the server still processes new values, but no emails are sent, and the "Actions" column stays blank, as though the server didn't even try to process any actions.
I've found a workaround that has worked every time: stop Zabbix, go into MySQL and run "TRUNCATE TABLE actions". Once both of our actions have been erased by the truncate, go into the GUI and recreate them by hand. Then start Zabbix. Emails start coming in again.
This doesn't make sense to me, because the only thing that changes is that the recreated actions get new, higher action IDs. It doesn't seem to be related to server load; load averages are normal while this is occurring, and MySQL looks okay.
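For anyone who wants to reproduce the workaround, here is a minimal sketch of the sequence as a Python script that only prints the steps and executes nothing. The service name `zabbix-server`, the database name `zabbix`, and the helper name `workaround_steps` are assumptions for illustration, not something taken from our setup; MySQL credentials are left out, and the GUI step stays manual.

```python
import shlex


def workaround_steps(db_name="zabbix", service="zabbix-server"):
    """Return the workaround as ordered (description, command) pairs.

    A command of None marks a manual step. The db_name and service
    defaults are assumed values; adjust them for your installation.
    """
    return [
        ("Stop the Zabbix server", ["service", service, "stop"]),
        ("Erase all actions", ["mysql", "-e", "TRUNCATE TABLE actions", db_name]),
        ("Recreate the actions by hand in the GUI", None),
        ("Start the Zabbix server", ["service", service, "start"]),
    ]


def render(steps):
    """Format the steps as numbered lines, quoting commands for a shell."""
    lines = []
    for number, (description, command) in enumerate(steps, start=1):
        if command is None:
            lines.append(f"{number}. {description} (manual)")
        else:
            quoted = " ".join(shlex.quote(part) for part in command)
            lines.append(f"{number}. {description}: {quoted}")
    return lines


if __name__ == "__main__":
    # Print only; nothing here touches the server or the database.
    for line in render(workaround_steps()):
        print(line)
```

Keeping it print-only is deliberate: the truncate is destructive, so each command should be reviewed and run by hand.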
Anyone else seeing the same problem, or have any suggestions?
Thanks!