We had an issue where our Zabbix database became corrupted and MariaDB would not start. Our Zabbix server is version 4.2.1.
I used the InnoDB recovery modes to get the MariaDB server started, then used mysqldump to back up the database. Our database was very large (230 GB), so I had to ignore several tables to speed up the process. These tables were acknowledges, alerts, auditlog, auditlog_details, escalations, events, history, history_log, history_str, history_sync, history_text, history_uint, profiles, service_alarms, sessions, trends, user_history, and node_cksum. I followed this guide to decide which tables to ignore during the backup so that only the configuration was kept.
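A minimal sketch of how a dump like this could be scripted. The database name `zabbix` and the helper `build_dump_command` are assumptions for illustration only; `--ignore-table=db.table` is the standard mysqldump option for skipping a table. The script only builds and prints the command; it does not run it.

```python
import shlex

# Tables skipped during the dump, as listed in the post above.
IGNORED_TABLES = [
    "acknowledges", "alerts", "auditlog", "auditlog_details", "escalations",
    "events", "history", "history_log", "history_str", "history_sync",
    "history_text", "history_uint", "profiles", "service_alarms", "sessions",
    "trends", "user_history", "node_cksum",
]


def build_dump_command(database, ignored_tables):
    """Return a mysqldump argument list that skips the given tables.

    Hypothetical helper: mysqldump expects one --ignore-table option per
    table, each qualified with the database name.
    """
    cmd = ["mysqldump"]
    for table in ignored_tables:
        cmd.append(f"--ignore-table={database}.{table}")
    cmd.append(database)  # database name goes last, after all options
    return cmd


# "zabbix" is an assumed database name; adjust to the real one.
command = build_dump_command("zabbix", IGNORED_TABLES)
print(shlex.join(command))
```

Redirecting the printed command's output to a file (for example `> zabbix_config.sql`) would give the configuration-only dump. Note that skipping a table this way also skips its `CREATE TABLE` statement, which is why the missing tables had to be recreated afterwards.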
I restored the database and used the schema.sql provided in the source package to recreate the missing tables. After that I was able to start the zabbix-server service, and things were just about back to normal. I was also able to configure innodb_file_per_table, which is something I'd been meaning to do on the server for a while, so the effort was not totally in vain.
I kept an eye on zabbix_server.log and found several errors related to duplicate entries in the event_recovery table. I got a little delete-crazy and deleted everything from problems, events and event_recovery, because I wanted to clear out all the problems from before the restore and start fresh. After I did this I saw no more errors in zabbix_server.log, and all new problems were created and resolved correctly according to the defined triggers.
After this I noticed I was unable to delete hosts, for whatever reason. I was able to resolve this by deleting all the data from the housekeeper table.
What I expected was for all of the old problems to come back once the trigger criteria were met, but this never happened. If I look at triggers on the configuration page and filter for triggers with value "problem", I find all of the old problems, but they appear only there and not on the problems page. Now I know I messed something up!
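The mismatch described above (triggers still marked as "problem" with no matching open problem row) can be illustrated with a small self-contained sketch. It uses an in-memory SQLite database with simplified stand-in tables, not the real MariaDB instance. The table name `problem` and the columns `triggers.value`, `problem.source`, `problem.object`, `problem.objectid` and `problem.r_eventid` are assumptions about the Zabbix schema and should be checked against the actual database before running anything similar.

```python
import sqlite3

# In-memory stand-in database; the real Zabbix tables have many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE triggers (
        triggerid   INTEGER PRIMARY KEY,
        description TEXT,
        value       INTEGER          -- assumed: 0 = OK, 1 = PROBLEM
    );
    CREATE TABLE problem (
        eventid   INTEGER PRIMARY KEY,
        source    INTEGER,           -- assumed: 0 = created by a trigger
        object    INTEGER,           -- assumed: 0 = trigger
        objectid  INTEGER,           -- assumed: the triggerid
        r_eventid INTEGER            -- assumed: NULL while unresolved
    );
    -- Made-up sample rows for illustration only.
    INSERT INTO triggers VALUES
        (1, 'Disk full', 1), (2, 'CPU load high', 1), (3, 'Host down', 0);
    INSERT INTO problem VALUES (100, 0, 0, 2, NULL);
""")

# Triggers in PROBLEM state that have no open row in the problem table.
ORPHAN_QUERY = """
    SELECT t.triggerid, t.description
    FROM triggers t
    LEFT JOIN problem p
           ON p.objectid = t.triggerid
          AND p.source = 0
          AND p.object = 0
          AND p.r_eventid IS NULL
    WHERE t.value = 1
      AND p.eventid IS NULL
    ORDER BY t.triggerid
"""

orphans = conn.execute(ORPHAN_QUERY).fetchall()
print(orphans)
```

In this sample data only the first trigger is reported, since the second one still has an open problem row and the third is not in a problem state. A query of this shape would at least list which triggers are affected.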
I've been scouring the database looking for any other references to these old problems to try and clear them out but I can't find anything.
Is there a way to get the server to re-evaluate all of the items and recreate the problems that already existed?
Thank you for any assistance with this.