On my Zabbix/BSD installIt seems like nearly every time I do an upgrade, the database upgrade fails. It's a non-production environment, so I've never dug into the root cause of the upgrade failure, but each time I have to help the process along. I'm posting partially just to help anyone running into this, but certainly would be interested if someone has an idea on root cause.
First of all, troubleshooting works best with 3 ssh sessions: Zabbix Log tail, mysql changes, zabbix service restart. It also helps to have a copy of the database schema (for the version you are upgrading to) open, I found mine at https://fossies.org/linux/zabbix/dat...sql/schema.sql . You'll use this to manually create the constraint/key that's failing.
This sequence assumes you've already tried to go to your zabbix front end and gotten the database upgrade dialog, or you see something like:"current database version (mandatory/optional): 06010082/06010082" "required mandatory version: 06020000" in zabbix_server.log
Open first ssh connection and tail the zabbix log file, on my BSD instance it's 'tail -f /tmp/zabbix_server.log'
Open second ssh connection and start zabbix server, 'service zabbix_server start'
Open third ssh connection and log into mysql, 'mysql -u zabbix -p', switch to zabbix db with 'use zabbix'
The two most common database failure scenario I'm seeing are that the upgrade script wants to drop an constraint/key, but the constraint/key doesn't exist. The two variations are 1. manually create the constraint/key and upgrade moves on, OR 2. constraint/key creation fails, so you have to clear out some records I a table, then manually create the constraint/key and upgrade moves on.
I've included some samples sequences of zabbix_server start, log output, sql remediation below.
Scenario 1: Creating index so script can drop it
49254:20221026:085715.351 starting automatic database upgrade
49254:20221026:085715.352 [Z3005] query failed: [1091] Can't DROP 'c_functions_1'; check that column/key exists [alter table functions drop foreign key c_functions_1]
49254:20221026:085715.352 database upgrade failed
ALTER TABLE `functions` ADD CONSTRAINT `c_functions_1` FOREIGN KEY (`itemid`) REFERENCES `items` (`itemid`);
Scenario 2: Deleting records with missing relationship values, then creating index so script can drop it
54035:20221026:091226.373 starting automatic database upgrade
54035:20221026:091226.743 [Z3005] query failed: [1091] Can't DROP 'c_item_preproc_1'; check that column/key exists [alter table item_preproc drop foreign key c_item_preproc_1]
54035:20221026:091226.743 database upgrade failed
#FIRST ATTEMPT TO ALTER TABLE FAILS
ALTER TABLE `item_preproc` ADD CONSTRAINT `c_item_preproc_1` FOREIGN KEY (`itemid`) REFERENCES `items` (`itemid`);
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`zabbix`.`#sql-bb55_6f2`, CONSTRAINT `c_item_preproc_1` FOREIGN KEY (`itemid`) REFERENCES `items` (`itemid`))
# SELECT RECORDS FROM PROBLEM TABLE TO GET AN IDEA OF HOW MANY RECORDS ARE MISSING RELATED DATA
SELECT * FROM item_preproc WHERE item_preproc.itemid NOT IN (SELECT items.itemid FROM items)
# CAUTION!!!! IF YOU ARE GOOD WITH DUMPING THE PROBLEM RECORDS, REMOVE COMMENT FROM BELOW AND RUN
# DELETE FROM item_preproc WHERE item_preproc.itemid NOT IN (SELECT items.itemid FROM items)
On my upgrade from 6.0 to 6.2, I had issues with:
Using the log entry, [Z3005] query failed: [1091] Can't DROP 'c_item_preproc_1'; check that column/key exists [alter table item_preproc drop foreign key c_item_preproc_1] , you should note that c_item_preproc_1 is what you'll be searching the schema for, to see what SQL command to run to create the constraint/key. item_preproc is the table you'll be working with, and the table name you'd sub-in for the Select and Delete statements. You have to look at the ALTER TABLE line to determine what the constraint relationship is in order to finish out the select statements.
I don't know that I'd do this in production, or I'd certainly dig into which records have issues, to try and determine what is going on. But if you just need to get things running after a failed upgrade, maybe this will help.
First of all, troubleshooting works best with 3 ssh sessions: Zabbix Log tail, mysql changes, zabbix service restart. It also helps to have a copy of the database schema (for the version you are upgrading to) open, I found mine at https://fossies.org/linux/zabbix/dat...sql/schema.sql . You'll use this to manually create the constraint/key that's failing.
This sequence assumes you've already tried to go to your zabbix front end and gotten the database upgrade dialog, or you see something like:"current database version (mandatory/optional): 06010082/06010082" "required mandatory version: 06020000" in zabbix_server.log
Open first ssh connection and tail the zabbix log file, on my BSD instance it's 'tail -f /tmp/zabbix_server.log'
Open second ssh connection and start zabbix server, 'service zabbix_server start'
Open third ssh connection and log into mysql, 'mysql -u zabbix -p', switch to zabbix db with 'use zabbix'
The two most common database failure scenario I'm seeing are that the upgrade script wants to drop an constraint/key, but the constraint/key doesn't exist. The two variations are 1. manually create the constraint/key and upgrade moves on, OR 2. constraint/key creation fails, so you have to clear out some records I a table, then manually create the constraint/key and upgrade moves on.
I've included some samples sequences of zabbix_server start, log output, sql remediation below.
Scenario 1: Creating index so script can drop it
49254:20221026:085715.351 starting automatic database upgrade
49254:20221026:085715.352 [Z3005] query failed: [1091] Can't DROP 'c_functions_1'; check that column/key exists [alter table functions drop foreign key c_functions_1]
49254:20221026:085715.352 database upgrade failed
ALTER TABLE `functions` ADD CONSTRAINT `c_functions_1` FOREIGN KEY (`itemid`) REFERENCES `items` (`itemid`);
Scenario 2: Deleting records with missing relationship values, then creating index so script can drop it
54035:20221026:091226.373 starting automatic database upgrade
54035:20221026:091226.743 [Z3005] query failed: [1091] Can't DROP 'c_item_preproc_1'; check that column/key exists [alter table item_preproc drop foreign key c_item_preproc_1]
54035:20221026:091226.743 database upgrade failed
#FIRST ATTEMPT TO ALTER TABLE FAILS
ALTER TABLE `item_preproc` ADD CONSTRAINT `c_item_preproc_1` FOREIGN KEY (`itemid`) REFERENCES `items` (`itemid`);
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`zabbix`.`#sql-bb55_6f2`, CONSTRAINT `c_item_preproc_1` FOREIGN KEY (`itemid`) REFERENCES `items` (`itemid`))
# SELECT RECORDS FROM PROBLEM TABLE TO GET AN IDEA OF HOW MANY RECORDS ARE MISSING RELATED DATA
SELECT * FROM item_preproc WHERE item_preproc.itemid NOT IN (SELECT items.itemid FROM items)
# CAUTION!!!! IF YOU ARE GOOD WITH DUMPING THE PROBLEM RECORDS, REMOVE COMMENT FROM BELOW AND RUN
# DELETE FROM item_preproc WHERE item_preproc.itemid NOT IN (SELECT items.itemid FROM items)
On my upgrade from 6.0 to 6.2, I had issues with:
Functions table - (alter table functions drop foreign key c_functions_1) AND (alter table functions drop foreign key c_functions_2)
Triggers table - (alter table trigger_tag drop foreign key c_trigger_tag_1)
Item_preproc table - (alter table item_preproc drop foreign key c_item_preproc_1)
Item_preproc table - (alter table item_preproc drop foreign key c_item_preproc_1)
Using the log entry, [Z3005] query failed: [1091] Can't DROP 'c_item_preproc_1'; check that column/key exists [alter table item_preproc drop foreign key c_item_preproc_1] , you should note that c_item_preproc_1 is what you'll be searching the schema for, to see what SQL command to run to create the constraint/key. item_preproc is the table you'll be working with, and the table name you'd sub-in for the Select and Delete statements. You have to look at the ALTER TABLE line to determine what the constraint relationship is in order to finish out the select statements.
I don't know that I'd do this in production, or I'd certainly dig into which records have issues, to try and determine what is going on. But if you just need to get things running after a failed upgrade, maybe this will help.
Comment