I've been trying to do this. So far, the steps have been, roughly:
1. Stop Zabbix.
2. Run the initial database preparation script. This creates the new table structure and renames the existing history_X tables to history_X_old.
3. Re-run the SQL procedure that creates the initial partitioning on the new tables, making (in my case) 186 new daily partitions (sketched below).
4. Reinsert the old data into the new tables.
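(For illustration, the layout step 3's procedure produces is roughly the following; the partition names and dates here are made up, and the real procedure generates all 186 of them:)
Code:
ALTER TABLE history PARTITION BY RANGE (clock) (
PARTITION p2013_06_01 VALUES LESS THAN (UNIX_TIMESTAMP('2013-06-02 00:00:00')),
PARTITION p2013_06_02 VALUES LESS THAN (UNIX_TIMESTAMP('2013-06-03 00:00:00')),
-- ... one partition per day ...
PARTITION p2013_12_03 VALUES LESS THAN (UNIX_TIMESTAMP('2013-12-04 00:00:00'))
);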
I've been stuck on step 4 and finding a way to do it efficiently.
If I try to just 'add the index' manually to the partitioned table, I first have to fix many duplicate records caused by race conditions or errors. Even though this should never happen, in a large enough environment undiscovered bugs have caused Zabbix to insert the same data multiple times into the history table. Fixing these is not easy:
a. Running a query to discover them (sketched below) is impractical: the table is too big and it takes far too long.
b. Fixing them one by one as you encounter them might be a fool's errand. There's no way of telling how many there will be, and waiting half an hour each time for the index-building query to barf on the next duplicate row could take forever.
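(The query I mean in (a) would be something like this, assuming duplicates mean repeated (itemid, clock) pairs, which is what the new index trips over:)
Code:
-- Find rows that were inserted more than once. The full table scan plus
-- the GROUP BY temporary table is exactly what makes this a non-starter
-- at this table size.
SELECT itemid, clock, COUNT(*) AS copies
FROM history_old
GROUP BY itemid, clock
HAVING COUNT(*) > 1;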
If I try to copy the database wholesale, I run into a MySQL bug: MySQL crashes with a problem in low-level file/memory operations (I've already filed this bug). The database is simply too big.
So I've resolved to try to move the history data bit by bit with smaller queries, say one hour at a time, from the most recent data to the oldest. I've tried this code:
Code:
USE zabbix;
DROP PROCEDURE IF EXISTS transferHistory;
DELIMITER $$
CREATE PROCEDURE transferHistory(
    -- How far back to transfer, in seconds. MySQL doesn't allow default
    -- parameter values, so pass 31 * 86400 * [num months] explicitly
    -- (8035200 for my 3 months).
    IN duration INT
)
BEGIN
    SET @date = UNIX_TIMESTAMP();
    SET @startDate = @date - duration;
    -- Each pass moves everything still left in the *_old tables that is
    -- newer than @date; the DELETE ensures the next pass only sees the
    -- next-older hour of data.
    WHILE @date > @startDate DO
        -- Progress output: the hour currently being moved.
        SELECT FROM_UNIXTIME(@date);

        START TRANSACTION;
        INSERT INTO history SELECT * FROM history_old WHERE history_old.clock > @date;
        DELETE FROM history_old WHERE history_old.clock > @date;
        COMMIT;

        SELECT 'history_str';
        START TRANSACTION;
        INSERT INTO history_str SELECT * FROM history_str_old WHERE history_str_old.clock > @date;
        DELETE FROM history_str_old WHERE history_str_old.clock > @date;
        COMMIT;

        SELECT 'history_text';
        START TRANSACTION;
        INSERT INTO history_text SELECT * FROM history_text_old WHERE history_text_old.clock > @date;
        DELETE FROM history_text_old WHERE history_text_old.clock > @date;
        COMMIT;

        SELECT 'history_uint';
        START TRANSACTION;
        INSERT INTO history_uint SELECT * FROM history_uint_old WHERE history_uint_old.clock > @date;
        DELETE FROM history_uint_old WHERE history_uint_old.clock > @date;
        COMMIT;

        -- Step back one hour and repeat.
        SET @date = @date - 3600;
    END WHILE;
END$$
DELIMITER ;
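Since MySQL stored procedures can't take default parameter values, the duration has to be passed explicitly; for my 3 months that's 31 * 86400 * 3:
Code:
CALL transferHistory(8035200);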
Unfortunately for me, it's been running for a couple of hours already and has failed to even do the first transaction: there's no sign of 'history_str' in the output so far, though it did print the first unixtime, so it's stuck on the first INSERT or DELETE.
Clearly, this method is too slow if transferring the history data takes longer than it took to write it in the first place.
Edit: It just did the first set in 1 hr 34 min, so the ETA so far is about 4.5 months.
The interesting part is that history alone took that hour and a half; the others (_str, _text, _uint) each took less than a millisecond.
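(My guess is that history_old has no usable index on clock, so every INSERT ... SELECT and DELETE is a full scan of the huge table, while the three small tables scan in no time. An untested idea along those lines: pay for one index build up front, then bound each pass on both sides so it only touches one hour of rows:)
Code:
-- One-time index build on the old table (itself slow, but paid once).
ALTER TABLE history_old ADD INDEX idx_clock (clock);

-- Then each pass can use a closed hour-long range instead of an
-- open-ended one:
INSERT INTO history
SELECT * FROM history_old
WHERE clock > @date AND clock <= @date + 3600;
DELETE FROM history_old
WHERE clock > @date AND clock <= @date + 3600;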
Are there any SQL experts who could help write faster code to get the poorly indexed old database's data into the new one? Or am I stuck just ditching 3 months of history (people will probably complain about that)?