Server Migration With Large History Tables - Can History Data Be Moved Separately?
  • alexw-z
    Member
    • Dec 2021
    • 36

    #1

    Server Migration With Large History Tables - Can History Data Be Moved Separately?

    I'm about to do a platform move of a Zabbix 4.4 installation from an old Ubuntu 16 box to a new Alma 8 box (to be upgraded to 6.0 in due course). I ideally need to complete the migration within an overnight maintenance window of ~6 hours, but at the moment the MySQL import alone is taking around 7.5 hours. I need to move approx. 150 GB of data, over 90% of which is history data.

    One of the avenues I'm exploring is whether I can move all the other tables during the transfer window, and then leave the history import running in the background as a separate import into the next day once the box is back online. Does anybody know if this is feasible to do?
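    In case it makes the question clearer, the split I have in mind would look something like this (a sketch only; the database name "zabbix" is an assumption about my setup, and I'd import zabbix_core.sql during the window and zabbix_history.sql afterwards):

```shell
# Dump everything except the big history tables for the maintenance window,
# then the history tables separately for a later background import.
mysqldump zabbix \
  --ignore-table=zabbix.history \
  --ignore-table=zabbix.history_uint \
  --ignore-table=zabbix.history_str \
  --ignore-table=zabbix.history_text \
  --ignore-table=zabbix.history_log \
  > zabbix_core.sql
mysqldump zabbix history history_uint history_str history_text history_log \
  > zabbix_history.sql
```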

    Incidentally, if anyone has any other tips for speeding up DB migration I'd love to hear them too. I've already done some work to optimise MySQL (for example increasing the InnoDB buffer pool). I've also looked into using Percona XtraBackup as an alternative to native MySQL tools, although it looks as though the latest version available for U16 has an unfixed bug that causes the export to crash on compressed tables.
  • medl
    Junior Member
    • Nov 2022
    • 9

    #2
    We had 1.8 TB of history data in a MariaDB installation and had the same problem.

    I configured a master -> slave replication and then switched to the new database, but that only works if you have matching versions (and want to stay on MySQL).
    Maybe you want to install the same old version and then walk yourself through the upgrade path.

    Another approach is to copy the tables (if you have file per table enabled), there are mechanisms to export/import a table using its files.
    But same case here - I think matching versions are key.

    Export/import is the slowest way; there are a lot of tutorials on the web on how to speed things up.
    In my case none of them yielded satisfying results.

    Please note: In 6.0 there are also optional modifications to the history tables.

    Comment

    • alexw-z
      Member
      • Dec 2021
      • 36

      #3
      Thanks for the reply. Yeah, all good on the versions/test environment work; both boxes will be running 4.4.10 at the time of migration (with a newer version of MariaDB on the new box to satisfy the 6.0 upgrade requirements). I've got a test lab copy of the old (and new) box which I've been using for testing; I've done one full walkthrough already and will be doing another before going ahead with the live process (as well as a barrage of other assorted testing).

      I haven't really looked into other DBs, my background is MySQL and I'm reluctant to add any more variables into the upgrade process, so happy to stick with MySQL for the time being.

      The old box was built and administered by a predecessor of mine who is no longer with the company, and config documentation is lacking, so my plan is to run the old and new boxes in tandem for a little while in case anything nasty crops up during either update that I've been unable to account for during test run-throughs.

      Presumably you'd still have needed to dump across a starting copy of the DB as part of setting up replication, however?

      Comment

      • tim.mooney
        Senior Member
        • Dec 2012
        • 1427

        #4
        Originally posted by alexw-z
        Incidentally, if anyone has any other tips for speeding up DB migration I'd love to hear them too.
        You've already mentioned that you increased the buffer pool, which is the most important thing for performant general operation.

        For just the migration (not for regular operation), have you (can you?) disabled the binary log (SET SQL_LOG_BIN=0)? That should make some difference to the import, but I suspect it won't cut 1.5 hours off your current time.
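        For the import, that could look something like this (a sketch; the dump file name is a placeholder, and the session needs the SUPER privilege to set sql_log_bin):

```shell
# Prepend a session-scoped "skip the binary log" statement to the import.
# Both statements run over the same connection, so the setting applies to
# the whole dump; zabbix_dump.sql is a placeholder file name.
( echo "SET SESSION SQL_LOG_BIN=0;"; cat zabbix_dump.sql ) | mysql zabbix
```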

        What you're considering with the history tables (there are multiple, though your site might have one that dominates the others in size) should work. I would probably try splitting the history dumps in two: one "recent" part that I import with the other stuff, and one with the older history that gets imported after the maintenance window. I've never tried to do that with Zabbix history, though, so I don't know how easy it would be. It would involve using the '--where' option with mysqldump on the specific table to select only the records matching some criteria.
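        As a sketch of what I mean (the 7-day split point and the database/table names are just examples, not a recommendation; you'd still need the table definitions from the main dump, since --no-create-info dumps rows only):

```shell
# Split the history dump at a hypothetical cutoff: recent rows are imported
# during the maintenance window, older rows afterwards.
CUTOFF=$(date -u -d '7 days ago' +%s)
mysqldump zabbix history history_uint --no-create-info \
  --where="clock >= ${CUTOFF}" > history_recent.sql
mysqldump zabbix history history_uint --no-create-info \
  --where="clock < ${CUTOFF}" > history_old.sql
```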

        Good luck and let us know how it goes.

        Comment

        • cyber
          Senior Member
          Zabbix Certified Specialist, Zabbix Certified Professional
          • Dec 2006
          • 4807

          #5
          I moved ~1 TB of history data from one DB to another in one-day pieces... There is a column named "clock" in all history and trends tables, which holds the Unix timestamp of each metric. So you could export data into "daily" files and later import those one by one... And yes, you can do it later too: import the newest data first, add the older data later, even if it's after your maintenance window...
          I have PG here, so there might be differences from MySQL and its export/import methods..
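          In shell, the daily slicing could be sketched like this (START, DAYS and the commented-out mysqldump line are illustrative placeholders, not something from my setup):

```shell
# Compute unix-timestamp day boundaries with GNU date and name one dump
# file per day; -u keeps the slices DST-independent.
START="2021-12-01"
DAYS=3
for i in $(seq 0 $((DAYS - 1))); do
  FROM=$(date -u -d "${START} + ${i} days" +%s)
  TO=$(date -u -d "${START} + $((i + 1)) days" +%s)
  echo "slice ${i}: clock >= ${FROM} AND clock < ${TO}"
  # mysqldump zabbix history_uint --no-create-info \
  #   --where="clock >= ${FROM} AND clock < ${TO}" > history_uint_${FROM}.sql
done
```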

          Comment


          • tim.mooney
            tim.mooney commented
            Editing a comment
            Thanks for adding that info cyber! I think your suggestions should be directly applicable to mysql too.

            I've never had to deal with that much history/history_uint, so there are a couple of things I was unclear on. Maybe you can comment on either of them?

            1) is there a danger that if the separate history imports take too long (24 hours? a week? I'm not sure what "too long" would even be) it will mess up Zabbix's trend calculation? Would disabling the housekeeper until all imports are done be advisable, or is that not going to make any difference?

            2) Like your suggestion, my idea was to import the most recent data first, and then go back and import the older stuff. I suggested that because I have a vague recollection that some history data is kept in memory/cache; I don't know how much, but I would certainly believe it would be the most recent data. If alexw-z does history imports broken up by day, is there some process that needs to be run after the imports to populate or force an update of the in-memory history data? Or is that not needed at all?
        • cyber
          Senior Member
          Zabbix Certified Specialist, Zabbix Certified Professional
          • Dec 2006
          • 4807

          #6
          All of this is my experience and should not be taken as some kind of official info...

          IIRC trends are calculated continuously; you can find them already present an hour after you collected the data, I think. So the history and trends transfer should not interfere with any trend calculations, as they are already done for that timeframe... In its caches it keeps data that is needed for trigger calculations, so mostly very recent data, unless you have calculations over long periods of time.

          If we compare history and trend table sizes, trends are much smaller. For example, when I transferred data, exporting/importing one day's worth of trends took maybe 5 minutes, compared to history_uint, where it took ~40 minutes. But for a year of trends and a couple of weeks of history, it is still a pretty substantial amount of time... So, what to do to fit into a small maintenance window?

          You can easily export/import everything up until "today" from the working instance; the housekeeper does not really matter here, as it removes history/trend data (plus events etc, but those you transfer in one go at the end) from the "far end" of the database... your expired data, older than your history/trends keeping period. So you can even start from, let's say, 360 days ago until "yesterday 23:59:59". That data should be pretty much untouched by the housekeeper, as long as you do not delete items/hosts etc at that time... But even that should not really matter, as the housekeeper should find that orphaned data and delete it later. As you see, I started from 360 and not 365 (the usual trend period), so there would be some grace period until the housekeeper wants to remove some data....

          This is exactly what I did... I started a day early. As I have a PG cluster, I used a standby host (no load on that, except that it just replicates from the active node, and no write access needed for a select) for selecting data and redirecting it directly to the new DB... "psql -tc "COPY (select * from <table> where clock > xxxxxxx and clock <= yyyyyyy) TO STDOUT DELIMITER ',' CSV" zabbix | psql postgresql://other-db:5432/zabbix -c "COPY <table> FROM STDIN CSV;"". Due to some circumstances even one day early was a bit too short a time, so some of this did not get done by our downtime and I finished it later, when my other server was already up and running (and upgraded too..) .. Thus -> "you can also do it later". It was mostly due to my ambition to apply those primary keys already at this point, to avoid the same copy from one table to another later just for adding keys etc... But old DBs (like that 4.4) can have duplicate data (due to missing primary keys)... so it broke the copy in such cases, and I had to find and eliminate duplicates and retry....

          When the real maintenance time arrived, I shut down that working server and transferred the remaining data, basically "today" from trends/history and every other table, as they are very small compared to history and trends and can probably be transferred in 15-20 minutes... The only thing to keep in mind is that, in the case of some tables, it has to be done in the correct order... Like.. you cannot insert items before you have hosts inserted.. as the items table references hosts, and if it's empty, you get errors... Maybe someone's SQL-fu is better than mine and they can somehow overcome that.... I am no DBA.. going mostly by gut feeling and Stack Overflow..
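          For readability, that COPY pipeline laid out on multiple lines (the table name, the other-db host and the xxxxxxx/yyyyyyy clock bounds are placeholders):

```shell
# Stream one clock-bounded slice from the standby straight into the new DB,
# without an intermediate dump file (PostgreSQL; placeholders throughout).
psql -tc "COPY (SELECT * FROM <table>
                WHERE clock > xxxxxxx AND clock <= yyyyyyy)
          TO STDOUT DELIMITER ',' CSV" zabbix \
  | psql postgresql://other-db:5432/zabbix \
         -c "COPY <table> FROM STDIN CSV;"
```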


          Comment

          • alexw-z
            Member
            • Dec 2021
            • 36

            #7
            Thanks again for the replies, all. Taking all of the above into account, I think separating the tables out will probably introduce more complexity/database hackiness into the migration plans than I'd prefer, so given that the business is happy that our history data is non-critical/not widely used, we're going to go down the route of dropping retention pre-transfer instead.

            Comment
