Ad Widget

Collapse

Most efficient way to backup the Zabbix Database

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • rcollier
    Member
    • Sep 2013
    • 53

    #1

    Most efficient way to backup the Zabbix Database

    Hello all, happy Monday!

    I am currently running zabbix 2.2.1 on a CENTOS 6.4 server with approximately 25.2 values per second (really small environment).

    As of right now my zabbix database is backed up every night by using automysqlbackup + cron. Over the past few nights I've seen a consistent stream of false positive triggers in regards to the zabbix agent ping triggers. During the backup window, the agents are unable to report to zabbix due to the database being locked while the backup is performed.

    Is there any way around this issue? Or, should I create a maintenance period that includes all hosts while the database is backed up?
  • ArtemK
    Senior Member
    • May 2013
    • 232

    #2
    you can try "non-locking" xtrabackup from percona, also it make sense to skip history tables during backup, if it's possible in your environment.

    Comment

    • rcollier
      Member
      • Sep 2013
      • 53

      #3
      Thanks for your reply.

      If you don't mind me asking, why does it make sense to skip the history tables during backup. Wouldn't you want that information during a restore?

      Comment

      • kloczek
        Senior Member
        • Jun 2006
        • 1771

        #4
        Originally posted by ArtemK
        you can try "non-locking" xtrabackup from percona, also it make sense to skip history tables during backup, if it's possible in your environment.
        One of the method of doing backup with locking all tables is doing this on slave DB engine. Second method is lock tables with flushing buffers -> make snapshot - > unlock tables -> mount snapshot -> make binary backup or start yet another mysqld using mounted snapshot -> make full text dump.
        Exact method of doing such things depends on complexity of the env.

        If host with DB is busy enough high that backup cannot be done by using (for example on Linux) LVM snapshot and start on top of this another mysqld only way is doing backup on slave.
        http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
        https://kloczek.wordpress.com/
        zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
        My zabbix templates https://github.com/kloczek/zabbix-templates

        Comment

        • ArtemK
          Senior Member
          • May 2013
          • 232

          #5
          well, it depends on your business, for my company it's not critical, that in case of failure there are no history in monitoring system. But all templates, hosts and configured checks - that is critical. Definitely, history information could be critical for some another business.

          In some cases, it just not worth it, history tables takes too much space (they are biggest in zabbix database). Check that article, it could be useful.

          Comment

          • JvGinkel
            Junior Member
            • Jan 2014
            • 10

            #6
            Do you have Zabbix installed on a virtual machine? If so, you can also choose to make snapshots of the virtual machine, hence backing up everything including the database.

            Comment

            • steveboyson
              Senior Member
              • Jul 2013
              • 582

              #7
              I would vote for LVM snapshots. That works on physical servers as well. But I have to admit that I've never used it so far.

              Has someone any knowledge on that?

              Comment

              • kloczek
                Senior Member
                • Jun 2006
                • 1771

                #8
                Originally posted by steveboyson
                I would vote for LVM snapshots. That works on physical servers as well. But I have to admit that I've never used it so far.

                Has someone any knowledge on that?
                It is not a matter of voting. First LVM snapshot adds about 30% of IOs more.
                Try to create snapshot and observe the number of IOs.
                If before making snapshot you will have DB engine almost saturating max IO/s you can even vote for use LVM and your snapshot will kill your DB backend.

                Using snapshot technique is good with ZFS because no matter how many snapshots you have it does not have any impact on IOs.

                DB engine used by zabbix mostly (95-98%) is doing write IOs. Making backup adds read IOs. To make backup you must be able to make IO=X*(1+~0.3)+Y io/s on storage used by your DB engine.
                X - number of IOs during normal work of DB backed
                Y - number of read IOs generated by backup
                IO - total number of IOs which must be below your max io/s which your system can do on your physical storage.
                For example two mirrored 15k rpm/s disks can do max about 400 io/s.

                For example:


                Above is ~two day graph of IOs on slave DB backend used by zabbix.
                Every day housekeeping history* table content is done by rotating tables partitions and just after midnight mysqld starts using fresh files. With constant stream of new data effectively mysqld with growing size of DB files (innodb engine) is making more and more IOs during the day because with growing size of DB files avg number of IOs is spent on writing the same number of inserted data making it increase.
                So, just after midnight in my case (just after slave starts writing data to fresh history* partitions) is the best moment to make backup because of large number of IOs which can be used to make the backup.
                Backup is done by using the following commands: stop mysqld -> sync -> lvm snapshot -> start mysqld -> mount lvm snapshot -> tar mysql binary files (without bin relay files and index) -> umount lvm snapshot -> remove lvm snapshot.
                Each of these two green bars on the above graph are used to read the IOs while performing the backup. I'm able to make these backups without any hassles because:
                - I'm doing backup process with the use of tar with compression which limits reading bandwidth of archived data.
                - cumulative number of IOs on this storage is below the maximum IO bandwidth (which is about 400 IO/s)

                Writing to backup file is not indicated on this graph because the backup file is not stored on /dev/sda.

                looking precisely on the length of the red area above the green areas and comparing it with how many write IOs is on both sides of green area, you will see that the number of write IOs is bigger by about 30% while performing the backup.
                Why? LVM snapshot overhead.

                Conclusion: LVM on Linux has some serious limitations, and when you use it, you must be fully aware of these limitations.
                In the case above of making a backup on the master DB engine, would kill completely zabbix during the backup window.
                Last edited by kloczek; 02-02-2014, 02:56.
                http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
                https://kloczek.wordpress.com/
                zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
                My zabbix templates https://github.com/kloczek/zabbix-templates

                Comment

                • steveboyson
                  Senior Member
                  • Jul 2013
                  • 582

                  #9
                  Thank you for pointing out! It is very good to get some practical considerations on using LVM with Zabbix.

                  Comment

                  Working...