ibdata is well over 1 terabyte - need suggestions

  • qZDnIhyEX
    Junior Member
    • Dec 2020
    • 7

    #1

    ibdata is well over 1 terabyte - need suggestions

    Zabbix 4.4
    MariaDB 5.5
    RHEL 7

    This Zabbix installation pre-dates my time here by several years. From what I can tell, every time storage ran dry, more storage was thrown at it; we're now well over 1 terabyte and growing roughly 50 GB per month. And this is only one of our installations.

    There are posts out there, but they are 5-10 years old, and since my experience with Zabbix is minimal, I wanted to get updated recommendations on how to correct this.

    Suggestions on how best to move forward?
  • tim.mooney
    Senior Member
    • Dec 2012
    • 1427

    #2
    Sorry you inherited a mess.

    My recollection was that the Zabbix database should use "innodb_file_per_table" for MySQL/MariaDB, but I don't see that mentioned in the current database instructions, so perhaps I'm mistaken about that. If all your tables and rows are in ibdata, then you're not using file-per-table, so OPTIMIZE operations aren't (easily) going to be able to reduce the on-disk footprint. As with most relational databases, even when you delete records from the database, the files on disk don't immediately shrink; the database keeps the freed space for future growth. You would need to initiate a cleanup routine (usually: OPTIMIZE) to reclaim the on-disk space from a table that has lots of free space.
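
    If you want to confirm which mode the server is in, a quick check is something like this (just a sketch; the OPTIMIZE is only worth running once space can actually be returned to the filesystem, i.e. with file-per-table in effect):

    Code:
    -- Is file-per-table enabled? OFF means everything lives in the shared ibdata file.
    SHOW VARIABLES LIKE 'innodb_file_per_table';

    -- Rebuilds a table; space is released back to the OS only when the table
    -- has its own .ibd file. With a shared ibdata, the freed space just stays
    -- inside ibdata for reuse.
    OPTIMIZE TABLE trends;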

    I assume you've used a query something like this to identify which tables are taking up the most space:

    Code:
    SELECT table_name "Table Name", table_rows "Row Count", round(((data_length + index_length)/1024/1024),2) "Table Size (MiB)"
    FROM
        information_schema.TABLES
    WHERE
        table_schema = "zabbix";
    Are your largest tables 'history', 'history_uint', and possibly other history-related tables? That's typical for a Zabbix installation. 'trends' and 'trends_uint' might have a lot of rows too, but those are generally smaller.

    The history-related tables track all of the historical values for your items, up to whatever lifetime you've set (per item). They are a big part of the formula that's used to estimate the amount of space required for your database. See the documentation for history and trends and the linked database sizing information. Understanding what's described in the sizing docs is pretty important for understanding how things should work.

    For a typical install, by which I mean one that is not using database partitioning for the history data, the "housekeeper" is responsible for deleting historical values from the history* table(s) after the value has passed the history lifetime that's set for the item. Housekeeping should eventually cause your Zabbix install to get to a steady-state for database size, as old values being cleaned up frees enough space for new values being added. This only happens some time after the longest time period set for history for any item. Again, this assumes a typical install, no database partitioning, and the housekeeper being run to do periodic expiry of old values.
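
    If you want to double-check what the housekeeper has been told to do, the settings live in the frontend under Administration -> General -> Housekeeping and, if I remember the 4.x schema correctly, in the config table. Something along these lines should show whether history housekeeping and the global override are enabled (column names are from memory, so verify against your schema):

    Code:
    -- hk_history_mode = 1 means housekeeping is enabled for history;
    -- hk_history_global = 1 means the global period overrides per-item settings.
    SELECT hk_history_mode, hk_history_global, hk_history
    FROM config;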

    Larger installations with a higher number for new values per second (aka: NVPS) will often use an alternate config for history that relies on database partitioning. This reduces contention on the history* table(s) for both INSERT and DELETE and generally obviates the need for the housekeeper.
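
    For the record, partitioning here usually means range-partitioning the history* tables on the clock column and then dropping whole partitions instead of letting the housekeeper DELETE rows. A very rough sketch (partition names and boundaries are purely illustrative, and repartitioning an 800 GB table is itself a massive rebuild, so this is not something to run casually):

    Code:
    -- Hypothetical monthly partitions on history_uint, keyed on the epoch clock column.
    ALTER TABLE history_uint
    PARTITION BY RANGE (clock) (
        PARTITION p2020_12 VALUES LESS THAN (1609459200),  -- 2021-01-01 00:00:00 UTC
        PARTITION p2021_01 VALUES LESS THAN (1612137600),  -- 2021-02-01 00:00:00 UTC
        PARTITION pmax     VALUES LESS THAN (MAXVALUE)
    );

    -- Expiring old data then becomes a near-instant metadata operation:
    ALTER TABLE history_uint DROP PARTITION p2020_12;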

    With all of that background in place, your options are pretty clear.
    1. Keep only as much history as you need. Reduce the "History storage period" setting for any items that are being kept longer than they need to be. This is the #1 way you stop unbounded database growth: stop keeping more historical data than you absolutely need. Note that it's possible to set a global maximum override for history storage period (see the docs for history and trends, linked above), but I would generally recommend fixing the items themselves. (A query to survey the current per-item settings is sketched after this list.)
    2. Once you've reduced the "History storage period" for as many items as you reasonably can, for an install that is not using database partitioning, you need to make certain that the housekeeper is running and is doing its job. Note that if you drastically reduce the history period for a lot of items in short order, the housekeeper may be very busy for a while, as suddenly there will be a lot of rows for it to delete. You should be seeing periodic messages in your zabbix_server.log indicating that the "housekeeper" is running.
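
    As a starting point for item 1, something like this should show how the configured history periods are distributed across your items (the items.history column holds the per-item "History storage period"; in 4.x it is stored as a short string such as '90d'):

    Code:
    -- How many items use each history storage period?
    SELECT history AS `History period`, COUNT(*) AS `Items`
    FROM items
    GROUP BY history
    ORDER BY COUNT(*) DESC;
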
    That should stop the growth, but it won't do anything to reduce the space that's already been allocated on disk. For that, you would need to find a way to rebuild your tables (again, OPTIMIZE), but that's greatly complicated by not using file-per-table. A full dump/drop database/create database/reload would do the same thing, but that's a large undertaking, especially with a database that's your size.

    You did a good job of providing software versions for the important components (better than 90% of the people that ask questions on these forums...), but since your question relates specifically to the database, it might also be helpful to know how many items and triggers you have, and what your new values per second (NVPS) is. This might help anyone else that follows up with suggestions for how to improve your database situation for an environment of your size.


    • qZDnIhyEX
      Junior Member
      • Dec 2020
      • 7

      #3
      Tim, thank you for the detailed information. Come Monday, I'm going to look into the things you have mentioned. Because growth has been constant, I think it's safe to assume housekeeper is not running correctly or not turned on. I'll have to look into that.


      • qZDnIhyEX
        Junior Member
        • Dec 2020
        • 7

        #4
        Housekeeper is set at 365d, but the history_uint and history tables are what's eating up the space. They are roughly at: history_uint = 800GB and history = 200GB.

        How can I verify Housekeeper is actually working?


        • qZDnIhyEX
          Junior Member
          • Dec 2020
          • 7

          #5
          Manually querying the history table shows the oldest timestamp is from June of 2020, so it appears housekeeper is working.
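
          For reference, the check was something along these lines (the exact query may have differed):

          Code:
          -- Oldest value still present in the history table (clock is a Unix epoch):
          SELECT FROM_UNIXTIME(MIN(clock)) AS oldest_value FROM history;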

          Hosts: 2000
          Items: 150,000
          Triggers: 40,000
          NVPS: 1800
          Last edited by qZDnIhyEX; 08-12-2020, 20:27.


          • tim.mooney
            Senior Member
            • Dec 2012
            • 1427

            #6
            Originally posted by qZDnIhyEX
            the history_uint and history tables are what's eating up the space. They are roughly at: history_uint = 800GB and history = 200GB.
            I would definitely expect those to be the largest tables, so no surprises there.

            Based on your later post, you might want to plug those numbers into the sizing formula I linked earlier, and see if the sizing prediction matches your size.
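
             As a rough sketch of that arithmetic, using your reported 1800 NVPS and the ~50 bytes per stored history value that the sizing docs use as a ballpark (check the sizing page for the exact per-value figures for your setup):

             Code:
             -- Approximate steady-state history size in GiB for a given retention,
             -- e.g. 30 days of history at 1800 new values per second:
             SELECT 1800 * 86400 * 30 * 50 / 1024 / 1024 / 1024 AS approx_history_gib;
             -- ~217 GiB; at 6 months of retention the same arithmetic already
             -- exceeds a terabyte.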

            It's definitely possible that everything is working as expected. If that's the case, the only way to reduce space requirements is to reduce your history storage period for at least some of your items.


            • yurtesen
              Senior Member
              • Aug 2008
              • 130

              #7
               You may be collecting too much data, too frequently, with too long a history. Even so, there must be something causing it to grow 50 GB a month, because it should not grow forever.

              There are several things you can do.
               1- Increase item update intervals where you can. https://www.zabbix.com/documentation...#configuration
               2- Use `Discard unchanged` preprocessing on data which does not change often but is collected frequently. https://www.zabbix.com/documentation.../preprocessing
               3- Use the history period override to reduce history to fewer days for all items. https://www.zabbix.com/documentation...al#housekeeper
               You say your history goes back as far as June of 2020. That is 6 months! The default for most items is normally 1 month. Read:

              The general strong advice is to keep history for the smallest possible number of days and that way not to overload the database with lots of historical values.

              Instead of keeping a long history, you can keep longer data of trends. For example, you could keep history for 14 days and trends for 5 years.

              You can get a good idea of how much space is required by history versus trends data by referring to the database sizing page.

              While keeping shorter history, you will still be able to review older data in graphs, as graphs will use trend values for displaying older data.
               Even so, InnoDB won't get smaller. `innodb_file_per_table` is the default in new MySQL installations and it is a good option to have in these situations. But you will never get ibdata itself to shrink; AFAIK you need a full export/import on MySQL. (If you were using PostgreSQL, you could vacuum it.)


              • qZDnIhyEX
                Junior Member
                • Dec 2020
                • 7

                #8
                 Playing around with NVPS and the size requirements documentation you sent, it's possible that we are finally hitting our max DB size, provided we do not add any more items or increase our history period. Unfortunately, I don't see us being able to significantly decrease the total DB size.

                I looked at every item attached to a template, and the vast majority were set to 1 or 2 weeks. Only a handful had more than 90 days. The 365 days value was in the main configuration. Going to play with numbers more tomorrow.

                 I’m going to see if “Discard unchanged” will help us as well.

                Many thanks to both of you for detailed assistance. It’s already helped me better understand what’s going on here.

