Table Partitioning on Zabbix 2.2

  • yuusou
    Junior Member
    • Feb 2014
    • 24

    #1

    Table Partitioning on Zabbix 2.2

    Has anyone tried table partitioning on Zabbix 2.2? If so, could you write me a quick rundown on how to do it?

    I did a more-or-less clean install (export then delete all hosts, clear history/trends/acknowledges, dump, drop and recreate the database, re-add hosts) when the 2.2.2 RPMs came out, and this is how big my database has grown since then:
    Code:
    #$ sudo find /var/lib/mysql/zabbix/ -type f -size +100M -exec ls -lh {} \; | awk '{print $5"\t"$9}'
    660M	/var/lib/mysql/zabbix/history_text.ibd
    27G	/var/lib/mysql/zabbix/history_uint.ibd
    1,1G	/var/lib/mysql/zabbix/history.ibd
    912M	/var/lib/mysql/zabbix/history_str.ibd
    904M	/var/lib/mysql/zabbix/trends_uint.ibd
    I can't really cut back on the amount of data I'm fetching. Housekeeping is simply impossible. I want to do daily partitioning on the history* tables and monthly on trends* tables.

    I can make a script myself, no problem (and share it with whoever wants it). I'm more worried about all the foreign keys.
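
    For the sake of discussion, something along these lines is what I have in mind for the daily partitions (table, partition names and dates are only illustrative, nothing I've tested against the 2.2 schema yet):
    Code:
    -- range-partition a history table by day on the clock column
    ALTER TABLE history_uint
    PARTITION BY RANGE (clock) (
        PARTITION p2014_03_03 VALUES LESS THAN (UNIX_TIMESTAMP('2014-03-04 00:00:00')),
        PARTITION p2014_03_04 VALUES LESS THAN (UNIX_TIMESTAMP('2014-03-05 00:00:00'))
    );
    A daily script would then keep creating the next day's partition in advance and dropping the expired ones.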
  • steveboyson
    Senior Member
    • Jul 2013
    • 582

    #2
    Hmmm, yeah, that looks very familiar from our site.

    I would also be interested in a more detailed setup description for MySQL partitioning.

    So far we have come to the conclusion that nothing is as boring as old data. So we decided to keep at most 90 days of history and 180 days of trends, with a minimum interval of 60s (only for the really critical metrics), while the vast majority of items have an interval of 120 or 300 seconds.

    I know that this does not help you much, but at least it keeps our DB at a reasonable size.

    Partitioning is worth a try and I - not being a DB expert - would really like to see a good "hands on" example.

    P.S. Housekeeping:
    Our HK runs hourly and takes between 2 and 3 minutes, so for us it is reasonable. But we are still on 2.0.10 in our main Zabbix site. How long does your housekeeper run?
    Last edited by steveboyson; 04-03-2014, 10:06. Reason: housekeeper


    • yuusou
      Junior Member
      • Feb 2014
      • 24

      #3
      When it starts, it never stops.

      I'm using Percona XtraDB Cluster 5.6 for a Zabbix high-availability setup. The problem is that housekeeping generates so many transactions that it just grinds replication to a halt.

      To summarize my setup, I've got Percona on two servers (and a third as an arbitrator), and Zabbix is also on these servers. If Percona or the server itself goes down, Zabbix starts on the second server. Even after defining well-balanced flow control and queue settings in Percona, it just can't keep up when housekeeping kicks in.
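
      By flow control and queue settings I mean things along these lines on the cluster nodes; the values are only what we have been experimenting with, not a recommendation:
      Code:
      -- loosen Galera flow control so a node tolerates bigger write bursts before pausing replication
      SET GLOBAL wsrep_provider_options = 'gcs.fc_limit=256; gcs.fc_factor=0.8';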


      • steveboyson
        Senior Member
        • Jul 2013
        • 582

        #4
        Uuuh. Does not sound very good.

        What is your current "new values per second" count?
        If it is not that high I would suspect a slow storage subsystem - could that be?

        We are running Zabbix on ESX 5.5 with the web server, Zabbix and DB on the same host. While HK is active it usually consumes less than 1000 IOPS on the storage and we see no performance degradation at all. But it is not clustered at the application level; only the storage is redundant.


        • yuusou
          Junior Member
          • Feb 2014
          • 24

          #5
          I have no performance issues with housekeeping switched off. My NVPS is currently at almost 400, but it will grow exponentially (I haven't added my network devices yet).

          I'm running everything on a single host and it works fine. Replication to the second host plus HK is what kills the service, but it's not a network issue because it's not generating much traffic. It seems HK deletes rows one at a time (in batches of 500, as defined in the config file), meaning each delete has to be replicated to the second server, one row at a time.

          This is where my idea for partitioning comes in: partition the tables by date once a day and drop any partition older than 30 days. What worries me is how to do this without running into issues with the foreign keys.
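
          Roughly, the daily maintenance would boil down to statements like these (table, partition names and dates are only illustrative):
          Code:
          -- add tomorrow's partition in advance, then drop anything older than 30 days
          ALTER TABLE history_uint ADD PARTITION
              (PARTITION p2014_03_05 VALUES LESS THAN (UNIX_TIMESTAMP('2014-03-06 00:00:00')));
          ALTER TABLE history_uint DROP PARTITION p2014_02_03;
          A DROP PARTITION is a single DDL statement instead of millions of replicated row deletes, which is exactly what I'm after.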
          Last edited by yuusou; 04-03-2014, 11:14.


          • steveboyson
            Senior Member
            • Jul 2013
            • 582

            #6
            Oh, now I got it: you have a replicated SQL database (on the DB level).

            We did some testing with block-based replication via DRBD and Heartbeat (and now Corosync and Pacemaker).

            So far it looks promising, but the learning curve is quite steep.


            • yuusou
              Junior Member
              • Feb 2014
              • 24

              #7
              Ahh right! Sorry, I'm not very good at explaining myself.

              I have two hosts which run Percona XtraDB 5.6 and Zabbix 2.2.2. I have a third host as an Arbitrator because Percona requires a quorum. Even though Percona XtraDB is multi-master, I'm using it solely for failover.

              I'm running Keepalived for the Zabbix failover. When Percona dies (or the server itself dies), Zabbix server stops on the first host and starts on the second host.

              Zabbix is our eyes and ears, so it's critical that it's constantly on.

              I've currently set Housekeeping to every 24h and MaxHK to 50 to see if it becomes more manageable.
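
              In zabbix_server.conf terms that should be:
              Code:
              HousekeepingFrequency=24
              MaxHousekeeperDelete=50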

              The issues I have with HK on are mainly a really slow web GUI when either reading from or writing to the database. Requests get queued for a really long time.
              Last edited by yuusou; 04-03-2014, 11:51.


              • steveboyson
                Senior Member
                • Jul 2013
                • 582

                #8
                You are aware that the larger the HK interval is, the bigger the number of rows to delete per run?


                • yuusou
                  Junior Member
                  • Feb 2014
                  • 24

                  #9
                  I am aware, but I've tried smaller intervals to no avail. My current idea is to have housekeeping run late at night. I guess I'll put in a cron job to restart Zabbix in the middle of the night to force an initial housekeeping, and then see how it behaves the following nights after that initial clean-up.


                  • nms_user
                    Member
                    • Feb 2009
                    • 43

                    #10
                    Hello,

                    We are still on 2.0.x with MySQL (Percona) and table partitioning, and are planning to upgrade to 2.2.x soon.

                    I have no other information than that it is possible in 2.2 just like in 2.0, so we intend to keep using it. Partitioning simply does its job - fast and reliable. Much better than this housekeeping I/O beast ;-)

                    Regards


                    • yuusou
                      Junior Member
                      • Feb 2014
                      • 24

                      #11
                      Originally posted by nms_user
                      Hello,

                      We are still on 2.0.x with MySQL (Percona) and table partitioning, and are planning to upgrade to 2.2.x soon.

                      I have no other information than that it is possible in 2.2 just like in 2.0, so we intend to keep using it. Partitioning simply does its job - fast and reliable. Much better than this housekeeping I/O beast ;-)

                      Regards
                      Could you possibly write me a guide, or show me the guide you followed for partitioning, even though you're using 2.0? I know I read somewhere that the Zabbix devs try to keep their tables consistent across versions as much as possible.


                      • aib
                        Senior Member
                        • Jan 2014
                        • 1615

                        #12
                        Originally posted by yuusou
                        When it starts, it never stops.
                        Would you mind trying some different settings for the housekeeper?
                        Like what I have in my configuration:

                        Code:
                        HousekeepingFrequency=1
                        MaxHousekeeperDelete=100
                        Sincerely yours,
                        Aleksey


                        • yuusou
                          Junior Member
                          • Feb 2014
                          • 24

                          #13
                          That's what I had before with 2.2.1. It just meant the platform was constantly, terribly slow.

                          I had forgotten to turn HK back on for my cron job, but I turned it on manually and restarted the server late in the evening. It took _four hours_ to complete (max delete at 50).

                          Though technically it was the first run with a lot of historical data, so I'll be restarting the server again to force HK and see how long it takes. This time, with my cronjob.


                          • nms_user
                            Member
                            • Feb 2009
                            • 43

                            #14
                            Originally posted by yuusou
                            Could you possibly write me a guide, or show me the guide you followed for partitioning, even though you're using 2.0? I know I read somewhere that the Zabbix devs try to keep their tables consistent across versions as much as possible.
                            We have had partitioning running since the 1.8.x days, and I followed these guides: http://zabbixzone.com/zabbix/partitioning-tables/ and http://linuxnotes.us/archives/503

                            With the upgrade to 2.0 I had to adjust the script (and thus the stored procedures and the database itself). No big change, just removing partitioning from the tables affected by foreign keys (acknowledges, alerts, auditlog, events, service_alarms). Look at the replies to the above-mentioned howto.
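
                            If you want to double-check against your own 2.2 schema, a query along these lines lists the foreign key relationships; tables on either side of one are the ones MySQL won't let you partition (the 'zabbix' schema name is just the usual default):
                            Code:
                            SELECT DISTINCT TABLE_NAME, REFERENCED_TABLE_NAME
                            FROM information_schema.KEY_COLUMN_USAGE
                            WHERE TABLE_SCHEMA = 'zabbix'
                              AND REFERENCED_TABLE_NAME IS NOT NULL;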

                            I don't actually know of any DB schema changes in 2.2 that would prevent partitioning the way we do it in 2.0.

                            To all the readers: Are you aware of a showstopper?

                            Regards
                            Last edited by nms_user; 06-03-2014, 10:54.


                            • yuusou
                              Junior Member
                              • Feb 2014
                              • 24

                              #15
                              I also just came to the realization that, if your timer for housekeeping is 12 hours, it'll start 12 hours after the previous housekeeping job finished. That's even more inconvenient if it takes 4 hours to complete.

