PostgreSQL 10 Native Range Partitioning on Clock Column for Zabbix 3.4+

  • doctorbal82
    Member
    • Oct 2016
    • 39

    #1


    I am currently running my Zabbix DB on PostgreSQL 9.4 but plan on upgrading to PostgreSQL 10 soon.

    PostgreSQL 10 comes with native partitioning. This is better than inheritance-based partitioning because it avoids the triggers and functions that caused issues in our database in the past.

    I am looking for the best way to automate the range partitioning on the clock column for the history and trend tables.

    We plan on partitioning the database beforehand and then restoring the data into it, so that all the data ends up in the correct partitions. I tested this manually with the SQL queries below.

    Would it be wiser to write an SQL function to perform the steps below, or to use a shell or Python (python-psycopg2) script?

    Zabbix.org provides some partitioning scripts for PostgreSQL, but they are all trigger-based, which is something we want to avoid.

    Ideally the program would find the oldest epoch/date value in the clock column and then create weekly (or monthly) partitions from that date onward. The script would run weekly and drop partitions older than 31 days for history and 365 days for trends.
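    For the retention side of that, the cutoff epochs can be computed with GNU date (a sketch; assumes the box is on UTC, as recommended later in this thread):

    ```shell
    # UTC epoch cutoffs for the retention policy described above (GNU date).
    history_cutoff=$(date -u -d '31 days ago' +%s)   # drop history partitions entirely older than this
    trend_cutoff=$(date -u -d '365 days ago' +%s)    # drop trend partitions entirely older than this
    echo "history cutoff: $history_cutoff, trend cutoff: $trend_cutoff"
    ```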

    Any suggestions on how to get this done using the native range partitioning that comes packaged with PostgreSQL 10?

    Code:
    -- Drop the Zabbix public history table because we need to create a new one with partitions
    DROP TABLE if exists public.history;
    
    -- Create a ranged partitioned table
    CREATE TABLE public.history
    (
        itemid bigint NOT NULL,
        clock integer NOT NULL DEFAULT 0,
        value numeric(16,4) NOT NULL DEFAULT 0.0000,
        ns integer NOT NULL DEFAULT 0
    ) PARTITION BY RANGE (clock);
    
    GRANT ALL ON TABLE public.history TO postgres;
    
    -- Create the leaf partitions. Each partition needs the correct permissions.
    -- You need to constrain each leaf partition individually, since primary keys are not supported on partitioned tables in PostgreSQL 10.
    
    CREATE TABLE public.history_y2017 PARTITION OF public.history
        FOR VALUES FROM (MINVALUE) TO (1514764800);
    GRANT ALL ON TABLE public.history_y2017 TO postgres;
    
    CREATE TABLE public.history_y2018m01 PARTITION OF public.history
        FOR VALUES FROM (1514764800) TO (1517443200);
    GRANT ALL ON TABLE public.history_y2018m01 TO postgres;
    
    CREATE TABLE public.history_y2018m02 PARTITION OF public.history
        FOR VALUES FROM (1517443200) TO (1519862400);
    GRANT ALL ON TABLE public.history_y2018m02 TO postgres;
    
    CREATE TABLE public.history_y2018m03 PARTITION OF public.history
        FOR VALUES FROM (1519862400) TO (1522540800);
    GRANT ALL ON TABLE public.history_y2018m03 TO postgres;
    
    CREATE TABLE public.history_y2018m04 PARTITION OF public.history
        FOR VALUES FROM (1522540800) TO (1525132800);
    GRANT ALL ON TABLE public.history_y2018m04 TO postgres;
    
    CREATE TABLE public.history_y2018m05 PARTITION OF public.history
        FOR VALUES FROM (1525132800) TO (1527811200);
    GRANT ALL ON TABLE public.history_y2018m05 TO postgres;
    
    -- Create an Index on each partition created
    CREATE INDEX ON public.history_y2017 USING btree (itemid, clock);
    CREATE INDEX ON public.history_y2018m01 USING btree (itemid, clock);
    CREATE INDEX ON public.history_y2018m02 USING btree (itemid, clock);
    CREATE INDEX ON public.history_y2018m03 USING btree (itemid, clock);
    CREATE INDEX ON public.history_y2018m04 USING btree (itemid, clock);
    CREATE INDEX ON public.history_y2018m05 USING btree (itemid, clock);
    
    -- What about using date instead of epoch value such as 2017-01-01 00:00:00; would that be easier?
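    On the date-vs-epoch question in the last comment: range bounds must match the partition key's type, and clock is an integer, so the bounds have to stay epoch values. They can, however, be derived from calendar dates with GNU date rather than looked up by hand, for example for the m01 bounds above:

    ```shell
    # Derive the epoch bounds used above from calendar dates (GNU date, UTC).
    start=$(date -u -d '2018-01-01 00:00:00' +%s)   # 1514764800
    end=$(date -u -d '2018-02-01 00:00:00' +%s)     # 1517443200
    echo "FOR VALUES FROM ($start) TO ($end)"
    ```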
    Last edited by doctorbal82; 17-05-2018, 01:16.
  • doctorbal82
    Member
    • Oct 2016
    • 39

    #2
    The following is the solution I came up with for transitioning to PSQL 10 native range partitioning from a populated, unpartitioned PSQL 9.4 database.

    A. Create a Zabbix Empty PSQL 10 Database.

    Ensure you first create an empty Zabbix PSQL 10 DB.

    Code:
    # su postgres
    postgres@<>:~$ createuser -P -s -e zabbix
    postgres@<>:~$ psql
    postgres# create database zabbix;
    postgres# grant all privileges on database zabbix to zabbix;
    B. Create tables & Native Range Partitions on clock column

    Create the tables in the Zabbix DB and implement native range partitioning for the clock column. Below is an example of a manual SQL script that can be run for the history table. Perform this for all the history tables you want to partition by range.

    Code:
    CREATE TABLE public.history
    (
        itemid bigint NOT NULL,
        clock integer NOT NULL DEFAULT 0,
        value numeric(16,4) NOT NULL DEFAULT 0.0000,
        ns integer NOT NULL DEFAULT 0
    ) PARTITION BY RANGE (clock);
    
    CREATE TABLE public.history_old PARTITION OF public.history
        FOR VALUES FROM (MINVALUE) TO (1522540800);
    CREATE TABLE public.history_y2018m04 PARTITION OF public.history
        FOR VALUES FROM (1522540800) TO (1525132800);
    CREATE TABLE public.history_y2018m05 PARTITION OF public.history
        FOR VALUES FROM (1525132800) TO (1527811200);
    
    CREATE INDEX ON public.history_old USING btree (itemid, clock);
    CREATE INDEX ON public.history_y2018m04 USING btree (itemid, clock);
    CREATE INDEX ON public.history_y2018m05 USING btree (itemid, clock);
    C. Automate it!

    I used a shell script because it is one of the simplest ways to deal with creating new partitions in PSQL 10. Make sure you're always at least one partition ahead of what's needed.

    Let's call the script auto_history_tables_monthly.sh.

    On a Debian 8 flavor OS running PSQL 10, place the script in a suitable directory (I used /usr/local/bin), give it the correct ownership, and make it executable as the postgres user:
    Code:
    chown postgres:postgres /usr/local/bin/auto_history_tables_monthly.sh
    chmod u+x /usr/local/bin/auto_history_tables_monthly.sh

    Create a cron job (crontab -e) for postgres user with following:

    Code:
    0 0 1 * * /usr/local/bin/auto_history_tables_monthly.sh | psql -d zabbix
    This will run the shell script the first of every month.

    Below is the script. It uses the date command to compute UTC epoch values. It creates a partition one month in advance and drops the partition from two months back, which satisfies the 31 days of history retention customized to my needs. Ensure the PSQL 10 DB is on UTC time for this use case.

    Code:
    #!/bin/bash
    
    month_diff () {
        year=$1
        month=$2
        delta_month=$3
        x=$((12*$year+10#$month-1))
        x=$((x+$delta_month))
        ry=$((x/12))
        rm=$(((x % 12)+1))
        printf "%02d %02d\n" $ry $rm
    }
    
    month_start () {
        year=$1
        month=$2
        date '+%s' -d "$year-$month-01 00:00:00" -u
    }
    
    month_end () {
        year=$1
        month=$2
        month_start $(month_diff $year $month 1)
    }
    
    # Current year and month using date
    current_year=$(date +%Y)
    current_month=$(date +%m)
    
    # Math
    next_date=$(month_diff $current_year $current_month 1)
    next_year=$(echo $next_date|sed 's/ .*//')
    next_month=$(echo $next_date|sed 's/.* //')
    
    start=$(month_start $next_date)
    end=$(month_end $next_date)
    
    #next_month_table="public.history_y${next_year}m${next_month}"
    
    # Create next month table for history, history_uint, history_str, history_log, history_text
    sql="
        CREATE TABLE IF NOT EXISTS public.history_y${next_year}m${next_month} PARTITION OF public.history
          FOR VALUES FROM ($start) TO ($end);
        \nCREATE TABLE IF NOT EXISTS public.history_uint_y${next_year}m${next_month} PARTITION OF public.history_uint
          FOR VALUES FROM ($start) TO ($end);
        \nCREATE TABLE IF NOT EXISTS public.history_str_y${next_year}m${next_month} PARTITION OF public.history_str
          FOR VALUES FROM ($start) TO ($end);
        \nCREATE TABLE IF NOT EXISTS public.history_log_y${next_year}m${next_month} PARTITION OF public.history_log
          FOR VALUES FROM ($start) TO ($end);
        \nCREATE TABLE IF NOT EXISTS public.history_text_y${next_year}m${next_month} PARTITION OF public.history_text
          FOR VALUES FROM ($start) TO ($end);
        \nCREATE INDEX on public.history_y${next_year}m${next_month} USING btree (itemid, clock);
        \nCREATE INDEX on public.history_uint_y${next_year}m${next_month} USING btree (itemid, clock);
        \nCREATE INDEX on public.history_str_y${next_year}m${next_month} USING btree (itemid, clock);
        \nCREATE INDEX on public.history_log_y${next_year}m${next_month} USING btree (itemid, clock);
        \nCREATE INDEX on public.history_text_y${next_year}m${next_month} USING btree (itemid, clock);
        "
    
    echo -e $sql
    
    # Math
    prev_date=$(month_diff $current_year $current_month -2)
    prev_year=$(echo $prev_date|sed 's/ .*//')
    prev_month=$(echo $prev_date|sed 's/.* //')
    
    # Drop the partition from two months back for history, history_uint, history_str, history_log, history_text
    sql="
        DROP TABLE public.history_y${prev_year}m${prev_month};
        \nDROP TABLE public.history_uint_y${prev_year}m${prev_month};
        \nDROP TABLE public.history_str_y${prev_year}m${prev_month};
        \nDROP TABLE public.history_log_y${prev_year}m${prev_month};
        \nDROP TABLE public.history_text_y${prev_year}m${prev_month};
        "
    
    echo -e $sql
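    The month arithmetic is the easiest part to get wrong, so it can be sanity-checked in isolation (same function as in the script above):

    ```shell
    #!/bin/sh
    # Same month arithmetic as the script: map (year, month, delta) to (year, month).
    # 10#$month forces base 10 so leading zeros (08, 09) are not read as octal.
    month_diff () {
        year=$1
        month=$2
        delta_month=$3
        x=$((12*year + 10#$month - 1 + delta_month))
        printf '%02d %02d\n' $((x/12)) $(((x % 12) + 1))
    }

    month_diff 2018 12 1    # 2019 01 (rolls forward over the year boundary)
    month_diff 2018 01 -2   # 2017 11 (rolls backward over the year boundary)
    ```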
    D. Dump the data from the old database into the new one. I used pg_dump/pg_restore.

    I'm sure there are more sophisticated solutions out there, but I found this to be the simplest way to auto-partition the Zabbix database using the PostgreSQL 10 native range partitioning functionality.

    Please let me know if you need more details.
    Last edited by doctorbal82; 15-11-2018, 22:36.

    Comment

    • doctorbal82
      Member
      • Oct 2016
      • 39

      #3
      The shell script has been updated to include the rest of the history partitions I want partitioned, and it now works correctly. It also drops partitions older than 2 months instead of 1, which IMO is safer for a 31-day retention period set in the Administration | General | Housekeeping tab.

      Comment

      • TheSummer69
        Junior Member
        • Aug 2018
        • 1

        #4
        Thanks for sending me in the right direction, but I'm experiencing trouble because I didn't initially create the August partition, so now I keep getting:

        Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR: no partition of relation "history" found for row
        DETAIL: Partition key of the failing row contains (clock) = (1535210622).
        [insert into history (itemid,clock,ns,value) values (23262,1535210622,957377843,0.016943);

        Any way to quickly fix this with the right epoch timestamps?

        Comment

        • doctorbal82
          Member
          • Oct 2016
          • 39

          #5
          Originally posted by TheSummer69
          Thanks for sending me in the right direction , but i'm experiencing trouble because i haven't initially created the august partition , so now every time i get :

          Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR: no partition of relation "history" found for row
          DETAIL: Partition key of the failing row contains (clock) = (1535210622).
          [insert into history (itemid,clock,ns,value) values (23262,1535210622,957377843,0.016943);

          Any way to quicky fix this with the right EPOCH timestamps?
          To quickly fix this, I would suggest extending the manual SQL from earlier by adding a CREATE TABLE and a CREATE INDEX statement for each missing month, using https://www.epochconverter.com/ to find the boundary values.

          You need the epoch timestamp for 00:00:00 UTC on the 1st of each month; e.g. 1 August 2018 00:00:00 through 1 September 2018 00:00:00.
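          GNU date gives the same numbers without the website, as long as you force UTC:

          ```shell
          # First-of-month UTC epochs for the August 2018 partition bounds.
          date -u -d '2018-08-01 00:00:00' +%s   # 1533081600
          date -u -d '2018-09-01 00:00:00' +%s   # 1535760000
          ```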

          Here is a screenshot example.

          [Screenshot: epoch.png, showing the converted epoch values on epochconverter.com]

          Below are example SQL queries for June, July, August and September 2018:

          Code:
          CREATE TABLE public.history_y2018m06 PARTITION OF public.history FOR VALUES FROM (1527811200) TO (1530403200);
          CREATE TABLE public.history_y2018m07 PARTITION OF public.history FOR VALUES FROM (1530403200) TO (1533081600);
          CREATE TABLE public.history_y2018m08 PARTITION OF public.history FOR VALUES FROM (1533081600) TO (1535760000);
          CREATE TABLE public.history_y2018m09 PARTITION OF public.history FOR VALUES FROM (1535760000) TO (1538352000);
          Then create the indexes:

          Code:
          CREATE INDEX on public.history_y2018m06 USING btree (itemid, clock);
          CREATE INDEX on public.history_y2018m07 USING btree (itemid, clock);
          CREATE INDEX on public.history_y2018m08 USING btree (itemid, clock);
          CREATE INDEX on public.history_y2018m09 USING btree (itemid, clock);

          I'm sure there is a simpler way to script this together but at the moment that's what I would suggest. I'll look into a simpler means for the future.
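          One simpler means, sketched here, is to generate the backfill statements in a loop instead of writing them out by hand (the table name and months are just the ones from this thread; pipe the output into psql to apply it):

          ```shell
          #!/bin/sh
          # Generate CREATE TABLE / CREATE INDEX statements for a run of monthly
          # history partitions; bounds are first-of-month UTC epochs from GNU date.
          for m in 06 07 08 09; do
              next=$(printf '%02d' $((10#$m + 1)))
              start=$(date -u -d "2018-$m-01 00:00:00" +%s)
              end=$(date -u -d "2018-$next-01 00:00:00" +%s)
              echo "CREATE TABLE public.history_y2018m${m} PARTITION OF public.history FOR VALUES FROM ($start) TO ($end);"
              echo "CREATE INDEX ON public.history_y2018m${m} USING btree (itemid, clock);"
          done
          ```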

          Comment

          • sfl
            Junior Member
            • Jun 2016
            • 26

            #6
            Hi doctorbal82,

            I am in the same situation, migrating from PSQL 9.5.14 to PSQL 10.5 and from Zabbix 3.4.14 to 4.0.
            I created the partitioned history table, but when I restore the dump with pg_restore (pg_restore -j 5 -d zabbix db.dump), the data doesn't go into the partitions.

            Code:
            postgres=# \d+
                                          List of relations
             Schema |         Name          | Type  |  Owner   |    Size    | Description
            --------+-----------------------+-------+----------+------------+-------------
             public | history               | table | postgres | 0 bytes    |
             public | history_old           | table | postgres | 0 bytes    |
             public | history_str           | table | postgres | 0 bytes    |
             public | history_str_old       | table | postgres | 0 bytes    |
             public | history_str_y2018m09  | table | postgres | 0 bytes    |
             public | history_str_y2018m10  | table | postgres | 0 bytes    |
             public | history_text          | table | postgres | 8192 bytes |
             public | history_text_old      | table | postgres | 8192 bytes |
             public | history_text_y2018m09 | table | postgres | 8192 bytes |
             public | history_text_y2018m10 | table | postgres | 8192 bytes |
             public | history_uint          | table | postgres | 0 bytes    |
             public | history_uint_old      | table | postgres | 0 bytes    |
             public | history_uint_y2018m09 | table | postgres | 0 bytes    |
             public | history_uint_y2018m10 | table | postgres | 0 bytes    |
             public | history_y2018m09      | table | postgres | 0 bytes    |
             public | history_y2018m10      | table | postgres | 0 bytes    |
            (16 rows)
            
            postgres=# SELECT * FROM public.history_old;
             itemid | clock | value | ns
            --------+-------+-------+----
            (0 rows)
            
            postgres=# SELECT * FROM public.history_y2018m09;
             itemid | clock | value | ns
            --------+-------+-------+----
            (0 rows)
            
            postgres=# SELECT * FROM public.history_y2018m10;
             itemid | clock | value | ns
            --------+-------+-------+----
            (0 rows)





            I notice two things:
            1/ When I am connected to the zabbix database I don't see any of the partitions previously created. The partitions exist only in the postgres database; is that normal?
            2/ The partition owner is postgres. Maybe the zabbix user does not have sufficient privileges?

            Can you please point me in the right direction?

            Many thanks in advance,
            Sfl

            Comment


            • doctorbal82
              doctorbal82 commented
              Editing a comment
              Hello @sfl,

              The best practice when using PSQL 10 native range partitioning is to dump the partitioned data into a new Zabbix database that does not yet have the Zabbix schema deployed. Simply run "pg_restore -d zabbix /var/backups/postgresql/zabbix.dump" as the postgres user for the dump file. That is what worked for me in the past.

              I used "pg_dump -h <hostname> -d zabbix -Fc > /var/backups/postgresql/zabbix.dump".

              I had no issues with permissions, though other upgrades may behave differently and prove me wrong.

              Let me know if this information helps!

              Best Regards.
          • sfl
            Junior Member
            • Jun 2016
            • 26

            #7
            Hi @doctorbal82,

            DB partitioning was successful on the second try; the first time I was not connected to the zabbix DB.
            Data import and the ZBX upgrade to 4.0.1 are OK.

            I still have an issue when I want to delete an item prototype on a host or template: the PostgreSQL process crashes with a segfault and rolls back the transaction.
            I have made a specific post about it here:


            Any ideas are much appreciated,
            Sfl

            Comment


            • doctorbal82
              doctorbal82 commented
              Editing a comment
              Thanks for the update on your end sfl.

              I have not upgraded to Zabbix 4.0 yet; I am currently running 3.4.15 with native range partitioning on history* and trends*. Ensure you have indexes on all tables, as I realized my script didn't originally include them.

              I will explore the PostgreSQL process crash separately as I currently do not know the reason and have not experienced this myself.

              Best Regards,
              Andreas

            • doctorbal82
              doctorbal82 commented
              Editing a comment
              Sfl,

              I noticed in your post (https://www.zabbix.com/forum/zabbix-...item-prototype) that you are using PSQL version 11.

              There are some improvements on that PostgreSQL version for partitioning - https://www.postgresql.org/docs/curr...titioning.html.

              This includes *automatic index creation on partitioned parent tables* (the PSQL 10 script creates indexes manually, so there may be extra ones there that you don't need), *primary key support on partitioned tables*, and other feature improvements.

              I'm not sure how this impacts the script as I have not tested this yet but would explore how this works optimally with PSQL 11+ with Zabbix and test in a Development environment before deploying in your Production Zabbix instance.
              Last edited by doctorbal82; 24-11-2018, 00:23.
          • doctorbal82
            Member
            • Oct 2016
            • 39

            #8
            I am currently testing partitioning using the PostgreSQL 11 Native Partitioning improvements (https://www.postgresql.org/about/news/1894/) that have come with the release.

            I have also been testing pg_partman (https://github.com/pgpartman/pg_partman) to do the "heavy lifting" of maintenance of the partitions instead of the hacked together script above. It is a well maintained project and has PSQL 11 support with version 4.0.0.

            With PostgreSQL 11 there have been significant improvements in partitioning:

             * Automatic index creation on partitions, as long as the parent partitioned table has an index.
             * Default partition support, to catch rows that don't fall within any defined partition range.
             * Primary key support; this helped with the trend tables I wanted to partition, instead of having to drop the CONSTRAINT as was necessary on PostgreSQL 10, which didn't support primary keys on partitioned parent tables.
             * Improved partitioning algorithms, and thus improved speed.

             I am also exploring BRIN (Block Range) indexes instead of the default btree on the history* tables, which can potentially save a great deal of space and improve performance when data is naturally ordered, as it is in time-series logging.
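             A minimal sketch of what that could look like on one of the partitions from earlier in the thread (the index name is illustrative, and I haven't benchmarked this here, so test on a copy first):

             ```shell
             # Emit SQL to try a BRIN index on one history partition. BRIN stores only
             # per-block min/max values, so it stays far smaller than btree when rows
             # arrive in clock order. Pipe the output into `psql -d zabbix` to apply.
             cat <<'SQL'
             CREATE INDEX history_y2018m05_brin
                 ON public.history_y2018m05 USING brin (itemid, clock);
             SQL
             ```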

             I'll upload the notes once they are properly tested; it usually takes about a month of heavy I/O processing to judge stability.

            Comment


            • r.monakhov
              r.monakhov commented
              Editing a comment
               Hi, doctorbal82! Have you tested native range partitioning on PostgreSQL 11 yet? I've moved to PostgreSQL 11 after MariaDB and started deploying a new large installation (approximately 5000+ NVPS).

            • doctorbal82
              doctorbal82 commented
              Editing a comment
              Hello r.monakhov, yes I have. I will upload my notes to github and share soon!
          • doctorbal82
            Member
            • Oct 2016
            • 39

            #9
            I uploaded my notes to github here - https://github.com/Doctorbal/zabbix-...s-partitioning.

             r.monakhov I have had this implemented in our production instance for over 3 months now and it works very well. With Zabbix 4.2 there is TimescaleDB, which offers time-based partitioning in PostgreSQL, but I preferred pg_partman at the time since we are on Zabbix 3.4 and will soon upgrade to 4.0.

             Let me know if you have any questions! I'm always keen to hear where I missed a detail.

            Comment


            • r.monakhov
              r.monakhov commented
              Editing a comment
              Thank you for your work. I'll try to use partman with your information from github soon.