Suboptimal history deletion
  • EvilShadow
    Junior Member
    • Mar 2007
    • 13

    #1

    Suboptimal history deletion

    Hello all.
    Two days ago I removed a couple of hosts inherited from the 'linux host' template, each of which had hundreds of items. The next day I noticed poor database performance. Looking at pg_stat_activity, I noticed that zabbix was still removing history for the removed items. It does it this way:
    delete from history where oid in (select oid from history where itemid=23511 limit 500);
    Which is far from optimal, as can be seen:
    zabbix=# explain delete from history where oid in (select oid from history where itemid=23511 limit 500); QUERY PLAN
    ---------------------------------------------------------------------------------------------------
    Hash Semi Join (cost=1937.10..441472.82 rows=200 width=6)
    Hash Cond: (public.history.oid = public.history.oid)
    -> Seq Scan on history (cost=0.00..394651.26 rows=17095326 width=10)
    -> Hash (cost=1930.85..1930.85 rows=500 width=4)
    -> Limit (cost=0.00..1925.85 rows=500 width=4)
    -> Index Scan using history_1 on history (cost=0.00..94748.08 rows=24599 width=4)
    Index Cond: (itemid = 23511)
    (7 rows)
    My history table had 27 million rows in it. Doing a sequential scan over a 27-million-row table again and again, once per item, because of the subquery is a really bad idea. As can be seen, a plain delete is much better:
    zabbix=# explain delete from history where itemid=23511;
    QUERY PLAN
    ------------------------------------------------------------------------------
    Bitmap Heap Scan on history (cost=529.22..71520.79 rows=24599 width=6)
    Recheck Cond: (itemid = 23511)
    -> Bitmap Index Scan on history_1 (cost=0.00..523.07 rows=24599 width=0)
    Index Cond: (itemid = 23511)
    (4 rows)
    And batch removal of history for deleted items looks like this:
    zabbix=# explain delete from history where itemid not in (select itemid from items);
    QUERY PLAN
    --------------------------------------------------------------------
    Seq Scan on history (cost=420.50..437810.08 rows=8547663 width=6)
    Filter: (NOT (hashed SubPlan 1))
    SubPlan 1
    -> Seq Scan on items (cost=0.00..417.60 rows=1160 width=8)
    (4 rows)
    I understand that such deletion is done to avoid locking the table during a long removal process. But Postgres is not MyISAM with its table-level locking (and neither is InnoDB, for MySQL users). Overloading the DB with a needless subselect and seqscans doesn't play well with comparatively big tables.
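The contrast between the two deletion strategies can be sketched in miniature with SQLite standing in for PostgreSQL (`rowid` plays the role of `oid`/`ctid` here; the table layout and item ids are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (itemid INTEGER, clock INTEGER, value REAL)")
conn.execute("CREATE INDEX history_1 ON history (itemid)")
# 10 items x 1000 rows each
conn.executemany("INSERT INTO history VALUES (?, ?, 0.0)",
                 [(i % 10, i) for i in range(10_000)])

# Zabbix-style batched form: the subquery picks 500 row ids, and the
# outer DELETE has to match them back against the table.
cur = conn.execute(
    "DELETE FROM history WHERE rowid IN "
    "(SELECT rowid FROM history WHERE itemid = ? LIMIT 500)", (1,))
print(cur.rowcount)  # 500

# Plain form: the itemid index drives a single pass over the matching rows.
cur = conn.execute("DELETE FROM history WHERE itemid = ?", (2,))
print(cur.rowcount)  # 1000
```

In PostgreSQL, running EXPLAIN on the two forms shows the difference directly, as the plans quoted above illustrate.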
    Also, why are there no foreign keys for itemid in the history table? Why should zabbix manage referential integrity when that's the database's job?
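On the referential-integrity point, a foreign key with ON DELETE CASCADE would let the database do this cleanup itself. A minimal sketch, again with SQLite standing in for PostgreSQL (the PRAGMA is SQLite-specific, and the schema is a guess at the relevant columns, not Zabbix's real DDL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite disables FK enforcement by default
conn.execute("CREATE TABLE items (itemid INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE history ("
    "  itemid INTEGER REFERENCES items (itemid) ON DELETE CASCADE,"
    "  clock INTEGER, value REAL)")
conn.execute("INSERT INTO items VALUES (23511)")
conn.executemany("INSERT INTO history VALUES (23511, ?, 0.0)",
                 [(t,) for t in range(100)])

# Deleting the item removes its history rows automatically.
conn.execute("DELETE FROM items WHERE itemid = 23511")
print(conn.execute("SELECT count(*) FROM history").fetchone()[0])  # 0
```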
  • richlv
    Senior Member
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Oct 2005
    • 3112

    #2
    you might want to follow https://support.zabbix.com/browse/ZBX-1949 - it is highly likely that in 1.8.2 you will be able to remove the limit completely.

    note that you are on your own if you do - deleting lots of hosts with lots of items will result in lots of row removals
    Zabbix 3.0 Network Monitoring book


    • EvilShadow
      Junior Member
      • Mar 2007
      • 13

      #3
      Originally posted by richlv
      it is highly likely that in 1.8.2 you will be able to remove limit completely.
      Great, thank you. But the limit is not the problem. The problem is the way the system removes rows - by selecting a set of oids in a subquery and matching them back with a seqscan.
      Also, I don't understand what this subquery does:
      select oid from history where itemid=23511 limit 500
      It looks like it is meant to remove the oldest 500 entries. It will do so in MySQL, as MySQL tends to hit rows in the order they were inserted. But in Postgres it means 'any 500 rows', as a SELECT without ORDER BY is not guaranteed to return rows in any particular order.
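If the intent really is "oldest 500 first", the subquery needs an explicit ORDER BY on the timestamp column. A sketch of the corrected pattern (SQLite again; `clock` is the history timestamp column, everything else is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (itemid INTEGER, clock INTEGER)")
conn.executemany("INSERT INTO history VALUES (23511, ?)",
                 [(t,) for t in range(1000)])

# ORDER BY clock makes "oldest 500" explicit instead of relying on
# whatever physical order the engine happens to scan.
conn.execute(
    "DELETE FROM history WHERE rowid IN "
    "(SELECT rowid FROM history WHERE itemid = ? ORDER BY clock LIMIT 500)",
    (23511,))
oldest_left = conn.execute("SELECT min(clock) FROM history").fetchone()[0]
print(oldest_left)  # 500
```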


      • untergeek
        Senior Member
        Zabbix Certified Specialist
        • Jun 2009
        • 512

        #4
        I'm no Oracle DBA, but I'd be willing to bet that the same is true in most other RDBMSes. Does the Zabbix team have someone in-house optimizing queries for each platform, or is it all one-size-fits-all?


        • richlv
          Senior Member
          Zabbix Certified Trainer
          Zabbix Certified Specialist
          Zabbix Certified Professional
          • Oct 2005
          • 3112

          #5
          Originally posted by EvilShadow
          Also, I don't understand what this subquery does:

          It looks like it is meant to remove the oldest 500 entries. It will do so in MySQL, as MySQL tends to hit rows in the order they were inserted. But in Postgres it means 'any 500 rows', as a SELECT without ORDER BY is not guaranteed to return rows in any particular order.
          if it is just removing 500 entries past the data retention threshold, i suppose ordering does not matter that much
          Zabbix 3.0 Network Monitoring book


          • scalft
            Junior Member
            • Apr 2008
            • 12

            #6
            From what I remember back in the Zabbix 1.4 days, the housekeeper was once broken and would not remove stale records. When this was repaired, the history table was VERY large, with many thousands of records to be cleaned up. The housekeeper would take several days to clean this up if it just did a standard
            Code:
            delete from history where itemid=XXXX and date/time >YYYY
            (pseudo-SQL). The limit logic was added so the housekeeper would actually finish in a reasonable amount of time. This slowly reduced the size of the history table and left the database with sufficient performance to handle the rest of its tasks after the housekeeper finished.
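The trade-off described here - finish eventually, but keep the database responsive between batches - is usually written as a loop that deletes a bounded batch, commits, and yields. A hedged sketch of that shape (batch size and pause are made-up tuning knobs, not Zabbix's actual values; SQLite stands in for the real backend):

```python
import sqlite3
import time

def housekeep(conn, itemid, batch=500, pause=0.0):
    """Delete one item's history in bounded batches, yielding between them."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM history WHERE rowid IN "
            "(SELECT rowid FROM history WHERE itemid = ? LIMIT ?)",
            (itemid, batch))
        conn.commit()          # short transactions keep locks short
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
        time.sleep(pause)      # let other queries run between batches

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (itemid INTEGER, clock INTEGER)")
conn.executemany("INSERT INTO history VALUES (7, ?)",
                 [(t,) for t in range(1234)])
removed = housekeep(conn, 7)
print(removed)  # 1234
```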

