Hello all.
Two days ago I've removed couple of hosts inherited from 'linux host' template, thus having hundred of items each. Next day I noticed poor database performance. On taking a look at pg_stat_activity i've noticed that zabbix still removes history for removed items. It does it this way:
Which is far from optimal, as can be seen:
I've had history table with 27 mln rows in it. Sequence scan in the table of 27e6 rows again and again for each item with subquery is really very bad idea. As can be seen, plain delete is much better:
And batch removal of old items is:
I understand that such deletion is made to not lock table by the long removal process. But Postgres is not MyISAM with its table-level locking (and neither InnoDB is - for mysql users). Overloading DB with needless subselect and seqscans doesn't play well with comparatively big tables.
Also, why there's no foreign keys for itemid in history table? Why should zabbix control referential integrity, if it's database's job?
Two days ago I've removed couple of hosts inherited from 'linux host' template, thus having hundred of items each. Next day I noticed poor database performance. On taking a look at pg_stat_activity i've noticed that zabbix still removes history for removed items. It does it this way:
delete from history where oid in (select oid from history where itemid=23511 limit 500);
zabbix=# explain delete from history where oid in (select oid from history where itemid=23511 limit 500); QUERY PLAN
---------------------------------------------------------------------------------------------------
Hash Semi Join (cost=1937.10..441472.82 rows=200 width=6)
Hash Cond: (public.history.oid = public.history.oid)
-> Seq Scan on history (cost=0.00..394651.26 rows=17095326 width=10)
-> Hash (cost=1930.85..1930.85 rows=500 width=4)
-> Limit (cost=0.00..1925.85 rows=500 width=4)
-> Index Scan using history_1 on history (cost=0.00..94748.08 rows=24599 width=4)
Index Cond: (itemid = 23511)
(7 rows)
---------------------------------------------------------------------------------------------------
Hash Semi Join (cost=1937.10..441472.82 rows=200 width=6)
Hash Cond: (public.history.oid = public.history.oid)
-> Seq Scan on history (cost=0.00..394651.26 rows=17095326 width=10)
-> Hash (cost=1930.85..1930.85 rows=500 width=4)
-> Limit (cost=0.00..1925.85 rows=500 width=4)
-> Index Scan using history_1 on history (cost=0.00..94748.08 rows=24599 width=4)
Index Cond: (itemid = 23511)
(7 rows)
zabbix=# explain delete from history where itemid=23511;
QUERY PLAN
------------------------------------------------------------------------------
Bitmap Heap Scan on history (cost=529.22..71520.79 rows=24599 width=6)
Recheck Cond: (itemid = 23511)
-> Bitmap Index Scan on history_1 (cost=0.00..523.07 rows=24599 width=0)
Index Cond: (itemid = 23511)
(4 rows)
QUERY PLAN
------------------------------------------------------------------------------
Bitmap Heap Scan on history (cost=529.22..71520.79 rows=24599 width=6)
Recheck Cond: (itemid = 23511)
-> Bitmap Index Scan on history_1 (cost=0.00..523.07 rows=24599 width=0)
Index Cond: (itemid = 23511)
(4 rows)
zabbix=# explain delete from history where itemid not in (select itemid from items);
QUERY PLAN
--------------------------------------------------------------------
Seq Scan on history (cost=420.50..437810.08 rows=8547663 width=6)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Seq Scan on items (cost=0.00..417.60 rows=1160 width=8)
(4 rows)
QUERY PLAN
--------------------------------------------------------------------
Seq Scan on history (cost=420.50..437810.08 rows=8547663 width=6)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Seq Scan on items (cost=0.00..417.60 rows=1160 width=8)
(4 rows)
Also, why there's no foreign keys for itemid in history table? Why should zabbix control referential integrity, if it's database's job?
, the history table was VERY large with many thousands of records to be cleaned up. The housekeeper would take several days to clean this up if it just did a standard
Comment