Housekeeper isn't cleaning


cperera
24-09-2007, 17:01
Hi folks,

From what I gather the housekeeper process doesn't appear to be cleaning old history.

For example, let's take a look at the item with itemid 100000000019023:

mysql> select itemid,type,delay,history,trends from items where itemid = 100000000019023\G
*************************** 1. row ***************************
itemid: 100000000019023
type: 0
delay: 30
history: 7
trends: 365


So if I understand correctly, we should have at most seven days of data collected twice a minute, which is 7*24*60*2 = 20160 rows.

But we have a lot more than that:
mysql> select count(*) from history_uint where itemid = 100000000019023\G
*************************** 1. row ***************************
count(*): 165281
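
As a rough check of the arithmetic above, here is a minimal Python sketch that derives the expected upper bound from the item's delay and history values and compares it with the observed count. The helper name expected_max_rows is made up for illustration, and it assumes exactly one row is stored per polling interval.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def expected_max_rows(delay_seconds, history_days):
    """Upper bound on stored rows: one value per polling interval
    for history_days days (assumes a fixed polling interval)."""
    return history_days * SECONDS_PER_DAY // delay_seconds

expected = expected_max_rows(delay_seconds=30, history_days=7)
observed = 165281  # count(*) from history_uint shown above

print(expected)                        # 20160
print(round(observed / expected, 1))   # 8.2
```

So the table holds roughly eight times more rows for this item than the 7-day setting should allow.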


Comparing the value of clock with the system time:
> date +%s
1190645639

mysql> select * from history_uint where itemid = 100000000019023 order by clock limit 1\G
*************************** 1. row ***************************
itemid: 100000000019023
clock: 1185032586
value: 35143680

So the oldest row turns out to be about 64 days old:
(1190645639-1185032586)/(60*60*24)
64
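
The same age calculation can be sketched in Python, which also shows the calendar date of the oldest row. This is only an illustration using the two timestamps quoted above; the function name age_in_days is hypothetical.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 24 * 60 * 60

def age_in_days(now_epoch, oldest_epoch):
    """Age of the oldest row in (fractional) days."""
    return (now_epoch - oldest_epoch) / SECONDS_PER_DAY

now_epoch = 1190645639     # output of `date +%s`
oldest_clock = 1185032586  # smallest clock value in history_uint

print(round(age_in_days(now_epoch, oldest_clock), 2))  # 64.97
# Calendar date (UTC) of the oldest stored value
print(datetime.fromtimestamp(oldest_clock, tz=timezone.utc).date())  # 2007-07-21
```

Either way, the oldest value is far older than the 7 days configured for the item.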

The server log shows that the housekeeper did run:
grep -i house zabbix_server.log
20320:20070924:094722 server #57 started [Housekeeper]
20320:20070924:094722 Executing housekeeper
20320:20070924:104845 Executing housekeeper


Am I missing something?


- Charith

Alexei
24-09-2007, 17:06
The housekeeper does not delete all outdated data in one go. Do you run 1.4.2, btw?

cperera
24-09-2007, 17:13
Hi Alexei,

Yes we're running 1.4.2:
ZABBIX Server (daemon) v1.4.2 (20 August 2007)

We've been running this version since shortly after it was released. How can we verify that it's cleaning any data, or whether it's encountering some sort of error?



- Charith.

Alexei
24-09-2007, 17:29
cperera wrote: "How can we verify that it's cleaning any data or if it's encountering some sort of error?"

Just execute this now:

select count(*) from history_uint where itemid = 100000000019023


then run it again tomorrow (or after 5 hours) and compare the results.

cperera
24-09-2007, 18:38
Will do. Thanks.



- Charith

cperera
25-09-2007, 17:52
Alexei,

Looks like nothing has been removed since we updated this thread yesterday:

mysql> select count(*) from history_uint where itemid = 100000000019023\G
*************************** 1. row ***************************
count(*): 168049


mysql> select * from history_uint where itemid = 100000000019023 order by clock limit 1\G
*************************** 1. row ***************************
itemid: 100000000019023
clock: 1185032586
value: 35143680


The housekeeper has been running diligently:
grep -i house zabbix_server.log
20320:20070924:094722 server #57 started [Housekeeper]
20320:20070924:094722 Executing housekeeper
20320:20070924:104845 Executing housekeeper
..
..
..
20320:20070925:081037 Executing housekeeper
20320:20070925:091139 Executing housekeeper
20320:20070925:101242 Executing housekeeper
20320:20070925:111345 Executing housekeeper
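
A quick sanity check on the two counts, as a minimal Python sketch. It assumes the item really is collected every 30 seconds, and the helper name rows_to_hours is made up for illustration.

```python
def rows_to_hours(row_delta, delay_seconds):
    """Hours of collection that a given number of new rows represents."""
    return row_delta * delay_seconds / 3600

first_count = 165281   # yesterday's count(*)
second_count = 168049  # today's count(*)
delta = second_count - first_count

print(delta)                                # 2768
print(round(rows_to_hours(delta, 30), 1))   # 23.1
```

The table grew by about a day's worth of new values, which is what we'd expect if new rows keep arriving and nothing old is deleted.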

cperera
25-09-2007, 17:56
I suppose this is more helpful, but I'm not sure why it's happening:

20320:20070924:230114 Executing housekeeper
20320:20070924:230116 Deleted 0 records from history and trends
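
To see how much each run deleted without reading the whole log, something like the following Python sketch could be used. It assumes every summary line looks exactly like the one above (pid:date:time followed by "Deleted N records from history and trends"); the function name deleted_counts is hypothetical.

```python
import re

# Matches lines such as:
# 20320:20070924:230116 Deleted 0 records from history and trends
LINE_RE = re.compile(
    r"^(\d+):(\d{8}):(\d{6}) Deleted (\d+) records from history and trends$"
)

def deleted_counts(lines):
    """Return (date, time, deleted_rows) for each housekeeper summary line."""
    results = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            _pid, date, time, count = match.groups()
            results.append((date, time, int(count)))
    return results

sample = [
    "20320:20070924:230114 Executing housekeeper",
    "20320:20070924:230116 Deleted 0 records from history and trends",
]
print(deleted_counts(sample))  # [('20070924', '230116', 0)]
```

In practice the lines would come from zabbix_server.log instead of the sample list.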



- Charith

Calimero
25-09-2007, 18:00
What if you run Zabbix Server with debug-level logging?
You'll see all the SQL queries that Zabbix executes.

cperera
25-09-2007, 18:12
Calimero wrote: "What if you run Zabbix Server with debug-level logging? You'll see all the SQL queries that Zabbix executes."

Trying that right now, talk about verbose! :)

Alexei
25-09-2007, 18:59
cperera wrote: "So if I understand correctly, we should have at most seven days of data collected twice a minute, which is 7*24*60*2 = 20160"

It actually depends on your item-level settings, though. What is the configuration of the item with itemid=100000000019023? Are you sure 'Keep history' is set to 7 days?

cperera
25-09-2007, 19:20
Alexei wrote: "It actually depends on your item-level settings, though. What is the configuration of the item with itemid=100000000019023? Are you sure 'Keep history' is set to 7 days?"

Yes. See below.

mysql> select itemid,delay,history,trends from items where itemid = 100000000019023\G
*************************** 1. row ***************************
itemid: 100000000019023
delay: 30
history: 7
trends: 365

cperera
25-09-2007, 21:47
I checked the 2nd installation of Zabbix that we have and housekeeping seems to be working fine on that node:

10688:20070924:220723 Executing housekeeper
10688:20070924:220745 Deleted 447243 records from history and trends

The only differences in zabbix_server.conf are the nodeids (the 2nd installation is running as nodeid 2 because we tried to link them previously) and the number of StartPollers and StartTrappers.

The other obvious difference is the housekeeper table. Node 1 (where it doesn't work):
mysql> select count(*) from housekeeper\G
*************************** 1. row ***************************
count(*): 3626

Whereas on node 2:
mysql> select count(*) from housekeeper\G
*************************** 1. row ***************************
count(*): 0


Looking at the contents of the housekeeper table:
mysql> select * from housekeeper where value = 100000000023930;
+-----------------+--------------+--------+-----------------+
| housekeeperid   | tablename    | field  | value           |
+-----------------+--------------+--------+-----------------+
| 100000000003623 | trends       | itemid | 100000000023930 |
| 100000000003624 | history_log  | itemid | 100000000023930 |
| 100000000003625 | history_uint | itemid | 100000000023930 |
| 100000000003626 | history_str  | itemid | 100000000023930 |
| 100000000003627 | history      | itemid | 100000000023930 |
+-----------------+--------------+--------+-----------------+


None of the tables mentioned contain any rows with that itemid.

cperera
28-09-2007, 22:57
I think we may have solved our problem. After upgrading to 1.4.2 the binaries were placed in /sbin/ but the init script was still looking at /bin/, so we were in fact running 1.4 and 1.4.1 in our two environments :) That's fixed now, and I see some housekeeping being done, slowly.