1.1alpha10 active check causes agent death


mvoss
10-07-2005, 05:03
I've read a few messages in the forum about clients dying when an item is switched from non-active to active. That does seem to happen to me.

However, my dilemma is that they continue to die periodically, which is a major problem. Restarting the server and clients does not fix it.

Usually one of the children dies shortly after puking on an active check, and the whole agent exits with it. Many times the reported active check key is significantly corrupted (for example, in Example #1 below it is supposed to be system[hostname], not em[hostname]). At the same time as the client dies, the server says "Can't ignore signal CHLD, forcing to default." Any ideas? I'm pretty lost.

Example #1:

000504:20050709:120203 Active check [em[hostname]] is not supported. Disabled.
000498:20050709:130329 One child process died. Exiting ...
000499:20050709:130329 Got signal. Exiting ...
000500:20050709:130329 Got signal. Exiting ...
000501:20050709:130329 Got signal. Exiting ...
000503:20050709:130329 Got signal. Exiting ...
000502:20050709:130329 Got signal. Exiting ...

Example #2:
009962:20050709:214928 In delete_all_metrics()
009962:20050709:214928 Parsed [diskfree[/logs]:60:0]
009962:20050709:214928 Key [diskfree[/logs]]
009962:20050709:214928 Refresh [60]
009962:20050709:214928 Lastlogsize [0]
009962:20050709:214928 In add check [diskfree[/logs]]
009962:20050709:214928 Parsed [0]
009962:20050709:214928 Key [0]
009962:20050709:214928 Refresh [(null)]
009962:20050709:214928 Lastlogsize [(null)]
009951:20050709:214928 One child process died. Exiting ...
009954:20050709:214928 Got signal. Exiting ...
009958:20050709:214928 Got signal. Exiting ...
009955:20050709:214928 Got signal. Exiting ...
009956:20050709:214928 Got signal. Exiting ...
009959:20050709:214928 Got signal. Exiting ...
009953:20050709:214928 Got signal. Exiting ...
009961:20050709:214928 Got signal. Exiting ...
009960:20050709:214928 Got signal. Exiting ...
009957:20050709:214928 Got signal. Exiting ...
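
When reading logs like the two examples above, one quick way to narrow down which child crashed is to look for the PID that logged something but never logged an exit message. Below is a minimal sketch of that idea, not part of Zabbix itself. It assumes each line has the layout PID:YYYYMMDD:HHMMSS followed by the message (inferred from the samples), and the name find_silent_pids is hypothetical. A silent PID is only a hint, not proof of a crash.

```python
import re

# Assumed layout, inferred from the log samples: PID:YYYYMMDD:HHMMSS message
LOG_LINE = re.compile(r"^(\d+):(\d{8}):(\d{6}) (.*)$")

# Lines copied from Example #1.
EXAMPLE_1 = """\
000504:20050709:120203 Active check [em[hostname]] is not supported. Disabled.
000498:20050709:130329 One child process died. Exiting ...
000499:20050709:130329 Got signal. Exiting ...
000500:20050709:130329 Got signal. Exiting ...
000501:20050709:130329 Got signal. Exiting ...
000503:20050709:130329 Got signal. Exiting ...
000502:20050709:130329 Got signal. Exiting ...
"""


def find_silent_pids(log_text):
    """Return PIDs that appear in the log but never logged an 'Exiting ...' line.

    The result keeps the order in which each PID first appears.
    """
    seen = []       # PIDs in order of first appearance
    exited = set()  # PIDs that logged a clean exit message
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything in another format
        pid = int(match.group(1))
        message = match.group(4)
        if pid not in seen:
            seen.append(pid)
        if message.endswith("Exiting ..."):
            exited.add(pid)
    return [pid for pid in seen if pid not in exited]


if __name__ == "__main__":
    print(find_silent_pids(EXAMPLE_1))  # prints [504]
```

For Example #1 this points at PID 504, the same process that reported the corrupted em[hostname] key.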

Alexei
17-05-2006, 19:57
I do not think it is still the case with 1.1beta9.

lcondado
20-06-2006, 20:29
The problem remains in Zabbix 1.1.

I am using a Perl script to manage notifications.

Here is a clue about the problem:

http://support.bb4.com/archive/200306/msg00023.html
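
For context, the text "Can't ignore signal CHLD, forcing to default" matches a warning listed in Perl's perldiag documentation, which Perl prints when it is started with SIGCHLD already set to ignored. Whether that is what happens here is an assumption, but it would fit a parent process that ignores SIGCHLD and then launches a Perl script. The sketch below is in Python, not the actual Zabbix or Perl code, and only demonstrates that an ignored SIGCHLD is inherited by a spawned child on a POSIX system. The function name child_sees_sigchld_ignored is hypothetical.

```python
import signal
import subprocess
import sys


def child_sees_sigchld_ignored():
    """Spawn a child while SIGCHLD is ignored and report what the child inherited.

    POSIX only. Must be called from the main thread, because signal.signal()
    is only allowed there.
    """
    # One-liner run in the child: is SIGCHLD set to "ignore" at startup?
    probe = (
        "import signal; "
        "print(signal.getsignal(signal.SIGCHLD) == signal.SIG_IGN)"
    )
    # Play the role of the careless parent: ignore SIGCHLD before spawning.
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        result = subprocess.run(
            [sys.executable, "-c", probe],
            capture_output=True,
            text=True,
        )
    finally:
        # Restore the original disposition so the caller is not affected.
        if previous is None:
            previous = signal.SIG_DFL
        signal.signal(signal.SIGCHLD, previous)
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    print(child_sees_sigchld_ignored())
```

If the child reports that SIGCHLD is ignored, a Perl script started the same way would be in the situation the perldiag warning describes. A wrapper that resets SIGCHLD to its default before launching the script is one possible workaround to try.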