1.1alpha10 active check causes agent death

mvoss

Junior Member

Joined: Feb 2005

Posts: 9
#1

1.1alpha10 active check causes agent death

10-07-2005, 05:03

I've read a few forum messages saying that clients die when any item is switched from non-active to active. I see that behavior as well.

However, my bigger problem is that they keep dying periodically afterwards. Restarting the server and the clients does not fix it.

Usually one of the children chokes on an active check and dies, and the whole agent exits afterwards. The logged check key is often badly corrupted: in Example #1 below, the key is supposed to be system[hostname], not em[hostname]. At the same time as the client dies, the server says "Can't ignore signal CHLD, forcing to default." Any ideas? I'm pretty lost.

Example #1:

000504:20050709:120203 Active check [em[hostname]] is not supported. Disabled.
000498:20050709:130329 One child process died. Exiting ...
000499:20050709:130329 Got signal. Exiting ...
000500:20050709:130329 Got signal. Exiting ...
000501:20050709:130329 Got signal. Exiting ...
000503:20050709:130329 Got signal. Exiting ...
000502:20050709:130329 Got signal. Exiting ...
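
The log lines above appear to follow a `PID:YYYYMMDD:HHMMSS message` layout (inferred from these excerpts, not taken from any documentation). Under that assumption, a small Python sketch can split the lines and separate the process that reported the child death from the ones that only received the signal. `LogEntry`, `parse_log_line` and `summarize` are hypothetical helper names, not part of Zabbix.

```python
import re
from typing import Iterable, List, NamedTuple, Optional, Tuple


class LogEntry(NamedTuple):
    pid: int
    date: str      # YYYYMMDD, kept as text
    time: str      # HHMMSS, kept as text
    message: str


# Assumed layout: "<pid>:<date>:<time> <message>"
_LINE_RE = re.compile(r"^(\d+):(\d{8}):(\d{6}) (.*)$")


def parse_log_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None if the line does not match the assumed layout."""
    match = _LINE_RE.match(line.strip())
    if match is None:
        return None
    pid, date, time, message = match.groups()
    return LogEntry(int(pid), date, time, message)


def summarize(lines: Iterable[str]) -> Tuple[List[int], List[int]]:
    """Return (PIDs that reported a dead child, PIDs that exited on a signal)."""
    reporters: List[int] = []
    signalled: List[int] = []
    for line in lines:
        entry = parse_log_line(line)
        if entry is None:
            continue
        if entry.message.startswith("One child process died"):
            reporters.append(entry.pid)
        elif entry.message.startswith("Got signal"):
            signalled.append(entry.pid)
    return reporters, signalled
```

In Example #1, PID 000504 logs the unsupported check and never appears among the exit lines, which is consistent with it being the child that died.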

Example #2:

009962:20050709:214928 In delete_all_metrics()
009962:20050709:214928 Parsed [diskfree[/logs]:60:0]
009962:20050709:214928 Key [diskfree[/logs]]
009962:20050709:214928 Refresh [60]
009962:20050709:214928 Lastlogsize [0]
009962:20050709:214928 In add check [diskfree[/logs]]
009962:20050709:214928 Parsed [0]
009962:20050709:214928 Key [0]
009962:20050709:214928 Refresh [(null)]
009962:20050709:214928 Lastlogsize [(null)]
009951:20050709:214928 One child process died. Exiting ...
009954:20050709:214928 Got signal. Exiting ...
009958:20050709:214928 Got signal. Exiting ...
009955:20050709:214928 Got signal. Exiting ...
009956:20050709:214928 Got signal. Exiting ...
009959:20050709:214928 Got signal. Exiting ...
009953:20050709:214928 Got signal. Exiting ...
009961:20050709:214928 Got signal. Exiting ...
009960:20050709:214928 Got signal. Exiting ...
009957:20050709:214928 Got signal. Exiting ...
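
The `Parsed [...]` lines in Example #2 suggest that each active check arrives as `key:refresh:lastlogsize`, and that the second entry (`0`) has no delimiters at all, leaving `Refresh` and `Lastlogsize` as `(null)` right before the child dies. The Python sketch below is not the agent's code; it only illustrates, under that assumed format, how a parser can reject such an entry instead of using the missing fields. `ActiveCheck` and `parse_check` are hypothetical names.

```python
from typing import NamedTuple, Optional


class ActiveCheck(NamedTuple):
    key: str
    refresh: int
    lastlogsize: int


def _is_plain_number(text: str) -> bool:
    # Accept only non-empty ASCII digit strings such as "60" or "0".
    return text.isascii() and text.isdigit()


def parse_check(entry: str) -> Optional[ActiveCheck]:
    """Parse 'key:refresh:lastlogsize'; return None for malformed entries."""
    # Split from the right so a colon inside the key does not shift the fields.
    parts = entry.strip().rsplit(":", 2)
    if len(parts) != 3:
        return None  # e.g. "0": refresh and lastlogsize are missing
    key, refresh, lastlogsize = parts
    if not key or not _is_plain_number(refresh) or not _is_plain_number(lastlogsize):
        return None
    return ActiveCheck(key, int(refresh), int(lastlogsize))
```

With the two entries from the log, `parse_check("diskfree[/logs]:60:0")` yields a complete check, while `parse_check("0")` returns `None` and can simply be skipped.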
Alexei

Founder, CEO

Joined: Sep 2004

Posts: 5654
#2

17-05-2006, 19:57

I do not think it is still the case with 1.1beta9.

Alexei Vladishev
Creator of Zabbix, Product manager
New York | Tokyo | Riga
lcondado

Member

Joined: May 2006

Posts: 37
#3

20-06-2006, 20:29

Clue

The problem remains in Zabbix 1.1.

I am using a Perl script to manage notifications.

Here is a clue about the problem:

http://support.bb4.com/archive/200306/msg00023.html
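
For context, "Can't ignore signal CHLD, forcing to default." is a warning that Perl prints when it starts with SIGCHLD set to ignored, a disposition a child inherits from the process that launched it. Assuming the notification script is started by a parent that ignores SIGCHLD (an assumption, not confirmed in this thread), one way to avoid handing that state to the child is to restore the default disposition just before exec. The sketch below uses Python on a POSIX system for illustration; `run_with_default_sigchld` is a hypothetical helper and not part of Zabbix.

```python
import signal
import subprocess
from typing import Sequence


def _reset_sigchld() -> None:
    # Runs in the child between fork and exec, so the launched program
    # starts with the default SIGCHLD disposition instead of "ignored".
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)


def run_with_default_sigchld(cmd: Sequence[str]) -> int:
    """Run cmd with SIGCHLD reset to its default and return its exit code."""
    completed = subprocess.run(list(cmd), preexec_fn=_reset_sigchld)
    return completed.returncode
```

For example, `run_with_default_sigchld(["sh", "-c", "exit 3"])` runs the command and returns its exit status.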

Last edited by lcondado; 20-06-2006, 20:33.