Is there a log or anything that shows the last thing suckerd did before it was killed? It just stops at random. Any idea? I noticed this happens when I add hosts.
Zabbix_suckerd randomly stops
-
Well, I think I solved that one. I noticed that if you set the host to unmonitored, suckerd keeps running, even though I am only running a simple check. Once you have added the host, turn it to monitored. I knew you had to do this for an agent installed on a client, but wasn't sure about a simple HTTP check, etc.
All is good again in the land of no idea.
-
Anything in Logfile?
-
Zabbix creates a log? In which directory? I didn't know it did.
I have checked:
/var/log/daemon.log
/var/log/dmesg
/var/log/messages
Last edited by thesaintjim; 23-11-2004, 21:41.
-
Well, I'll tell you this: I started suckerd again, went to the restroom, came back, and it had killed itself or something.
-
Check /etc/zabbix/zabbix_suckerd.conf, parameter LogFile. Also, pay attention to DebugLevel.
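To see which log file and debug level are actually in effect, something like the following could be used. This is only a minimal sketch: it assumes the config file uses plain `Key=value` lines with `#` comments, the helper name `read_settings` is made up for illustration, and the sample values are hypothetical rather than taken from this thread. Keys are matched case-insensitively because the parameter is written both as "Logfile" and "LogFile" here.

```python
def read_settings(conf_text, wanted=("LogFile", "DebugLevel")):
    """Return the wanted Key=value settings from config text, skipping comments."""
    wanted_lower = {name.lower(): name for name in wanted}
    found = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comment lines, and anything that is not Key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        name = wanted_lower.get(key.strip().lower())
        if name:
            found[name] = value.strip()
    return found


# Hypothetical config content; in practice, read the text of
# /etc/zabbix/zabbix_suckerd.conf instead.
sample = """\
# hypothetical values, not taken from this thread
LogFile=/tmp/zabbix_suckerd.log
DebugLevel=3
"""
settings = read_settings(sample)
print(settings)
```

If `LogFile` does not show up in the result, the parameter is probably commented out or missing from the config.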
-
This is using debug level 2:
002640:20041123:112627 Got QUIT or INT or TERM or PIPE signal. Exiting...
002642:20041123:112627 Got QUIT or INT or TERM or PIPE signal. Exiting...
002643:20041123:112627 Got QUIT or INT or TERM or PIPE signal. Exiting...
~
This is now using debug level 3:
002695:20041123:112742 Starting zabbix_suckerd...
002697:20041123:112742 zabbix_suckerd #1 started [Alerter]
002698:20041123:112742 zabbix_suckerd #2 started [nodata() calculator]
002699:20041123:112742 zabbix_suckerd #3 started [ICMP pinger]
002695:20041123:112742 zabbix_suckerd #0 started [Housekeeper]
002705:20041123:112742 zabbix_suckerd #4 started [Sucker. SNMP:ON]
002695:20041123:113007 One child process died. Exiting ...
002697:20041123:113007 Got QUIT or INT or TERM or PIPE signal. Exiting...
002698:20041123:113007 Got QUIT or INT or TERM or PIPE signal. Exiting...
002699:20041123:113007 Got QUIT or INT or TERM or PIPE signal. Exiting...
Waiting for it to shut off again. I'm sure it will by the time I return home from work. I'll show you debug level 4 (just the part showing what causes it to die).
-
Nothing unusual. This is the last part of the log before it died:
[}]
002965:20041123:114435 Macro:JimsComp:diskfree[c:].last(0)
002965:20041123:114435 Before find_char:JimsComp:diskfree[c:].last(0)[:]
002965:20041123:114435 Host:JimsComp
002965:20041123:114435 Before find_char:diskfree[c:].last(0)[.]
002965:20041123:114435 Key:diskfree[c:]
002965:20041123:114435 Before find_char:last(0)[(]
002965:20041123:114435 Function:last
002965:20041123:114435 Before find_char:0)[)]
002965:20041123:114435 Parameter:0
002965:20041123:114435 In get_lastvalue()
002965:20041123:114435 Executing query:select i.itemid,i.prevvalue,i.lastvalue,i.value_type,i.multiplier,i.units from items i,hosts h where h.host='JimsComp' and h.hostid=i.hostid and i.key_='diskfree[c:]'
002965:20041123:114435 In DBnum_rows
002965:20041123:114435 Result of DBnum_rows [1]
002965:20041123:114435 Itemid:17210
002965:20041123:114435 Before evaluate_FUNCTION()
002965:20041123:114435 Function [last]
002965:20041123:114435 In evaluate_FUNCTION() 1
002965:20041123:114435 In evaluate_FUNCTION() 2
002958:20041123:114435 One child process died. Exiting ...
002960:20041123:114435 Got QUIT or INT or TERM or PIPE signal. Exiting...
002961:20041123:114435 Got QUIT or INT or TERM or PIPE signal. Exiting...
002964:20041123:114435 Got QUIT or INT or TERM or PIPE signal. Exiting...
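One way to read a log like this: each line starts with the PID of the process that wrote it, the parent reports "One child process died", and the other children each log a "Got QUIT or INT or TERM or PIPE signal" line as they exit. A PID that logged earlier but never wrote either of those lines is a likely candidate for the child that died. The sketch below applies that idea. It assumes the `PID:YYYYMMDD:HHMMSS message` layout seen in this thread and assumes that every surviving child logs the signal line; the function name `find_silent_children` is made up for illustration.

```python
import re

# PID:YYYYMMDD:HHMMSS message, as seen in the log excerpts in this thread.
LINE = re.compile(r"^(\d+):(\d{8}):(\d{6}) (.*)$")


def find_silent_children(log_text):
    """Return {pid: last message} for PIDs that logged something but never
    logged the death report or the 'Got QUIT ... signal' exit line."""
    last_msg = {}
    exited = set()
    for raw in log_text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue  # not a log line in the expected layout
        pid, msg = m.group(1), m.group(4)
        if msg.startswith("One child process died") or msg.startswith("Got QUIT"):
            exited.add(pid)
        else:
            last_msg[pid] = msg
    return {pid: msg for pid, msg in last_msg.items() if pid not in exited}


# Tail of the debug level 4 log posted above.
sample = """\
002965:20041123:114435 In evaluate_FUNCTION() 1
002965:20041123:114435 In evaluate_FUNCTION() 2
002958:20041123:114435 One child process died. Exiting ...
002960:20041123:114435 Got QUIT or INT or TERM or PIPE signal. Exiting...
002961:20041123:114435 Got QUIT or INT or TERM or PIPE signal. Exiting...
002964:20041123:114435 Got QUIT or INT or TERM or PIPE signal. Exiting...
"""
suspects = find_silent_children(sample)
print(suspects)
```

On the excerpt above this points at PID 002965, whose last message was "In evaluate_FUNCTION() 2". That only narrows down where to look; it does not show why the process died.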
-
I took down the Windows agent being monitored, and now zabbix_suckerd doesn't shut off. More tests soon.
-
No idea so far. I'll try to reproduce this problem.
-
I think this might have something to do with triggers and email alerts. 1.1a2 had been running fine ever since it was released, until I added a trigger with an email alert today. Now the same thing happens here: suckerd just exits after one child process has died.
As soon as I disable the trigger/email alert, suckerd runs fine again. I tested with both localhost as the SMTP server and a remote SMTP server.
No email was ever sent, regardless of SMTP configuration, so either it dies before it even gets to sending the email, or somewhere within the email code.
I hope this helps you narrow down and solve the problem, as email alerts are somewhat critical to the whole network monitoring enterprise.
update2: when I remove the email action from the trigger, suckerd keeps running fine, so at least in my case it's linked to the email action for the trigger.
The trigger is a simple check for icmppingsec being greater than a set value. With the email action, this is the output from the log:
032060:20041130:112450 Before find_char:last(0)[(]
032060:20041130:112450 Function:last
032060:20041130:112450 Before find_char:0)[)]
032060:20041130:112450 Parameter:0
032060:20041130:112450 In get_lastvalue()
032060:20041130:112450 Executing query:select i.itemid,i.prevvalue,i.lastvalue,i.value_type,i.multiplier,i.units from items i,hosts h where h.host='EVK_Router' and h.hostid=i.hostid and i.key_='icmppingsec'
032060:20041130:112450 In DBnum_rows
032060:20041130:112450 Result of DBnum_rows [1]
032060:20041130:112450 Itemid:17215
032060:20041130:112450 Before evaluate_FUNCTION()
032060:20041130:112450 Function [last]
032060:20041130:112450 In evaluate_FUNCTION() 1
032060:20041130:112450 In evaluate_FUNCTION() 2
032054:20041130:112450 One child process died. Exiting ...
032056:20041130:112450 Got QUIT or INT or TERM or PIPE signal. Exiting...
032058:20041130:112450 Got QUIT or INT or TERM or PIPE signal. Exiting...
032062:20041130:112450 Got QUIT or INT or TERM or PIPE signal. Exiting...
tom.
Last edited by obstler; 30-11-2004, 12:40.
-
I've added more debug printing. Please get the latest include/functions.c from CVS and recompile everything.
Then run ZABBIX. If it crashes, post the debug output from LogFile here.
Thanks!