Hi,
while diagnosing some errors (one of the zabbix processes died), I needed to find out:
* WHICH process died?
* WHY did it die?
The new message format in the logfile adds:
* the PID of the child process which died
* the exit code (if the process exited normally)
* the signal number (if the process died by a signal)
In the example log line below, the child with PID 1234 died due to a SIGSEGV (signal 11, segmentation violation).
If the PID or exit status cannot be determined, the old message is printed (and we are none the wiser).
The following diff is against the "developers" tarball pre-zabbix-1.5.tar.gz from 2008-02-18.
Best regards,
\Bernhard.
Code:
<PID-of-Master>:<Date>:<Time> One child process (1234) died with signal 11. Exiting ...
Code:
--- ./src/libs/zbxnix/daemon.c-dist Mon Feb 18 11:00:16 2008
+++ ./src/libs/zbxnix/daemon.c Wed Feb 20 18:31:55 2008
@@ -56,10 +56,22 @@
static void parent_signal_handler(int sig)
{
+ int pid;
+ int status;
+
switch(sig)
{
case SIGCHLD:
- zabbix_log( LOG_LEVEL_WARNING, "One child process died. Exiting ...");
+ if ((pid = wait4(WAIT_ANY, &status, WNOHANG, NULL)) < 0) {
+ zabbix_log( LOG_LEVEL_WARNING, "One child process died. Exiting ...");
+ } else {
+ zabbix_log( LOG_LEVEL_WARNING, "One child process (%d) %s with %s %d%s. Exiting ...",
+ pid,
+ WIFEXITED(status) ? "exited" : "died",
+ WIFEXITED(status) ? "exit code" : "signal",
+ WIFEXITED(status) ? WEXITSTATUS(status) : WTERMSIG(status),
+ WCOREDUMP(status) ? " (core dumped)" :"");
+ }
uninit();
exit( FAIL );
break;
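For readers who want to try the status decoding outside of Zabbix, here is a minimal standalone sketch of the same idea. It is not part of the patch: the helper names `describe_child` and `spawn_and_describe` are hypothetical, it uses `waitpid()` instead of `wait4()`, and it runs in normal program flow rather than inside a signal handler. It also checks `WIFSIGNALED()` before looking at `WTERMSIG()` and `WCOREDUMP()`, since those macros are only meaningful for a child that was killed by a signal. Note that with `WNOHANG` the wait call can return 0 when no child has changed state, so a caller using that flag would probably want to treat 0 like the error case.

```c
#include <stdio.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not a Zabbix function): turn a wait status
 * into a message similar to the one produced by the patch. */
static int describe_child(char *buf, size_t len, pid_t pid, int status)
{
    if (WIFEXITED(status)) {
        /* Child called exit()/_exit() or returned from main(). */
        return snprintf(buf, len,
                        "One child process (%d) exited with exit code %d",
                        (int)pid, WEXITSTATUS(status));
    }
    if (WIFSIGNALED(status)) {
        const char *core = "";
#ifdef WCOREDUMP
        /* WCOREDUMP is not in POSIX, so only use it where it exists. */
        if (WCOREDUMP(status))
            core = " (core dumped)";
#endif
        return snprintf(buf, len,
                        "One child process (%d) died with signal %d%s",
                        (int)pid, WTERMSIG(status), core);
    }
    /* Stopped/continued children or anything else unexpected. */
    return snprintf(buf, len,
                    "One child process (%d) changed state (status 0x%x)",
                    (int)pid, (unsigned)status);
}

/* Hypothetical demo helper: fork a child that either kills itself with
 * signal `sig` (if non-zero) or exits with `exit_code`, then reap it and
 * write the description into buf. Returns the child PID, or -1 on error. */
static pid_t spawn_and_describe(int exit_code, int sig, char *buf, size_t len)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (sig != 0)
            raise(sig);
        _exit(exit_code);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    describe_child(buf, len, pid, status);
    return pid;
}
```

Calling `spawn_and_describe(3, 0, buf, sizeof buf)` should yield an "exited with exit code 3" message, and passing a signal such as `SIGKILL` should yield a "died with signal ..." message carrying that signal's number.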