**bashman** · 22-04-2010, 09:02

Which OS distribution is your zabbix_server running on? Please give more information.

**cybernijntje** · 22-04-2010, 09:15

Specs:

XenServer 5.5 host with Debian Etch (4.0) VM with 4GB memory / 2 VCPUs

Regards,
Dennis
KBC Clearing

**STux** · 26-04-2010, 17:37

Hello,

I have the same problem!

On the following config:
Zabbix 1.8.2 (PostgreSQL)
Dell Optiplex GX60 with Celeron 2 GHz, 512 MB RAM, OpenBSD 4.6.

zabbix_server stops with exactly the same logs that cybernijntje posted.

As far as I remember, it was the same a few months ago with a previous installation of Zabbix (1.6, on Debian).

**cybernijntje** · 27-04-2010, 10:43

Dirty workaround

We do have a rather dirty workaround in place.

We have a script that runs every 2 minutes from root's crontab:
*/2 * * * * /scripts/zabbix_check_18.sh

#!/bin/bash
TIMESTAMP=`date +%d-%m-%y_%H:%M:%S`
OUTPUT=`cat /var/log/zabbix/zabbix_server.log | grep -i 'buffer is full' | wc -l`
if [ $OUTPUT -gt 0 ] ; then
echo "$TIMESTAMP Zabbix restart check buffer restarted ($OUTPUT)" >> /var/log/zabbix/zabbix_restart.log
/usr/bin/killall -9 zabbix_server
sleep 5
mv -f /var/log/zabbix/zabbix_server.log /var/log/zabbix/zabbix_server.log.restarted
/etc/init.d/zabbix_server start
fi

OUTPUT=`ps -ef | grep /etc/zabbix/bin/zabbix_server | wc -l`
if [ $OUTPUT -lt 2 ] ; then
echo "$TIMESTAMP Zabbix restart crash restarted ($OUTPUT)" >> /var/log/zabbix/zabbix_restart.log
mv -f /var/log/zabbix/zabbix_server.log /var/log/zabbix/zabbix_server.log.restarted
/etc/init.d/zabbix_server start
fi

OUTPUT=`ps -ef | grep zabbix_server | grep defunct | wc -l`
if [ $OUTPUT -gt 0 ] ; then
echo "$TIMESTAMP Zabbix restart defunct restarted ($OUTPUT)" >> /var/log/zabbix/zabbix_restart.log
/usr/bin/killall -9 zabbix_server
sleep 5
mv -f /var/log/zabbix/zabbix_server.log /var/log/zabbix/zabbix_server.log.restarted
/etc/init.d/zabbix_server start
fi

Alongside this script we also created a trigger in Zabbix.

A less dirty fix would be nice, though.
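
The log check at the top of the script above can be written as a small function. This is only a minimal sketch: `count_log_matches` is a hypothetical name, and the log path in the usage line is the one from the post.

```shell
#!/bin/sh
# Hypothetical helper, not part of Zabbix.
# Print the number of lines in a log file that match a pattern (case-insensitive).
count_log_matches() {
    log_file=$1
    pattern=$2
    # grep -c prints the count of matching lines. It exits non-zero when
    # nothing matches or the file is unreadable, so ignore the exit status.
    count=$(grep -ci -- "$pattern" "$log_file" 2>/dev/null) || true
    # An unreadable file produces no output at all; report that as 0.
    echo "${count:-0}"
}
```

Usage would look like `OUTPUT=$(count_log_matches /var/log/zabbix/zabbix_server.log 'buffer is full')`, which avoids the extra `cat` and `wc` processes.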

**bashman** · 27-04-2010, 11:36

You could use the daemontools for starting zabbix_server when it dies.

**STux** · 27-04-2010, 14:40

I made the following script for OpenBSD:

Code:

#!/bin/sh

# The [z] bracket keeps grep from matching its own command line.
ZABBIX_PROCESS_COUNT=`ps aux|grep "[z]abbix_server"|wc -l|tr -d " "`
ZABBIX_IPCS_COUNT=`ipcs|grep "^s"|grep zabbix|wc -l`
TIMESTAMP=`date "+%y%m%d %H:%M:%S"`

if [ $ZABBIX_PROCESS_COUNT -eq 0 ]; then
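        # Remove leftover semaphores only if some are still allocated.
        if [ $ZABBIX_IPCS_COUNT -gt 0 ]; then
                IPCRM=`ipcs|grep "^s"|grep zabbix|awk '{ print $2 }'`
                # ipcrm -s takes one semaphore id per flag, so loop over them.
                for id in $IPCRM; do
                        ipcrm -s $id
                done
        fi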
        sleep 2
        echo "$TIMESTAMP : zabbix_server is stopped , running zabbix_server"
        /usr/local/sbin/zabbix_server
        exit 0
else
        echo "$TIMESTAMP : zabbix_server is already running"
        exit 1
fi

exit 0

It is launched every 5 minutes from crontab.
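
Counting processes by piping `ps` through `grep` is sensitive to the helper processes showing up in the listing themselves. A rough sketch of a counter that filters them out follows; `count_named_processes` is a hypothetical name, and it reads `ps`-style text on stdin so it can be tried on sample input.

```shell
#!/bin/sh
# Hypothetical helper, not part of Zabbix.
# Count lines of ps-style output on stdin that contain the given name,
# skipping lines that mention grep or awk (the counting helpers themselves).
count_named_processes() {
    name=$1
    awk -v name="$name" '
        index($0, name) && !/grep|awk/ { n++ }
        END { print n + 0 }   # n + 0 prints 0 when nothing matched
    '
}
```

It could be used as `ps aux | count_named_processes zabbix_server`. Note that a checking script whose own file name contains `zabbix_server` would still be counted.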

**bashman** · 28-04-2010, 09:05

Originally posted by bashman

You could use the daemontools for starting zabbix_server when it dies.

You can install daemontools and use supervise to monitor a script. You'll need a directory (supervise) and a script (run) that launches zabbix_server and then loops, so that run stays alive for as long as zabbix_server is up. When zabbix_server dies, run exits and supervise launches it again.

Code:

mkdir /path/to/your/supervise
vi /path/to/your/run 

#!/bin/sh
set -e
export PATH="${PATH:+$PATH:}/usr/sbin:/sbin"
#Wait until port 10051 is free
echo "["`date "+%d-%m-%Y %H:%M:%S"`"]: Restarting: waiting for port 10051" >> /path/to/your/log/active_pid.log
while [ 1 -eq `netstat -nl | grep 10051 |wc -l` ]; do sleep 10; done
/usr/local/sbin/zabbix_server
sleep 2
pid=`cat /var/run/zabbix-server/zabbix_server.pid`
echo "["`date "+%d-%m-%Y %H:%M:%S"`"]: Restarted pid:" >> /path/to/your/log/active_pid.log
echo $pid >> /path/to/your/log/active_pid.log
while [ -e /proc/$pid ]; do sleep 10; done
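
The final loop of the run script relies on /proc, which may not be mounted on systems such as OpenBSD. A possible alternative is to probe the pid with signal 0. This is a sketch only: `pid_is_alive` and `wait_for_exit` are hypothetical names, and the check assumes the script is allowed to signal zabbix_server (for example because it runs as root).

```shell
#!/bin/sh
# Hypothetical helpers, not part of daemontools or Zabbix.

# Succeed if a process with the given pid exists and we may signal it.
# Signal 0 only performs the check; nothing is delivered to the process.
pid_is_alive() {
    kill -0 "$1" 2>/dev/null
}

# Block until the given pid is gone, polling every $2 seconds (default 10).
wait_for_exit() {
    while pid_is_alive "$1"; do
        sleep "${2:-10}"
    done
}
```

In the run script, the last line would then become `wait_for_exit "$pid"`.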

Once supervise is running, you can use the svc and svstat utilities:

To stop the run script (supervise itself keeps running and will not restart it):

Code:

svc -d /path/to/your &

To start the run script again (supervise restarts it whenever it stops):

Code:

svc -u /path/to/your &

To make supervise exit as soon as the run script is down:

Code:

svc -x /path/to/your &

To check the state:

Code:

svstat /path/to/your

(/path/to/your is the directory that contains the run script and the supervise directory.)
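
If the state needs to be checked from another script, the svstat line can be reduced to a single word. A small sketch, assuming svstat prints lines of the form `/path/to/your: up (pid 1234) 17 seconds` or `/path/to/your: down 5 seconds`; `svstat_state` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper, not part of daemontools.
# Read one svstat output line from stdin and print "up", "down" or "unknown".
# The matched line format is an assumption about svstat's output.
svstat_state() {
    line=
    read -r line || true   # keep going on empty input
    case $line in
        *": up "*)   echo up ;;
        *": down "*) echo down ;;
        *)           echo unknown ;;
    esac
}
```

For example, `svstat /path/to/your | svstat_state` would print `up` while the run script is alive.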

These are the svc options:

Code:

    * -u: Up. If the service is not running, start it. If the service stops, restart it.
    * -d: Down. If the service is running, send it a TERM signal and then a CONT signal. After it stops, do not restart it.
    * -o: Once. If the service is not running, start it. Do not restart it if it stops.
    * -p: Pause. Send the service a STOP signal.
    * -c: Continue. Send the service a CONT signal.
    * -h: Hangup. Send the service a HUP signal.
    * -a: Alarm. Send the service an ALRM signal.
    * -i: Interrupt. Send the service an INT signal.
    * -t: Terminate. Send the service a TERM signal.
    * -k: Kill. Send the service a KILL signal.
    * -x: Exit. supervise will exit as soon as the service is down. If you use this option on a stable system, you're doing something wrong; supervise is designed to run forever.

You can modify your startup script (/etc/init.d/zabbix-server) so that it starts and stops supervise instead of the daemon:

Code:

case "$1" in
start)
      rm -f $PID
      echo "Starting $DESC: $NAME" 
      echo "["`date "+%d-%m-%Y %H:%M:%S"`"]: Controlled start." >> /path/to/your/log/active_pid.log
      supervise /path/to/your &
      #       start-stop-daemon --oknodo --start --pidfile $PID \
      #               --exec $DAEMON >/dev/null 2>&1
      ;;
stop)
      echo "Stopping $DESC: $NAME"
      svc -x /path/to/your &
      kill `tail -1 /path/to/your/log/active_pid.log`
      echo "["`date "+%d-%m-%Y %H:%M:%S"`"]: Controlled stop." >> /path/to/your/log/active_pid.log
      #       start-stop-daemon --oknodo --stop --pidfile $PID \
      #               --exec $DAEMON

zabbix 1.8.2 dies
