zabbix 1.8.2 dies

  • ad@kbc-clearing.com
    Member
    • Sep 2005
    • 77

    #1

    zabbix 1.8.2 dies

    We just installed version 1.8.2. Looks great, except the following (last part of logfile with loglevel=4):

    insert into trends_uint (itemid,clock,num,value_min,value_avg,value_max) values (47662,1271358000,1,0,0,0);
    insert into trends_uint (itemid,clock,num,value_min,value_avg,value_max) values (47664,1271358000,2,250712,1311400,2372088);
    insert into trends_uint (itemid,clock,num,value_min,value_avg,value_max) values (47666,1271358000,1,1100672,1100672,1100672);
    ]
    6348:20100415:215128.991 End of DCflush_trends()
    6348:20100415:215128.991 Query [txnlev:1] [commit;]
    6348:20100415:215128.997 Syncing trends data...done.
    6348:20100415:215128.997 End of DCsync_trends()
    6348:20100415:215128.997 End of DCsync_all()
    6348:20100415:215128.997 End of free_database_cache()
    6348:20100415:215128.997 In free_configuration_cache()
    6348:20100415:215128.997 End of free_configuration_cache()
    6348:20100415:215128.998 In free_ipmi_handler()
    6348:20100415:215128.998 Zabbix Server stopped. Zabbix 1.8.2 (revision 11211).

    Every time the trends are updated and flushed, zabbix_server dies.
  • bashman
    Senior Member
    • Dec 2009
    • 432

    #2
    Which OS distribution is your zabbix_server running on? Please give more information.
    978 Hosts / 16.901 Items / 8.703 Triggers / 44 usr / 90,59 nvps / v1.8.15

    • cybernijntje
      Junior Member
      • Sep 2008
      • 16

      #3
      Specs:

      XenServer 5.5 host with Debian Etch (4.0) VM with 4GB memory / 2 VCPUs


      Regards,
      Dennis
      KBC Clearing

      • STux
        Junior Member
        • Apr 2010
        • 2

        #4
        Hello,

        I have the same problem!

        My configuration:
        Zabbix 1.8.2 (PostgreSQL)
        Dell Optiplex GX60 with Celeron 2 GHz, 512 MB RAM, OpenBSD 4.6.

        zabbix_server stops with exactly the same logs that cybernijntje posted.

        As far as I remember, it was the same a few months ago with a previous installation of Zabbix (1.6, on Debian).

        • cybernijntje
          Junior Member
          • Sep 2008
          • 16

          #5
          Dirty workaround

          We do have a rather dirty workaround in place.

          We have a script that runs every 2 minutes from root's crontab:
          */2 * * * * /scripts/zabbix_check_18.sh


          #!/bin/bash
          # Watchdog: restart zabbix_server when its log reports a full buffer,
          # when the process is missing, or when it has gone defunct.
          TIMESTAMP=$(date +%d-%m-%y_%H:%M:%S)
          LOG=/var/log/zabbix/zabbix_server.log
          RESTART_LOG=/var/log/zabbix/zabbix_restart.log

          # 1. "buffer is full" appears in the server log: kill and restart.
          OUTPUT=$(grep -i 'buffer is full' "$LOG" | wc -l)
          if [ "$OUTPUT" -gt 0 ] ; then
              echo "$TIMESTAMP Zabbix restart check buffer restarted ($OUTPUT)" >> "$RESTART_LOG"
              /usr/bin/killall -9 zabbix_server
              sleep 5
              mv -f "$LOG" "$LOG.restarted"
              /etc/init.d/zabbix_server start
          fi

          # 2. No server process left. The grep itself accounts for one line, hence "-lt 2".
          OUTPUT=$(ps -ef | grep /etc/zabbix/bin/zabbix_server | wc -l)
          if [ "$OUTPUT" -lt 2 ] ; then
              echo "$TIMESTAMP Zabbix restart crash restarted ($OUTPUT)" >> "$RESTART_LOG"
              mv -f "$LOG" "$LOG.restarted"
              /etc/init.d/zabbix_server start
          fi

          # 3. Defunct (zombie) server processes: kill and restart.
          OUTPUT=$(ps -ef | grep zabbix_server | grep defunct | wc -l)
          if [ "$OUTPUT" -gt 0 ] ; then
              echo "$TIMESTAMP Zabbix restart defunct restarted ($OUTPUT)" >> "$RESTART_LOG"
              /usr/bin/killall -9 zabbix_server
              sleep 5
              mv -f "$LOG" "$LOG.restarted"
              /etc/init.d/zabbix_server start
          fi


          We also created a trigger in Zabbix to go with this script.


          A less dirty fix would be nice, though.
          Last edited by cybernijntje; 27-04-2010, 10:45.
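
A side note on the `ps -ef | grep ... | wc -l` checks in the script above: the count includes the `grep` process itself, which is why the crash check compares against 2 instead of 0. A minimal sketch of a common alternative, the bracket trick, is shown below. The function name `count_server_procs` is made up for illustration and is not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: count the lines of ps output (read from stdin)
# that mention zabbix_server.
# The pattern '[z]abbix_server' matches the text "zabbix_server", but the
# grep command line itself shows up in ps as "[z]abbix_server", which the
# pattern does not match, so grep never counts its own entry.
count_server_procs() {
    grep -c '[z]abbix_server'
}

# Possible use in a cron watchdog (paths as in the script above):
#   running=$(ps -ef | count_server_procs)
#   if [ "$running" -eq 0 ]; then /etc/init.d/zabbix_server start; fi
```

With this the check can compare against 0 directly, and it keeps working when the output passes through more than one filter.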

          • bashman
            Senior Member
            • Dec 2009
            • 432

            #6
            You could use the daemontools for starting zabbix_server when it dies.

            • STux
              Junior Member
              • Apr 2010
              • 2

              #7
              I made the following script for openbsd :

              Code:
              #!/bin/sh
              
              # Count running zabbix_server processes; the [z] keeps grep from matching itself.
              ZABBIX_PROCESS_COUNT=`ps aux | grep -c '[z]abbix_server'`
              # Count semaphores that still belong to zabbix.
              ZABBIX_IPCS_COUNT=`ipcs | grep "^s" | grep -c zabbix`
              TIMESTAMP=`date "+%y%m%d %H:%M:%S"`
              
              if [ "$ZABBIX_PROCESS_COUNT" -eq 0 ]; then
                      # Remove semaphores left behind by the dead server before restarting.
                      if [ "$ZABBIX_IPCS_COUNT" -gt 0 ]; then
                              IPCRM=`ipcs | grep "^s" | grep zabbix | awk '{ print $2 }'`
                              ipcrm -s $IPCRM
                      fi
                      sleep 2
                      echo "$TIMESTAMP : zabbix_server is stopped, starting zabbix_server"
                      /usr/local/sbin/zabbix_server
                      exit 0
              else
                      echo "$TIMESTAMP : zabbix_server is already running"
                      exit 1
              fi
              It is launched every 5 minutes by cron.
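
The script above passes whatever IDs it finds to a single `ipcrm -s` call. If more than one stale semaphore set is left behind, removing them one at a time is safer. The following is only a sketch under that assumption: the function names are hypothetical, and it assumes the BSD-style `ipcs` layout that the original `grep "^s"` relies on (type letter in the first column, ID in the second).

```shell
#!/bin/sh
# Hypothetical helper: read ipcs output on stdin and print the ID (second
# column) of every semaphore line ("s" in the first column) mentioning zabbix.
zabbix_sem_ids() {
    awk '$1 == "s" && /zabbix/ { print $2 }'
}

# Hypothetical helper: remove each stale semaphore set with its own ipcrm call.
remove_stale_semaphores() {
    for id in $(ipcs | zabbix_sem_ids); do
        ipcrm -s "$id"
    done
}
```

`remove_stale_semaphores` would be called only when no zabbix_server process is running, in the same place as the `ipcrm` call in the script.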

              • bashman
                Senior Member
                • Dec 2009
                • 432

                #8
                Originally posted by bashman
                You could use the daemontools for starting zabbix_server when it dies.
                You can install daemontools and use supervise to monitor a script. You'll need a directory (supervise) and a script (run) that launches zabbix_server and then loops, so that run stays alive while zabbix_server is up. When zabbix_server dies, run exits and supervise launches it again.

                Code:
                mkdir /path/to/your/supervise
                vi /path/to/your/run 
                
                #!/bin/sh
                set -e
                export PATH="${PATH:+$PATH:}/usr/sbin:/sbin"
                LOG=/path/to/your/log/active_pid.log
                # Wait until port 10051 is free (loop while any listener still matches)
                echo "[$(date "+%d-%m-%Y %H:%M:%S")]: Restarting: waiting for port 10051" >> "$LOG"
                while netstat -nl | grep -q 10051; do sleep 10; done
                /usr/local/sbin/zabbix_server
                sleep 2
                pid=$(cat /var/run/zabbix-server/zabbix_server.pid)
                echo "[$(date "+%d-%m-%Y %H:%M:%S")]: Restarted pid:" >> "$LOG"
                # Keep the pid on a line of its own so that "tail -1" on the log returns it
                echo "$pid" >> "$LOG"
                # Stay alive while zabbix_server runs; supervise restarts run when this loop ends
                while [ -e "/proc/$pid" ]; do sleep 10; done
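
The final loop of the run script relies on `/proc/$pid`, which is Linux-specific. On a system without a mounted `/proc`, the same wait can be sketched with `kill -0`, which sends no signal and only reports whether the PID exists. The function name `wait_for_exit` is made up for illustration. The sketch assumes that the script runs as root or as the same user as zabbix_server, and that the server has daemonized so it is not a child of the script.

```shell
#!/bin/sh
# Hypothetical helper: block until the process with the given PID is gone.
# $1 = PID to watch, $2 = seconds between checks (default 10, as in the run script).
wait_for_exit() {
    pid=$1
    interval=${2:-10}
    # kill -0 succeeds while the PID exists and may be signalled by this user.
    while kill -0 "$pid" 2>/dev/null; do
        sleep "$interval"
    done
}

# Possible use as the last line of the run script:
#   wait_for_exit "$(cat /var/run/zabbix-server/zabbix_server.pid)"
```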
                Once supervise is launched, you can control it with the svc and svstat utilities.

                To bring the service down (run gets a TERM signal and is not restarted; supervise itself keeps running):
                Code:
                svc -d /path/to/your &
                To bring the service up again (run is started, and restarted whenever it stops):

                Code:
                svc -u /path/to/your &
                To make supervise exit once the service is down:

                Code:
                svc -x /path/to/your &
                To check the state:

                Code:
                svstat /path/to/your
                (/path/to/your is the directory that contains the supervise directory)

                These are the options:

                Code:
                    * -u: Up. If the service is not running, start it. If the service stops, restart it.
                    * -d: Down. If the service is running, send it a TERM signal and then a CONT signal. After it stops, do not restart it.
                    * -o: Once. If the service is not running, start it. Do not restart it if it stops.
                    * -p: Pause. Send the service a STOP signal.
                    * -c: Continue. Send the service a CONT signal.
                    * -h: Hangup. Send the service a HUP signal.
                    * -a: Alarm. Send the service an ALRM signal.
                    * -i: Interrupt. Send the service an INT signal.
                    * -t: Terminate. Send the service a TERM signal.
                    * -k: Kill. Send the service a KILL signal.
                    * -x: Exit. supervise will exit as soon as the service is down. If you use this option on a stable system, you're doing something wrong; supervise is designed to run forever.
                You can then modify your startup script (/etc/init.d/zabbix-server) so that it starts and stops the server through supervise:

                Code:
                case "$1" in
                start)
                      rm -f $PID
                      echo "Starting $DESC: $NAME" 
                      echo "["`date "+%d-%m-%Y %H:%M:%S"`"]: Controlled start." >> /path/to/your/log/active_pid.log
                      supervise /path/to/your &
                      #       start-stop-daemon --oknodo --start --pidfile $PID \
                      #               --exec $DAEMON >/dev/null 2>&1
                      ;;
                stop)
                      echo "Stopping $DESC: $NAME"
                      svc -x /path/to/your &
                      # The last line of the log is the pid written by the run script
                      kill `tail -1 /path/to/your/log/active_pid.log`
                      echo "["`date "+%d-%m-%Y %H:%M:%S"`"]: Controlled stop." >> /path/to/your/log/active_pid.log
                      #       start-stop-daemon --oknodo --stop --pidfile $PID \
                      #               --exec $DAEMON