zabbix-agent won't restart cleanly from logrotate.

  • alj
    Senior Member
    • Aug 2006
    • 188

    #1

    zabbix-agent won't restart cleanly from logrotate.

    On some hosts zabbix-agent dies during log rotation. It happens almost every night, with these messages in the log:
    016675:20061113:062613 Got signal. Exiting ...
    016679:20061113:062613 Got signal. Exiting ...
    016677:20061113:062613 Got signal. Exiting ...
    016673:20061113:062613 One child process died. Exiting ...
    015089:20061113:062613 zabbix_agentd started. ZABBIX 1.1.3.
    015089:20061113:062613 Cannot bind to port 10050. Error [Address already in use]. Another zabbix_agentd already running ?
    016676:20061113:062613 Got signal. Exiting ...
    016678:20061113:062613 Got signal. Exiting ...


    Another one:

    023650:20061110:062617 Got signal. Exiting ...
    023649:20061110:062617 Got signal. Exiting ...
    023648:20061110:062617 Got signal. Exiting ...
    021462:20061110:062617 zabbix_agentd started. ZABBIX 1.1.3.
    021462:20061110:062617 Cannot bind to port 10050. Error [Address already in use]. Another zabbix_agentd already running ?
    023652:20061110:062617 Got signal. Exiting ...
    022097:20061110:080959 zabbix_agentd started. ZABBIX 1.1.3.


    This particular zabbix agent came with the etch distribution of Debian. I filed a bug with Debian but also want to post it here. I consider this issue serious, as I start every morning by restarting dead zabbix agents.
  • Alexei
    Founder, CEO
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Sep 2004
    • 5654

    #2
    May I ask you to provide some more information about the algorithm used by logrotate? What exactly does it do with the ZABBIX agent?
    Alexei Vladishev
    Creator of Zabbix, Product manager
    New York | Tokyo | Riga

    • alj
      Senior Member
      • Aug 2006
      • 188

      #3
      Originally posted by Alexei
      May I ask you to provide some more information about the algorithm used by logrotate? What exactly does it do with the ZABBIX agent?
      Here's the logrotate config that came with Debian etch:

      $ cat /etc/logrotate.d/zabbix-agent
      Code:
      /var/log/zabbix-agent/zabbix_agentd.log {
          daily
          rotate 7
          compress
          missingok
          notifempty
          create 0640 zabbix zabbix
          sharedscripts
          postrotate
             if [ -f /var/run/zabbix-agent/zabbix_agentd.pid ]; then \
               if [ -x /usr/sbin/invoke-rc.d ]; then \
                 invoke-rc.d zabbix-agent restart > /dev/null; \
               else \
                 /etc/init.d/zabbix-agent restart > /dev/null; \
               fi; \
             fi;
          endscript
      }
      -----------------------------------------------
      I guess it renames the log file and then forces zabbix-agent to restart so that it reopens the log file. (BTW, maybe it makes sense to use something like SIGUSR1 to reopen the log file, as Apache does, so there's no service interruption.)
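
The signal-based reopen suggested above can be illustrated with a toy shell script. This is only a sketch of the idea, not zabbix_agentd behaviour: nothing in this thread says the 1.1.3 agent handles SIGUSR1, and `LOG`, `reopen_log` and `log_msg` are names made up for the example.

```shell
#!/bin/sh
# Toy illustration only: NOT zabbix_agentd code. LOG, reopen_log and
# log_msg are hypothetical names chosen for this sketch.
LOG="${LOG:-${TMPDIR:-/tmp}/toy_daemon.$$.log}"

# (Re)open file descriptor 3 on $LOG in append mode. Opening with >>
# creates the file if a rotation has moved the old one away.
reopen_log() {
    exec 3>>"$LOG"
}

# Write one message to whatever file descriptor 3 currently points at.
log_msg() {
    echo "$$: $*" >&3
}

# Reopen the log whenever SIGUSR1 arrives; the process keeps running.
trap reopen_log USR1

reopen_log
log_msg "started"

# Simulate what a log rotation would do: rename the file, then signal
# the process (here: ourselves) instead of restarting it.
mv "$LOG" "$LOG.1"
kill -USR1 $$
log_msg "still running after rotation"
```

With a daemon that supported this, the postrotate script would send the signal to the PID from the pidfile instead of calling restart, so port 10050 would never be released and re-bound.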

      For the logrotate logic and how it works, please check the logrotate sources:
      http://packages.debian.org/testing/admin/logrotate (bottom of the page - sources with debian patch)

      Another missing piece here is the Debian zabbix startup script. Here it is:
      ---------------------------------------------
      ~$ cat /etc/init.d/zabbix-agent
      Code:
      #! /bin/sh
      ### BEGIN INIT INFO
      # Provides:          zabbix-agent
      # Required-Start:    $local_fs $network
      # Required-Stop:     $local_fs
      # Default-Start:     S
      # Default-Stop:      0 6
      # Short-Description: Start zabbix-agent daemon
      ### END INIT INFO
      DAEMON=/usr/sbin/zabbix_agentd
      NAME=zabbix_agentd
      DESC="Zabbix agent"
      PID=/var/run/zabbix-agent/$NAME.pid
      
      test -f $DAEMON || exit 0
      
      set -e
      
      case "$1" in
        start)
              rm -f $PID
              echo "Starting $DESC: $NAME"
              start-stop-daemon --oknodo --start --pidfile $PID \
                      --exec $DAEMON
              ;;
        stop)
              echo "Stopping $DESC: $NAME"
              start-stop-daemon --oknodo --stop --exec $DAEMON
              ;;
        restart|force-reload)
              $0 stop
              $0 start
              ;;
        *)
              N=/etc/init.d/$NAME
              echo "Usage: $N {start|stop|restart|force-reload}" >&2
              exit 1
              ;;
      esac
      
      exit 0
      ---------------------------------------
      The sources of "start-stop-daemon" are part of the dpkg package.



      The platform I'm running this on is amd64 with Debian etch: no custom-compiled software, only standard packages.
      I filed a Debian bug about this problem, but they say everything is fine on their side and that something within zabbix prevents the daemon from being restarted cleanly.

      Debian bug number - 398405


      Thanks for looking into this.


      • tapto
        Junior Member
        • Nov 2006
        • 28

        #4
        I have noticed some problems when restarting zabbix.

        I have to wait some time before starting it again.
        In my startup script I have:

        restart)
        $0 stop
        sleep 60
        $0 start
        rc_status

        Without the wait I have had a similar problem: "zabbix already running" in the logfile. I'm using SuSE, but I guess the problem is the same.


        • alj
          Senior Member
          • Aug 2006
          • 188

          #5
          Originally posted by tapto
          I have noticed some problems when restarting zabbix.

          I have to wait some time before starting it again.
          In my startup script I have:

          restart)
          $0 stop
          sleep 60
          $0 start
          rc_status

          Without the wait I have had a similar problem: "zabbix already running" in the logfile. I'm using SuSE, but I guess the problem is the same.
          This is a dirty fix though :-)

          "sleep" does not guarantee that the application has exited (imagine an overloaded machine), and it extends the service interruption, which can even trigger a zabbix alarm.

          I'm not exactly sure how different Linux distributions check that a process has exited before starting it again. But besides this problem, which has to be fixed somehow, there's one feature that would help keep the service up: reload config files on SIGHUP and reopen the log file on SIGUSR1, without service interruption. While the first is not very important, the second is pretty significant, considering that log files will most likely be rotated every night.

          BTW the same applies to zabbix server. I've seen it die on me once during log rotation.
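
The objection to a fixed "sleep" can be sketched as a bounded poll that returns as soon as the old process is gone. This is a minimal illustration, not what any distribution's init script actually does; `wait_for_exit` is a hypothetical helper name, and it assumes that watching a single PID is enough. The logs earlier in the thread show several agent processes, so a real script may need to wait for all of them.

```shell
#!/bin/sh
# wait_for_exit PID [TIMEOUT_SECONDS]   (hypothetical helper)
# Poll until PID no longer exists. Returns 0 as soon as the process is
# gone, or 1 if it is still there after TIMEOUT_SECONDS (default 30).
wait_for_exit() {
    pid=$1
    timeout=${2:-30}
    elapsed=0
    # "kill -0" delivers no signal; it only tests whether the PID exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Possible use in a restart branch (untested against the Debian script;
# $PID is the pidfile variable from that script):
#   restart)
#         oldpid=$(cat "$PID")
#         $0 stop
#         wait_for_exit "$oldpid" 30 || exit 1
#         $0 start
#         ;;
```

Unlike "sleep 60", the interruption lasts only as long as the old process takes to exit, and a process that never exits is reported as a failure instead of being ignored.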


          • tapto
            Junior Member
            • Nov 2006
            • 28

            #6
            I totally agree. It would be good if this could be fixed.

            Not sure if it's related to the way zabbix is started or stopped in the init.d script?

            Not sure 60 is OK. I tend to use stop and then start manually, as restart is unreliable.

            I have another annoying problem that could be related. When I ssh to a host, manually start the zabbix agent, and exit, my session hangs.

            If these problems could be fixed it would be good. Any suggestions?


            • Alexei
              Founder, CEO
              Zabbix Certified Trainer
              Zabbix Certified Specialist
              Zabbix Certified Professional
              • Sep 2004
              • 5654

              #7
              Both problems (opened stderr and failed restart) will be addressed in 1.1.5.
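
Until a fixed release is installed, the "opened stderr" hang can usually be avoided by starting the daemon with its standard streams detached from the login session. The sketch below shows that general shell technique; it is not taken from the Zabbix sources, and `start_detached` is a hypothetical helper name.

```shell
#!/bin/sh
# start_detached CMD [ARGS...]   (hypothetical helper)
# Run CMD in the background with stdin, stdout and stderr pointed at
# /dev/null, so it keeps nothing open that belongs to the caller's
# terminal, pipe or ssh session.
start_detached() {
    "$@" </dev/null >/dev/null 2>&1 &
}

# Possible use on a remote host (path taken from the init script above):
#   start_detached /usr/sbin/zabbix_agentd
```

An ssh session waits until every process holding its output streams has closed them, which is why a daemon that inherits stderr keeps the session from exiting. The same effect shows up locally with command substitution: `out=$(start_detached sh -c 'sleep 2; echo late')` returns immediately with an empty `out`.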

              • alj
                Senior Member
                • Aug 2006
                • 188

                #8
                Originally posted by Alexei
                Both problems (opened stderr and failed restart) will be addressed in 1.1.5.
                Thank you :-)
                Alexei, please don't forget to fix the same thing in zabbix-server.
                And thanks for the great product.


                • Alexei
                  Founder, CEO
                  Zabbix Certified Trainer
                  Zabbix Certified Specialist
                  Zabbix Certified Professional
                  • Sep 2004
                  • 5654

                  #9
                  There must not be a postrotate option in the script! ZABBIX daemons automatically create a new log file if one does not exist. No need to restart ZABBIX daemons every night!
                  Last edited by Alexei; 16-01-2007, 16:35.
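
Following that advice, the Debian stanza quoted earlier in the thread could be reduced to something like the sketch below. It is the same configuration with the sharedscripts/postrotate block dropped and nothing else changed; `write_agent_logrotate` is a hypothetical helper that only writes the text to a file so it can be inspected before it replaces anything.

```shell
#!/bin/sh
# write_agent_logrotate PATH   (hypothetical helper)
# Write a logrotate stanza for the agent log that rotates the file but
# never restarts the daemon (no postrotate script).
write_agent_logrotate() {
    cat >"$1" <<'EOF'
/var/log/zabbix-agent/zabbix_agentd.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    create 0640 zabbix zabbix
}
EOF
}

# Possible use: write to a scratch file and let logrotate parse it in
# debug mode, which reports what it would do without rotating anything:
#   write_agent_logrotate /tmp/zabbix-agent.logrotate
#   logrotate -d /tmp/zabbix-agent.logrotate
```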