zabbix_server (1.1 alpha 5) fails to start up

  • Mark Ramm-Christensen
    Junior Member
    • Oct 2004
    • 14

    #1

    zabbix_server (1.1 alpha 5) fails to start up

    I am consistently having trouble getting zabbix_server to start.

    When the server fails to start, no zabbix_server.pid file is ever created. The error log shows only the standard startup messages, and then, after one of the trapper processes starts, I get "One child process died. Exiting".

    A single start attempt always fails, but I can always get the server to run by starting zabbix_server several times in quick succession.

    When I do this I get a zabbix_server.pid file with an incorrect pid -- but everything works.

    If it would be helpful I can provide level 4 log files, although they don't seem to have anything particularly useful.

    --Mark Ramm
  • Alexei
    Founder, CEO
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Sep 2004
    • 5654

    #2
    I've never heard of such situations.

    Do the following:

    1. killall zabbix_server
    2. remove zabbix_server.pid
    3. start zabbix_server

    Does the server work?
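
For repeated testing, the three steps above can be wrapped in a small shell function. This is only a sketch: the name clean_restart is made up for illustration, and the binary and PID file locations are passed in by the caller rather than assumed.

```shell
# clean_restart BIN PIDFILE
# Sketch of the three steps: stop every running instance, remove a stale
# PID file, then start the server once. BIN and PIDFILE are supplied by
# the caller (assumptions); nothing here is taken from Zabbix itself.
clean_restart() {
    bin=$1
    pidfile=$2
    # killall exits non-zero when no process matches; that is fine here.
    killall "$(basename "$bin")" 2>/dev/null || true
    # -f keeps rm quiet when the PID file is already gone.
    rm -f "$pidfile"
    "$bin"
}
```

A call might look like clean_restart /opt/zabbix/zabbix_server /tmp/zabbix_server.pid, using the paths mentioned later in this thread.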
    Alexei Vladishev
    Creator of Zabbix, Product manager
    New York | Tokyo | Riga

    • Mark Ramm-Christensen
      Junior Member
      • Oct 2004
      • 14

      #3
      Originally posted by Alexei
      I've never heard of such situations.

      Do the following:

      1. killall zabbix_server
      2. remove zabbix_server.pid
      3. start zabbix_server

      Does the server work?

      I've tried this multiple times. When I kill the processes, the pid file is removed automatically. Then I start the server, and when I check the logs a second or so later, I see the "One child process died" error and everything shuts down. The shutdown log also reports that no zabbix_server.pid file exists.

      I verify that the zabbix_server.pid file does not exist, and try restarting the server. When I check the logs again, I get the same thing. A ps -u zabbix shows nothing (I am not running the agent on this machine).

      However, if I type /opt/zabbix/zabbix_server two or three times very quickly, the zabbix_server.pid file shows up in /tmp and everything runs, but the number in the PID file is 20 or 30 below the actual PID of the last zabbix_server process.

      I recently installed fping and updated the zabbix_server conf file with the location of fping and a new pid location (to match the init.d configuration script). But I don't see how either of those things could be the cause of this problem.

      --Mark



      Here is some information that might be helpful.

      This time I started zabbix_server, looked at the log, and then tried again two more times.

      First, the zabbix_server log:

      003633:20050209:151702 Starting zabbix_server...
      003635:20050209:151702 #server 1 started [Alerter]
      003636:20050209:151702 server #2 started [nodata() calculator]
      003637:20050209:151702 server #3 started [ICMP pinger]
      003639:20050209:151702 server #5 started [Trapper]
      003640:20050209:151702 server #6 started [Trapper]
      003641:20050209:151702 server #7 started [Trapper]
      003633:20050209:151702 One child process died. Exiting ...
      003633:20050209:151702 Got QUIT or INT or TERM or PIPE signal. Exiting...
      003635:20050209:151702 Got QUIT or INT or TERM or PIPE signal. Exiting...
      003636:20050209:151702 Got QUIT or INT or TERM or PIPE signal. Exiting...
      003637:20050209:151702 Got QUIT or INT or TERM or PIPE signal. Exiting...
      003639:20050209:151702 Got QUIT or INT or TERM or PIPE signal. Exiting...
      003639:20050209:151702 Cannot remove PID file [/tmp/zabbix_server.pid] [No such file or directory]
      003638:20050209:151702 Got QUIT or INT or TERM or PIPE signal. Exiting...
      003748:20050209:151750 Starting zabbix_server...
      003750:20050209:151750 #server 1 started [Alerter]
      003752:20050209:151750 server #2 started [nodata() calculator]
      003754:20050209:151750 server #3 started [ICMP pinger]
      003757:20050209:151750 server #5 started [Trapper]
      003759:20050209:151750 server #6 started [Trapper]
      003748:20050209:151750 One child process died. Exiting ...
      003750:20050209:151750 Got QUIT or INT or TERM or PIPE signal. Exiting...
      003752:20050209:151750 Got QUIT or INT or TERM or PIPE signal. Exiting...
      003748:20050209:151750 Got QUIT or INT or TERM or PIPE signal. Exiting...
      003754:20050209:151750 Got QUIT or INT or TERM or PIPE signal. Exiting...
      003756:20050209:151750 Got QUIT or INT or TERM or PIPE signal. Exiting...
      003757:20050209:151750 Got QUIT or INT or TERM or PIPE signal. Exiting...
      003757:20050209:151750 Cannot remove PID file [/tmp/zabbix_server.pid] [No such file or directory]
      003766:20050209:151755 Starting zabbix_server...
      003768:20050209:151755 #server 1 started [Alerter]
      003770:20050209:151755 server #2 started [nodata() calculator]
      003772:20050209:151755 server #3 started [ICMP pinger]
      003776:20050209:151755 server #5 started [Trapper]
      003775:20050209:151755 server #4 started [Sucker. SNMP:ON]
      003778:20050209:151756 server #6 started [Trapper]
      003779:20050209:151755 server #7 started [Trapper]
      The zabbix PID file contains just one line:

      3766
      and the results of a ps -u zabbix are:

      PID TTY TIME CMD
      3768 ? 00:00:00 zabbix_server
      3770 ? 00:00:00 zabbix_server
      3772 ? 00:00:00 zabbix_server
      3775 ? 00:00:00 zabbix_server
      3776 ? 00:00:00 zabbix_server
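
The mismatch described above can be checked mechanically. The sketch below compares the PID stored in the PID file with a list of PIDs supplied by the caller; pidfile_matches is a hypothetical helper name, not part of Zabbix.

```shell
# pidfile_matches PIDFILE PID...
# Exit status 0: the PID recorded in PIDFILE is one of the PIDs given.
# Exit status 1: it is not. Exit status 2: PIDFILE is unreadable or empty.
pidfile_matches() {
    pidfile=$1
    shift
    [ -r "$pidfile" ] || return 2
    # Read the first line; tolerate a missing trailing newline.
    read -r recorded < "$pidfile" || [ -n "$recorded" ] || return 2
    for pid in "$@"; do
        [ "$pid" = "$recorded" ] && return 0
    done
    return 1
}
```

One way to feed it live data is pidfile_matches /tmp/zabbix_server.pid $(ps -u zabbix -o pid=). With the values shown above (3766 in the file; 3768, 3770, 3772, 3775 and 3776 from ps) the function returns 1. In the log, 3766 is the process that printed "Starting zabbix_server...".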
      Last edited by Mark Ramm-Christensen; 09-02-2005, 22:42. Reason: More detailed information


      • Kayou
        Junior Member
        • Jan 2005
        • 17

        #4
        I had this problem once too, and here is what happened:

        The zabbix_server process will not run as the root user.

        You need to run it as a non-root user such as zabbix (just create a user called zabbix or whatever you like). Then modify the startup script so that the process is launched as that user and not as root.

        In my case the startup script had an error in the parameters passed to the launcher, so by default the process tried to start as root, which zabbix_server does not support.

        I'm running SuSE 9.2, and the line that launches the process looks like this:

        startproc -u zabbix -p ${ZABBIX_PID} ${ZABBIX_BIN}

        In the original file the -u option was at the end of the line, which couldn't work.
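
A startup script can also guard against this case explicitly. The following is a minimal sketch, assuming the behaviour described in this post (zabbix_server not running as root); refuse_root is a hypothetical helper, and the optional argument exists only so the check can be exercised without being root.

```shell
# refuse_root [UID]
# Fail with a message when the effective user ID is 0 (root).
# UID defaults to the caller's own ID as reported by "id -u".
refuse_root() {
    uid=${1:-$(id -u)}
    if [ "$uid" -eq 0 ]; then
        echo "refusing to start as root; use a dedicated account such as zabbix" >&2
        return 1
    fi
    return 0
}
```

In an init script this could precede the launch line, for example refuse_root && "${ZABBIX_BIN}", for setups that do not use startproc -u.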

        Hope this helps.

        Kayou


        • petkovsc
          Junior Member
          • Feb 2005
          • 6

          #5
          What does strace/truss report? Make sure zabbix_server is not running. Then run it once as follows:

          Linux:
          strace -f -o trace.log /opt/zabbix/bin/zabbix_server

          BSD/Solaris (if strace isn't installed):
          truss -f -o trace.log /opt/zabbix/bin/zabbix_server

          Upload or post trace.log. Install strace or truss if your system doesn't have it. strace is available for all Linux distributions.
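
A trace of a forking daemon can be long, so it may help to pull out only the failing system calls before posting. This sketch assumes the usual strace line format, in which a failed call ends in "= -1" followed by an errno name; it does not handle truss output, and failed_calls is a made-up name.

```shell
# failed_calls TRACEFILE
# Print the lines of an strace log whose system call returned -1 together
# with an errno name such as ENOENT. grep exits 1 when nothing matches.
failed_calls() {
    grep -E ' = -1 E[A-Z0-9]+' "$1"
}
```

For example, failed_calls trace.log lists candidates such as a failed open or unlink on the PID file.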

          Originally posted by Mark Ramm-Christensen
          However, if I type /opt/zabbix/zabbix_server two or three times very quickly the zabbix_server.pid file shows up in /tmp and everything runs. However the number contained in the PID file is 20 or 30 below the actual PID of the last zabbix_server process.

          When you do this, each time you are telling zabbix_server to daemonize itself. I don't know if zabbix_server is set to look for itself in memory before initializing. Likely one instance is grabbing the pid file and writing to it. One instance meets the specific conditions you apparently need for it to continue running. However the two instances are not the same.
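
To illustrate the idea of an instance checking for itself before initializing, here is one common shell idiom for claiming a PID file so that only the first of several concurrent starts succeeds. It is not how zabbix_server handles its PID file; claim_pidfile is a hypothetical name and the approach is a sketch only.

```shell
# claim_pidfile PIDFILE PID
# Create PIDFILE and store PID in it, but only if PIDFILE does not exist.
# With noclobber (set -C) the ">" redirection refuses to overwrite an
# existing file, so a second caller fails instead of replacing the entry.
claim_pidfile() {
    # The subshell keeps the noclobber setting from leaking to the caller.
    ( set -C; echo "$2" > "$1" ) 2>/dev/null
}
```

A wrapper could run claim_pidfile /tmp/zabbix_server.pid "$$" and exit early when it fails, so that rapid repeated starts leave one owner of the file.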


          • Alexei
            Founder, CEO
            Zabbix Certified Trainer
            Zabbix Certified Specialist
            Zabbix Certified Professional
            • Sep 2004
            • 5654

            #6
            Hi Mark,

            I think I found the cause of the problem.

            Here is a quick fix. In file server.c:

            // Replace!
            //pids=calloc(CONFIG_SUCKERD_FORKS-1,sizeof(pid_t));
            pids=calloc(CONFIG_SUCKERD_FORKS+CONFIG_TRAPPERD_FORKS-1,sizeof(pid_t));

            ....

            // Comment this line

            // pids = calloc(CONFIG_TRAPPERD_FORKS, sizeof(pid_t));

            Recompile zabbix_server and restart it. Let me know if it works.

            Thanks for your report!

            • Mark Ramm-Christensen
              Junior Member
              • Oct 2004
              • 14

              #7
              Originally posted by Alexei
              Hi Mark,

              I think I found cause of the problem ...Recompile zabbix_server and restart it. Let me know if it works.

              Thanks for your report!
              So far so good!

              --Mark


              • Lovespider
                Member
                • Sep 2004
                • 99

                #8
                Works for me too... thank you, Alexei.
