One child process died. Exiting

  • dreas
    Member
    • Aug 2007
    • 89

    #1

    One child process died. Exiting

    I am running the Zabbix server version 1:1.4.5-1. Sometimes it suddenly dies on me:
    -----
    669:20080528:145430 One child process died. Exiting ...
    669:20080528:145432 ZABBIX Server stopped
    -----

    This is a VPS (Xen) only used by Zabbix. How can I further troubleshoot this? When I start the server again things work fine. I really don't want my server to go offline randomly.
  • xs-
    Senior Member
    Zabbix Certified Specialist
    • Dec 2007
    • 393

    #2
    Are you using MySQL 4?

    • dreas
      Member
      • Aug 2007
      • 89

      #3
      No: 5.0.51a-6

      • nelsonab
        Senior Member
        Zabbix Certified Specialist, Zabbix Certified Professional
        • Sep 2006
        • 1233

        #4
        Originally posted by dreas
        I am running the Zabbix server version 1:1.4.5-1. Sometimes it suddenly dies on me:
        -----
        669:20080528:145430 One child process died. Exiting ...
        669:20080528:145432 ZABBIX Server stopped
        -----

        This is a VPS (Xen) only used by Zabbix. How can I further troubleshoot this? When I start the server again things work fine. I really don't want my server to go offline randomly.
        The best I can say is "Good luck!" Unless you can get it to happen on a regular basis there's pretty much no easy way to figure this out. It would be nice if the error said something like, "One poller child died. Exiting..." or something along those lines.
        RHCE, author of zbxapi
        Ansible, the missing piece (Zabconf 2017): https://www.youtube.com/watch?v=R5T9NidjjDE
        Zabbix and SNMP on Linux (Zabconf 2015): https://www.youtube.com/watch?v=98PEHpLFVHM

        • xs-
          Senior Member
          Zabbix Certified Specialist
          • Dec 2007
          • 393

          #5
          Well, the only thing you can really do is use a higher debug level in the config file, so you can see what really happened (I suggest 3 if you do not have this already, 4 if you really want to debug, but don't forget to enlarge the LogFileSize parameter).

          In most cases it's database related ('database has gone away' or some query problem), but I haven't seen anyone on the forum speak of running Zabbix under Xen, so this could point to a problem in a completely different area.
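
A minimal Python sketch for confirming that the two settings mentioned above are actually set before waiting for the next crash. It assumes the config file uses plain `Key=Value` lines with `#` comments; the helper names (`read_settings`, `debug_ready`) and the sample values are made up for illustration and are not part of Zabbix.

```python
def read_settings(conf_text):
    """Return a dict of Key=Value settings, skipping blank lines and # comments."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def debug_ready(settings, min_level=3):
    """True only when DebugLevel is set explicitly and is at least min_level."""
    level = settings.get("DebugLevel")
    return level is not None and int(level) >= min_level


# Illustrative values only, not a recommendation.
SAMPLE_CONF = """\
# debugging a random crash
DebugLevel=4
LogFileSize=50
"""

if __name__ == "__main__":
    parsed = read_settings(SAMPLE_CONF)
    print(parsed, debug_ready(parsed))
```

In practice you would pass the contents of your own server config file instead of `SAMPLE_CONF`.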

          • dreas
            Member
            • Aug 2007
            • 89

            #6
            It's happening fairly regularly now (once every two days or so). I increased the debug level, and I also installed monit to monitor the Zabbix-Server process and restart it automatically if it dies.
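
For readers without monit, the detection half of that watchdog idea can be sketched in a few lines of Python. This is only a rough illustration under assumptions: it is POSIX-only, the PID file path in the usage line is hypothetical, and it does not restart anything; it only reports whether a restart looks necessary.

```python
import os


def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without delivering a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def needs_restart(pid_file):
    """True when the PID file is missing, unreadable, or points to a dead process."""
    try:
        with open(pid_file) as fh:
            pid = int(fh.read().strip())
    except (OSError, ValueError):
        return True
    return not pid_alive(pid)


if __name__ == "__main__":
    # Hypothetical path; use whatever PID file your installation writes.
    print(needs_restart("/var/run/zabbix-server/zabbix_server.pid"))
```

A cron job or loop could call `needs_restart` and run the init script when it returns True, which is roughly what monit does with more safeguards.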

            • erozen
              Junior Member
              Zabbix Certified Specialist
              • Apr 2007
              • 18

              #7
              Do you run a multi-node environment? If so, is all your data up-to-date on all servers?

              I've encountered this a number of times, and usually find it's due to an item I've recently added, or something similar.

              • dreas
                Member
                • Aug 2007
                • 89

                #8
                No, I just have one central zabbix-server.

                • jamied66
                  Member
                  • Sep 2008
                  • 37

                  #9
                  I have a box doing the same thing.

                  I'm getting this from my debug output.

                  32200:20090427:155438 In check_security()
                  32200:20090427:155438 Requested [system.cpu.util[,system,avg1]]
                  32197:20090427:155438 One child process died. Exiting ...
                  32197:20090427:155438 zbx_on_exit() called.
                  32199:20090427:155438 Got signal. Exiting ...
                  32203:20090427:155438 Got signal. Exiting ...
                  32202:20090427:155438 Got signal. Exiting ...
                  32201:20090427:155438 Got signal. Exiting ...

                  This is running agent 1.5.4 on an older Linux system (RHEL, kernel 2.4.21-58.ELsmp).

                  Any ideas are appreciated.
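
One way to narrow down which child died from debug output like the above is sketched here in Python. It is a heuristic, not anything built into Zabbix: it assumes log lines of the form `PID:YYYYMMDD:HHMMSS message`, treats the PID that logs "One child process died" as the parent, and flags any other PID that logged beforehand but never logged "Got signal. Exiting" afterwards. The function name `suspect_children` is made up for illustration.

```python
import re

# PID:YYYYMMDD:HHMMSS message
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6})\s+(.*)$")


def suspect_children(log_text):
    """Map suspect PID -> last message it logged before the parent gave up."""
    last_message = {}   # pid -> last message seen before the parent's exit line
    signalled = set()   # pids that shut down cleanly on the parent's signal
    parent = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        pid, message = int(match.group(1)), match.group(4)
        if "One child process died" in message:
            parent = pid
        elif parent is None:
            last_message[pid] = message
        elif message.startswith("Got signal"):
            signalled.add(pid)
    if parent is None:
        return {}
    return {pid: msg for pid, msg in last_message.items()
            if pid != parent and pid not in signalled}


SAMPLE_LOG = """\
32200:20090427:155438 In check_security()
32200:20090427:155438 Requested [system.cpu.util[,system,avg1]]
32197:20090427:155438 One child process died. Exiting ...
32197:20090427:155438 zbx_on_exit() called.
32199:20090427:155438 Got signal. Exiting ...
32203:20090427:155438 Got signal. Exiting ...
32202:20090427:155438 Got signal. Exiting ...
32201:20090427:155438 Got signal. Exiting ...
"""

if __name__ == "__main__":
    print(suspect_children(SAMPLE_LOG))
```

On the excerpt above this points at PID 32200, whose last logged action was the `system.cpu.util[,system,avg1]` request. A child that never logged anything before dying would be missed, so treat the result as a hint only.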

                  • Cray
                    Member
                    • Mar 2009
                    • 72

                    #10
                    Have you tried upgrading to 1.6.4? As far as I know, 1.5.x versions are betas.

                    • jamied66
                      Member
                      • Sep 2008
                      • 37

                      #11
                      Thanks, that fixed it.

                      The 1.6.4 agent works with no issues.

                      • Vladimir
                        Junior Member
                        • May 2009
                        • 5

                        #12
                        I have the same trouble:

                        13139:20090520:123932 server #12 started [Alerter]
                        13140:20090520:123932 server #13 started [Housekeeper]
                        13140:20090520:123932 Executing housekeeper
                        13141:20090520:123932 server #14 started [Timer]
                        /libexec/ld-elf.so.1: /usr/local/lib/libnetsnmp.so.16: Undefined symbol "dmalloc_strndup"
                        13143:20090520:123932 server #16 started [Node watcher. Node ID:0]
                        13144:20090520:123932 server #17 started [HTTP Poller]
                        /libexec/ld-elf.so.1: /usr/local/lib/libnetsnmp.so.16: Undefined symbol "dmalloc_strndup"
                        13146:20090520:123932 server #19 started [Escalator]
                        13127:20090520:123932 One child process died. Exiting ...
                        13127:20090520:123934 ZABBIX Server stopped. ZABBIX 1.6.4.
                        It started happening after I updated net-snmp to version 5.4.2.1_5.

                        P.S.
                        FreeBSD 7.1-RELEASE
                        Installed ports:
                        zabbix-1.6.4,1
                        net-snmp-5.4.2.1_5
                        php5-snmp-5.2.9
                        php5-5.2.9
                        mysql-server-5.0.77_1

                        P.P.S. Sorry for my bad English
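
A startup log like the one above can be checked mechanically for two clues: lines that do not come from Zabbix itself (here, the dynamic linker errors) and gaps in the "server #N started" numbering. The Python sketch below is a heuristic under assumptions: it assumes Zabbix lines start with `PID:YYYYMMDD:HHMMSS` and that the server numbers would otherwise be contiguous; the function names are made up for illustration.

```python
import re

STARTED_RE = re.compile(r"server #(\d+) started")
ZABBIX_LINE_RE = re.compile(r"^\s*\d+:\d{8}:\d{6}\s")


def missing_server_numbers(log_text):
    """Server numbers absent between the lowest and highest 'started' entries."""
    seen = {int(n) for n in STARTED_RE.findall(log_text)}
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


def foreign_lines(log_text):
    """Non-empty lines that lack the Zabbix PID:date:time prefix."""
    return [ln.strip() for ln in log_text.splitlines()
            if ln.strip() and not ZABBIX_LINE_RE.match(ln)]


SAMPLE_LOG = """\
13139:20090520:123932 server #12 started [Alerter]
13140:20090520:123932 server #13 started [Housekeeper]
13140:20090520:123932 Executing housekeeper
13141:20090520:123932 server #14 started [Timer]
/libexec/ld-elf.so.1: /usr/local/lib/libnetsnmp.so.16: Undefined symbol "dmalloc_strndup"
13143:20090520:123932 server #16 started [Node watcher. Node ID:0]
13144:20090520:123932 server #17 started [HTTP Poller]
/libexec/ld-elf.so.1: /usr/local/lib/libnetsnmp.so.16: Undefined symbol "dmalloc_strndup"
13146:20090520:123932 server #19 started [Escalator]
13127:20090520:123932 One child process died. Exiting ...
13127:20090520:123934 ZABBIX Server stopped. ZABBIX 1.6.4.
"""

if __name__ == "__main__":
    print(missing_server_numbers(SAMPLE_LOG))
    print(foreign_lines(SAMPLE_LOG))
```

On this excerpt, servers #15 and #18 never report "started", and each gap sits right where a linker error appears, which suggests those children failed while loading libnetsnmp rather than crashing later.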

                        • dreas
                          Member
                          • Aug 2007
                          • 89

                          #13
                          I am still periodically seeing this as well (v1.4) when the monitoring server is experiencing high load.

                          • Vladimir
                            Junior Member
                            • May 2009
                            • 5

                            #14
                            In my case Zabbix is simply not running at all...

                            • nelsonab
                              Senior Member
                              Zabbix Certified Specialist, Zabbix Certified Professional
                              • Sep 2006
                              • 1233

                              #15
                              @vladimir What version are you running? I recently saw this happen consistently with 1.6.4. The trapper processes seemed to be very unstable, and as soon as a remote client tried to send something... boom! Zabbix died. The only way I could fix it for now was to reduce the number of trappers to 0, which fortunately was not a big deal in this environment as we don't use them.
