zabbix_server won't start...

  • Kees Jan Koster
    Member
    • Oct 2007
    • 83

    #1

    zabbix_server won't start...

    Dear All,

    My Zabbix server has died and will not restart. I increased the log level on the Zabbix server and here is what I got.

    This seems a little weird to me, since host 10052 is not a regular host, but it is a template. I applied that template to every host I monitor, so I am not willing to delete it and lose my history.

    *) How can I find out more about what is going wrong?

    *) What can I do to correct this issue?



    72655:20080313:100354 Query [select min(clock) from history_uint where itemid=10052]
    72655:20080313:100354 In delete_history(history_str,10052,5177149425365024775,0)
    72655:20080313:100354 Query [select min(clock) from history_str where itemid=10052]
    72655:20080313:100354 In delete_history(history_text,10052,5177149425365024775,0)
    72655:20080313:100354 Query [select min(clock) from history_text where itemid=10052]
    72655:20080313:100354 In delete_history(history_log,10052,5177149425365024775,0)
    72655:20080313:100354 Query [select min(clock) from history_log where itemid=10052]
    72655:20080313:100354 In delete_history(trends,10052,5177149425365025133,0)
    72655:20080313:100354 Query [select min(clock) from trends where itemid=10052]
    72655:20080313:100354 In delete_history(history,10053,5177149425365024775,675013075)
    72655:20080313:100354 Query [select min(clock) from history where itemid=10053]
    72650:20080313:100354 One child process died. Exiting ...
    72652:20080313:100354 Got signal. Exiting ...
    72653:20080313:100354 Got signal. Exiting ...
    72655:20080313:100354 In delete_history(history_uint,10053,5177149425365024775,0)
    72655:20080313:100354 Query [select min(clock) from history_uint where itemid=10053]
    72654:20080313:100354 Got signal. Exiting ...
    72655:20080313:100354 In delete_history(history_str,10053,5177149425365024775,0)
    72655:20080313:100354 Query [select min(clock) from history_str where itemid=10053]
    72656:20080313:100354 Got signal. Exiting ...
    72655:20080313:100354 Got signal. Exiting ...
    72657:20080313:100354 Got signal. Exiting ...
    72658:20080313:100354 Got signal. Exiting ...
    72659:20080313:100354 Got signal. Exiting ...
    72660:20080313:100354 Got signal. Exiting ...
    72661:20080313:100354 Got signal. Exiting ...
    72662:20080313:100354 Got signal. Exiting ...
    72663:20080313:100354 Got signal. Exiting ...
    72664:20080313:100354 Got signal. Exiting ...
    72650:20080313:100356 ZABBIX Server stopped
  • Kees Jan Koster
    Member
    • Oct 2007
    • 83

    #2
    Well, I fixed it ... for some value of "fixed".

    I deleted all non-essential templates and hosts from the configuration and now the Zabbix server starts and runs properly.

    I'm not happy with the workaround, but I'm glad Zabbix starts again.

    If someone wants to debug this, I have a database dump available.


    • schneck
      Member
      • May 2006
      • 62

      #3
      Patches for Debugging

      Originally posted by Kees Jan Koster
      Dear All,
      *) How can I find out more about what is going wrong?
      *) What can I do to correct this issue?
      ...
      72650:20080313:100354 One child process died. Exiting ...

      You might want to look at two patches I submitted last month:

      Patch 8975 will tell you which child died and why (which signal or exit code)
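To illustrate the general idea of reporting which child died and why, here is a minimal Python sketch. It is not the patch itself and not Zabbix code; `describe_exit` and `run_and_report` are hypothetical names made up for this example. It relies on the POSIX behaviour of `subprocess`, where a child killed by signal N is reported with return code -N.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable reason."""
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with code {returncode}"


def run_and_report(argv):
    """Run a child process, wait for it, and say how it ended."""
    proc = subprocess.run(argv)
    return f"child {argv[0]}: {describe_exit(proc.returncode)}"


if __name__ == "__main__":
    # Hypothetical child that simply exits with status 3.
    print(run_and_report([sys.executable, "-c", "import sys; sys.exit(3)"]))
```

A parent that logs this kind of message instead of only "One child process died" makes it much easier to tell a crash (for example SIGSEGV) from a deliberate exit.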

      Patch 8976 will allow you to set the current working directory to something different from /, so the zabbix server can drop a core dump and you will be able to do some post-mortem analysis.
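The reasoning behind the working-directory change can be sketched in a few lines of Python. This is only an illustration under assumptions, not what the patch does internally: `prepare_for_core_dumps` is a hypothetical helper, and whether a core file is actually written (and where) also depends on operating system settings outside the process.

```python
import os
import resource


def prepare_for_core_dumps(directory):
    """Move to a writable directory and raise the core size limit.

    A daemon that runs with / as its working directory usually cannot
    write a core file there, so a crash leaves nothing to analyse.
    """
    # Core files are commonly written to the current working directory.
    os.chdir(directory)

    # Raise the soft core-size limit as far as the hard limit allows.
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))


if __name__ == "__main__":
    # Assumption: /tmp exists and is writable by the current user.
    prepare_for_core_dumps("/tmp")
    print(os.getcwd(), resource.getrlimit(resource.RLIMIT_CORE))
```

If the hard limit is 0, no core file will be produced regardless; in that case the limit has to be raised before the server is started, for example in the shell or init script that launches it.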

      These patches helped me to find out why my server died in some instances ... maybe they will help you, too.

      Best regards,

      \B.
