1.1alpha10 active check causes agent death

  • mvoss
    Junior Member
    • Feb 2005
    • 9

    #1

    1.1alpha10 active check causes agent death

    I've read a few messages in the forum about clients dying when an item is switched from non-active to active. That does seem to happen to me.

    However, my problem is that they also continue to die periodically afterwards, which is a major problem. Restarting the server and the clients does not fix it.

    Usually one of the children dies after choking on an active check, and the client goes away shortly after. Often the active check reported in the log is significantly corrupted: in Example #1 below, the key is supposed to be system[hostname], not em[hostname]. At the same time as the client dies, the server says "Can't ignore signal CHLD, forcing to default." Any ideas? I'm pretty lost.

    Example #1:

    000504:20050709:120203 Active check [em[hostname]] is not supported. Disabled.
    000498:20050709:130329 One child process died. Exiting ...
    000499:20050709:130329 Got signal. Exiting ...
    000500:20050709:130329 Got signal. Exiting ...
    000501:20050709:130329 Got signal. Exiting ...
    000503:20050709:130329 Got signal. Exiting ...
    000502:20050709:130329 Got signal. Exiting ...

    Example #2:
    009962:20050709:214928 In delete_all_metrics()
    009962:20050709:214928 Parsed [diskfree[/logs]:60:0]
    009962:20050709:214928 Key [diskfree[/logs]]
    009962:20050709:214928 Refresh [60]
    009962:20050709:214928 Lastlogsize [0]
    009962:20050709:214928 In add check [diskfree[/logs]]
    009962:20050709:214928 Parsed [0]
    009962:20050709:214928 Key [0]
    009962:20050709:214928 Refresh [(null)]
    009962:20050709:214928 Lastlogsize [(null)]
    009951:20050709:214928 One child process died. Exiting ...
    009954:20050709:214928 Got signal. Exiting ...
    009958:20050709:214928 Got signal. Exiting ...
    009955:20050709:214928 Got signal. Exiting ...
    009956:20050709:214928 Got signal. Exiting ...
    009959:20050709:214928 Got signal. Exiting ...
    009953:20050709:214928 Got signal. Exiting ...
    009961:20050709:214928 Got signal. Exiting ...
    009960:20050709:214928 Got signal. Exiting ...
    009957:20050709:214928 Got signal. Exiting ...
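
In Example #2 the first entry, diskfree[/logs]:60:0, splits cleanly into key, refresh and lastlogsize, while the following entry, "0", leaves refresh and lastlogsize as (null) just before the child dies. The sketch below illustrates that parsing step with a guard for incomplete entries. It is not the Zabbix agent's actual code (the agent is written in C); the names ActiveCheck and parse_active_check are hypothetical, and the key:refresh:lastlogsize layout is an assumption read off the log lines above.

```python
from typing import NamedTuple, Optional


class ActiveCheck(NamedTuple):
    """Hypothetical container for one active-check entry."""
    key: str
    refresh: int
    lastlogsize: int


def parse_active_check(line: str) -> Optional[ActiveCheck]:
    """Parse 'key:refresh:lastlogsize'; return None for malformed entries.

    Assumed format, inferred from the log excerpt only.
    """
    # Split from the right so that a colon inside the key is left alone.
    parts = line.strip().rsplit(":", 2)
    if len(parts) != 3:
        # An entry such as "0" has no refresh or lastlogsize field.
        return None
    key, refresh_text, size_text = parts
    if not key:
        return None
    try:
        refresh = int(refresh_text)
        lastlogsize = int(size_text)
    except ValueError:
        # Non-numeric fields are rejected rather than passed on.
        return None
    if refresh < 0 or lastlogsize < 0:
        return None
    return ActiveCheck(key, refresh, lastlogsize)


if __name__ == "__main__":
    for entry in ("diskfree[/logs]:60:0", "0"):
        print(repr(entry), "->", parse_active_check(entry))
```

With a guard like this, an incomplete entry is skipped instead of being handed on with missing fields, which is one plausible way the crash pattern in the log could be avoided.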
  • Alexei
    Founder, CEO
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Sep 2004
    • 5654

    #2
    I do not think it is still the case with 1.1beta9.
    Alexei Vladishev
    Creator of Zabbix, Product manager
    New York | Tokyo | Riga

    • lcondado
      Member
      • May 2006
      • 37

      #3
      Clue

      The problem remains in Zabbix 1.1.

      I am using a Perl script to manage notifications.

      Here is a clue about the problem:

      Last edited by lcondado; 20-06-2006, 20:33.
