Making agent more resilient?

  • CeeEss
    Senior Member
    Zabbix Certified Specialist
    • Nov 2007
    • 103

    #1

    I have several render servers that regularly hit peak load averages of 15-30 and become unresponsive to everything other than what they're working on. When Zabbix tries to contact these systems' (passive) agents, it gets no response for the items behind the "host name (or data) has changed" triggers, and the trigger fires even though no usable data was returned.

    Host name:

    2011.Oct.11 12:49:27 18.bb
    2011.Oct.11 12:19:32 timeout while executing a shell script
    2011.Oct.11 11:49:27 18.bbl

    I've added a regex to reject "timeout while executing a shell script" for the Host name / Host info triggers. Only these two triggers are affected: agent.ping and agent.version are both unaffected, and even processor load avg[x] reports correctly during timeouts. Can the agent be improved to prevent these false positives?
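
    To illustrate the idea behind the regex workaround, here is a rough Python sketch of the logic only. It is not Zabbix configuration or trigger syntax. The function name `real_changes` and the sample host names are made up for the example. The assumption is that a timeout string should never count as a new value, and that a change is only reported when two successive usable values differ.

```python
import re

# Error text the agent returns instead of data when the check times out.
TIMEOUT_RE = re.compile(r"timeout while executing a shell script")


def real_changes(values):
    """Return (old, new) pairs for genuine value changes.

    `values` is the item history in chronological order. Entries matching
    the timeout message are skipped, so they neither count as a change
    nor replace the last known good value.
    """
    last_good = None
    changes = []
    for value in values:
        if TIMEOUT_RE.search(value):
            continue  # not usable data, ignore it entirely
        if last_good is not None and value != last_good:
            changes.append((last_good, value))
        last_good = value
    return changes


# Hypothetical history: a timeout between two identical host names.
history = ["render01", "timeout while executing a shell script", "render01"]
print(real_changes(history))
```

    With a plain "value differs from the previous one" check, the same history would report two changes: one when the timeout text arrives and another when the real host name comes back.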

    cheers
    Last edited by CeeEss; 02-04-2012, 11:44.