False positives & active agents
  • bbrendon
    Senior Member
    • Sep 2005
    • 870

    #1

    False positives & active agents

    I have many active agents deployed across the internet and I'm trying to eliminate false positives, with little success. The agents check in to the server often and connectivity seems fine, but I still get a lot of false positives.

    The information below leaves me puzzled as to why agents are constantly failing to report data, or reporting it unreliably.

    First, let's assume the Zabbix server and the server running the active agent are both at tier 1 colo facilities. In my case, they are 9 hops apart.

    In this case, they are:
    - all win32 agents.
    - all times are accurate (checked clocks)

    Relatively often, I'll get an event saying a system is down, for example:
    2006.Dec.08 23:36:33 server1 is unresponsive ON

    Here is server1's agent log. Notice that it doesn't show any failure to connect within the 12 minutes (the nodata window) before the trigger was activated.

    [08-Dec-2006 22:54:27] Active checks [Error in connect()]
    [08-Dec-2006 23:07:33] Active checks [Cannot connect to [zabbixserver:10051] [No error]]
    [08-Dec-2006 23:09:06] Active checks [Error in connect()]
    [08-Dec-2006 23:19:36] Active checks [Error in connect()]

    The trigger to alert is:
    {server1:system.cpu.util[].nodata(755)}

    The system.cpu.util item has an update interval of 150 seconds.
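
To make the timing concrete, here is a rough Python sketch of the arithmetic. It uses only the values quoted in this post and assumes the event timestamp marks the end of the no-data window and that the agent and server clocks agree. The variable names are just for illustration, and the month abbreviation parsing assumes an English locale.

```python
from datetime import datetime, timedelta

NODATA_SECONDS = 755    # from the trigger: nodata(755)
UPDATE_INTERVAL = 150   # update interval of system.cpu.util, in seconds

# Event time as shown in the event list.
event_time = datetime(2006, 12, 8, 23, 36, 33)

# Agent log lines copied from the post.
agent_log = [
    "[08-Dec-2006 22:54:27] Active checks [Error in connect()]",
    "[08-Dec-2006 23:07:33] Active checks [Cannot connect to [zabbixserver:10051] [No error]]",
    "[08-Dec-2006 23:09:06] Active checks [Error in connect()]",
    "[08-Dec-2006 23:19:36] Active checks [Error in connect()]",
]


def parse_log_time(line):
    """Extract the timestamp between the first '[' and the first ']'."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S")


# Start of the window in which no data may have arrived for nodata() to fire.
window_start = event_time - timedelta(seconds=NODATA_SECONDS)

# Number of whole update intervals that fit inside the no-data window.
missed_updates = NODATA_SECONDS // UPDATE_INTERVAL

# Logged connection errors that fall inside the window.
errors_in_window = [
    line for line in agent_log
    if window_start <= parse_log_time(line) <= event_time
]

print("window start:", window_start)
print("consecutive updates that must be missing:", missed_updates)
print("logged errors inside the window:", len(errors_in_window))
```

Under those assumptions the window starts at 23:23:58, five consecutive updates would have to go missing, and none of the logged connection errors fall inside the window, which is what makes the event look like a false positive.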

    I shouldn't be getting any false positives, because two colo facilities are almost never unreachable for 12 minutes. The problem occurs frequently, and I'm not sure what to try next other than extending the nodata parameter in the trigger, but 12 minutes should be more than enough.

    Any ideas are appreciated.