nodata trigger providing false positives

  • jameskirsop
    Member
    • Jul 2018
    • 32

    #1

    I've added the following trigger to the default Template_Windows template:

    nodata(/Template_Windows/agent.ping,180)<>0

    The goal is to know when/if the agent becomes unavailable and send a Notification so we can investigate further.

    However, despite knowing that the agent isn't becoming unavailable, I'm seeing Problems generated on this trigger for multiple hosts.

    The hosts in question are in two groups, each monitored by a different proxy. The Queue often shows many items in the 1 minute column, but rarely in the 5 minute column. I could assume that these checks are just building up in the queue and the event is triggered because the Server is still waiting for the latest data, but I'm not sure how to check whether that's actually the case, or whether that's how the nodata trigger works. I'm also not sure why there's so much build-up of items on the proxy: the processes on the proxy all seem to be well underutilised.

    Sample utilisation graph from one proxy: [image: chart2.php.png]

    Is there anything I need to do to tune for nodata to be able to report more accurately?
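
To make the suspicion above concrete (values arriving late rather than the agent being down), here is a minimal sketch in Python. It is a simplified model and not Zabbix's actual implementation: it assumes the check only asks "did any value reach the server in the last N seconds?". The 60 second item interval is an assumption (the post does not state it); the 180 second window comes from the trigger expression. The function names are hypothetical.

```python
def nodata_fires(received_at, now, window):
    """Return True if no value reached the server in the last `window` seconds.

    Simplified model only: `received_at` holds the times (in seconds) at which
    values arrived at the server, not the times they were collected.
    """
    return not any(now - window < t <= now for t in received_at)


def max_tolerable_delay(window, interval):
    """Extra delivery delay a single value can suffer before the gap exceeds `window`."""
    return window - interval


WINDOW = 180    # seconds, from nodata(/Template_Windows/agent.ping,180)
INTERVAL = 60   # seconds, ASSUMED item update interval (not stated in the post)

# Values collected every 60 s and delivered on time.
on_time = [0, 60, 120, 180, 240, 300]
# Same values, but delivery stalls after t=120 and the backlog arrives in one batch at t=330.
stalled = [0, 60, 120, 330, 330, 330]

print(nodata_fires(on_time, 310, WINDOW))      # False: values kept arriving
print(nodata_fires(stalled, 310, WINDOW))      # True: agent was up, delivery was late
print(nodata_fires(stalled, 340, WINDOW))      # False: backlog arrived, problem would resolve
print(max_tolerable_delay(WINDOW, INTERVAL))   # 120
```

Under these assumptions, a delivery stall of a little over two minutes is enough to raise a Problem even though the agent never stopped answering, which would match a queue that fills the 1 minute column but rarely the 5 minute one.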
  • cle
    Junior Member
    • Sep 2022
    • 3

    #2
    Just letting you know that I'm in the same boat - but we're not running a Zabbix Proxy...

    Also, our nodata Trigger is set to 30m. However, I think at least one server a day (out of about 250 monitored) stops reporting back...

    The only thing that helps then is to stop/start (or restart) the Zabbix Agent service.

    All Agents are on version 6.0.6, server is on 6.0.6 but this also has happened on 5.0.x before.

    Oh, and regarding the restart of the Zabbix service: sometimes stopping the Agent times out, and Windows then just reports that it failed to stop it. I then have to start it again manually...

    Somehow, I have a hunch that this behaviour started when I added disk-related items (i.e. size / fill monitoring).

    • jameskirsop
      Member
      • Jul 2018
      • 32

      #3
      Thanks cle. Glad to know I'm not alone.

      I've noticed that a lot of my Problems that I described above have a negative duration - which seems to indicate that the proxy isn't getting the data to the server fast enough. I can't see anything in the Server or Proxy logs indicating a bottleneck - nor can I see anything out of the ordinary in the graphs for the 2 proxies in question.

      My intuition says that my issues are from some form of proxy -> server performance problem, but I'm really struggling to work out what the source of that problem might be.

      • cle
        Junior Member
        • Sep 2022
        • 3

        #4
        Originally posted by jameskirsop
        I've noticed that a lot of my Problems that I described above have a negative duration
        The few cases of negative durations I've seen before were due to servers having the wrong local time set, i.e. the server reporting the problem was a few minutes behind the Zabbix server (NTP not set up and the time had drifted). Not sure if that is the case for you and, as I said before, we're running without proxies...
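
A rough illustration of how a lagging clock could produce a negative duration, as a minimal sketch in Python. It rests on an assumption that is not confirmed anywhere in this thread: that the problem is stamped with the Zabbix server's clock, while the recovery is stamped with the timestamp carried by the incoming value. The 5 minute skew, the dates and the function name are hypothetical.

```python
from datetime import datetime, timedelta


def problem_duration(problem_at, recovery_value_ts):
    """Duration as it would be displayed: recovery timestamp minus problem timestamp."""
    return recovery_value_ts - problem_at


# ASSUMPTION: the problem event is stamped with the server's own clock.
problem_at = datetime(2022, 9, 1, 12, 0, 0)

# The next value is really collected 30 s after the problem was raised.
true_collection_time = problem_at + timedelta(seconds=30)

# Hypothetical skew: the reporting host's clock runs 5 minutes behind the server.
skew = timedelta(minutes=5)
stamped_time = true_collection_time - skew

with_skew = problem_duration(problem_at, stamped_time)
without_skew = problem_duration(problem_at, true_collection_time)

print(with_skew.total_seconds())     # -270.0: recovery appears to precede the problem
print(without_skew.total_seconds())  # 30.0
```

The same arithmetic applies if the value's timestamp is old for another reason, for example because it was collected on time but delivered late, so a negative duration alone does not tell clock drift apart from a delivery delay.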
