Active Agent pauses & continues for some items causing false positives

  • bbrendon
    Senior Member
    • Sep 2005
    • 870

    #1

    Active Agent pauses & continues for some items causing false positives

    Okay. Just caught this agent red-handed. It's the Linux agent, v1.1.6.

    Connection looks like:
    Zabbix Server <--> Internet <--> Firewall/NAT <--> Linux Agent

    The internet connectivity didn't go down. This is monitored very closely using two programs.

    Here are the results from an agent ping item that runs every thirty seconds.
    Code:
    2007.Aug.14 15:05:23	1
    2007.Aug.14 15:04:53	1
    2007.Aug.14 15:04:23	1
    2007.Aug.14 15:03:53	1
    2007.Aug.14 14:47:26	1
    2007.Aug.14 14:46:56	1
    During the large time gap above, almost all items for the server weren't collecting data. There are about 8 agents in total on the same LAN as this one. None of the other systems triggered as being down.
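For anyone who wants to spot gaps like this without eyeballing the history, here is a minimal Python sketch. It assumes the history has been exported as text lines in the format shown above and that the item interval is 30 seconds; `find_gaps` and the `tolerance` factor are names made up for illustration, not anything Zabbix provides.

```python
from datetime import datetime, timedelta

# Ping history as pasted above (timestamp, then value).
HISTORY = [
    "2007.Aug.14 15:05:23 1",
    "2007.Aug.14 15:04:53 1",
    "2007.Aug.14 15:04:23 1",
    "2007.Aug.14 15:03:53 1",
    "2007.Aug.14 14:47:26 1",
    "2007.Aug.14 14:46:56 1",
]

def find_gaps(lines, interval=timedelta(seconds=30), tolerance=2):
    """Return (previous, next, delta) for samples more than interval * tolerance apart."""
    # The first two whitespace-separated fields are the date and the time.
    # "%b" expects English month abbreviations, so this assumes a C/English locale.
    stamps = sorted(
        datetime.strptime(" ".join(line.split()[:2]), "%Y.%b.%d %H:%M:%S")
        for line in lines
        if line.strip()
    )
    limit = interval * tolerance
    return [(a, b, b - a) for a, b in zip(stamps, stamps[1:]) if b - a > limit]

gaps = find_gaps(HISTORY)
for start, end, delta in gaps:
    print(f"no data between {start} and {end} ({delta})")
```

On the six samples above this reports a single gap, from 14:47:26 to 15:03:53 (16 minutes 27 seconds).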

    This happens to other agents as well every so often.

    How can I stop stuff like this from happening?
    Last edited by bbrendon; 15-08-2007, 19:21.
    Unofficial Zabbix Expert
    Blog, Corporate Site
  • Alexei
    Founder, CEO
    Zabbix Certified Trainer
    Zabbix Certified Specialist, Zabbix Certified Professional
    • Sep 2004
    • 5654

    #2
    You may try to increase the number of ZABBIX trappers. It may happen that all trappers are busy at some point, so new connections are not accepted.
    Alexei Vladishev
    Creator of Zabbix, Product manager
    New York | Tokyo | Riga
    My Twitter

    • JoelG
      Member
      • Aug 2007
      • 32

      #3
      trap or ping?

      Is it trappers or pingers? I have this exact problem: the interval between pings is anywhere from 30 seconds (what we want) to as much as 3 minutes (not what we want).

      I have StartPingers=15, but StartTrappers=5.

      Code:
      2007.Aug.15 12:20:06	Up (1)
      2007.Aug.15 12:18:23	Up (1)
      2007.Aug.15 12:16:39	Up (1)
      2007.Aug.15 12:14:52	Up (1)
      2007.Aug.15 12:14:11	Up (1)
      2007.Aug.15 12:12:27	Up (1)
      2007.Aug.15 12:10:43	Up (1)
      2007.Aug.15 12:10:01	Up (1)
      2007.Aug.15 12:07:15	Up (1)
      2007.Aug.15 12:06:33	Up (1)
      2007.Aug.15 12:05:51	Up (1)
      2007.Aug.15 12:04:04	Up (1)
      2007.Aug.15 12:02:23	Up (1)
      2007.Aug.15 12:01:41	Up (1)
      2007.Aug.15 11:59:57	Up (1)
      2007.Aug.15 11:57:11	Up (1)
      2007.Aug.15 11:56:29	Up (1)
      2007.Aug.15 11:55:47	Up (1)
      2007.Aug.15 11:52:53	Up (1)
      2007.Aug.15 11:52:12	Up (1)
      2007.Aug.15 11:51:30	Up (1)
      2007.Aug.15 11:49:46	Up (1)

      • bbrendon
        Senior Member
        • Sep 2005
        • 870

        #4
        Originally posted by Alexei
        You may try to increase the number of ZABBIX trappers. It may happen that all trappers are busy at some point, so new connections are not accepted.
        That was my suspicion as well. After posting I added a simple check item with key value of "tcp,10051" to the host running zabbix_server.
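For reference, a TCP check of this kind boils down to roughly the following. This is a Python sketch of the idea, not how Zabbix implements its simple check; `tcp_port_open` and the 3-second timeout are illustrative assumptions.

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return 1 if a TCP connection to host:port can be opened, else 0."""
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (refused, unreachable, timed out) if it cannot complete.
        with socket.create_connection((host, port), timeout=timeout):
            return 1
    except OSError:
        return 0

# Hypothetical usage against the server's trapper port:
# status = tcp_port_open("zabbix-server.example", 10051)
```

One caveat: a successful connect only shows that the port is accepting TCP connections. It may still succeed while the processes behind the port are too busy to handle the data, so a green result here does not rule out busy trappers.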

        I'm not going to increase trappers at this time. I'm going to wait for this to happen again and hope that this new item correlates with the problem.

        Fingers crossed!

        • bbrendon
          Senior Member
          • Sep 2005
          • 870

          #5
          The simple check mentioned above has reported OK 100% of the time, yet the gaps have happened a few more times since. I have now added an item: net.tcp.port[x.x.x.x,10051]

          This was added to an agent behind the NAT/Firewall. Hopefully this comes up with something.