Zabbix 3.0.x stops SNMP checks after device is disconnected

  • Hobbit
    Junior Member
    • Apr 2016
    • 7

    #1

    Hi all,

    I have a big problem with Zabbix.
    I am checking some devices with SNMP (among other methods).

    When a device is disconnected (powered off, or there is some trouble along the network path), Zabbix stops polling it via SNMP, and the device is not polled by the SNMP agent for ~15 minutes.

    So the question: where can I decrease this timer? ICMP polls are OK, but the SNMP checks are placed in the queue and wait there (they are shown in Administration -> Queue).
    After the ~15 minute timer expires, SNMP polls resume.

    PS: Sorry, English is not my native language.
  • Hobbit
    Junior Member
    • Apr 2016
    • 7

    #2
    Hello, more information:

    19:09 I disconnected the UPS from the network
    190911.957 SNMP agent item "sysInVoltage" on host "xx.xx.144.20" failed: first network error, wait for 5 seconds
    190926.457 SNMP agent item "sysBattCapacity" on host "xx.xx.144.20" failed: another network error, wait for 5 seconds
    /after a few seconds the SNMP polls disappeared/
    19:11 I connected the UPS back to the network and tried snmpwalk from bash - fully working
    194908.082 resuming SNMP agent checks on host "xx.xx.144.20": connection restored

    So the question: why does Zabbix wait more than 30 minutes before trying to poll this UPS again?
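To put a number on the gap in the log excerpt above, here is a minimal sketch that subtracts the first error stamp from the "connection restored" stamp. It assumes the trimmed stamps are in `HHMMSS.mmm` form and fall on the same day; the helper name `parse_stamp` is made up for this illustration.

```python
from datetime import datetime


def parse_stamp(stamp: str) -> datetime:
    """Parse a trimmed log stamp, assumed to be HHMMSS.mmm (same day)."""
    return datetime.strptime(stamp, "%H%M%S.%f")


# Stamps copied from the log lines quoted in the post.
first_error = parse_stamp("190911.957")
resumed = parse_stamp("194908.082")

# Time between the first network error and the resumed SNMP checks.
gap = resumed - first_error
print(gap)
```

Under those assumptions the gap comes out at just under 40 minutes, which matches the "more than 30 minutes" observed.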

    • ovas
      Senior Member
      Zabbix Certified Trainer
      Zabbix Certified Specialist
      Zabbix Certified Professional
      • Apr 2017
      • 138

      #3
      Hello Hobbit!

      Can you please show UnreachablePeriod, UnavailableDelay and UnreachableDelay settings from zabbix_server.conf?

      • Hobbit
        Junior Member
        • Apr 2016
        • 7

        #4
        Hello,

        Sure, I can :-)

        UnreachablePeriod=60
        UnreachableDelay=15
        UnavailableDelay=60
        StartPollersUnreachable=5

        I have about 2500 devices polled with the internal ICMP (ping) check, and that works correctly.
        About 100 devices are monitored via SNMP agent v1.
        The queue is otherwise empty; the only devices displayed in it are those where the SNMP agent fails.
        Last edited by Hobbit; 18-05-2017, 10:55.

        • ovas
          Senior Member
          Zabbix Certified Trainer
          Zabbix Certified Specialist
          Zabbix Certified Professional
          • Apr 2017
          • 138

          #5
          Thank you!

          This is a bit strange, because you have:
          network error, wait for 5 seconds
          UnreachableDelay=15
          UnreachableDelay is exactly how often availability is checked when communication is lost... But that is not what is happening here. I thought you had UnavailableDelay set too high, but that is not the issue.

          Please tell me, do you see the problem with random hardware, or only with one or two specific devices? Are the issues tied to specific item checks, or do all of them go down from time to time?
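As a rough illustration of why the posted settings do not explain the long gap, the sketch below models when availability checks would be expected after the first failure. It assumes the documented meaning of the three parameters (retries every UnreachableDelay seconds until UnreachablePeriod has passed, then every UnavailableDelay seconds). The function `expected_checks` and the 300-second horizon are hypothetical, and this is a simplified model, not Zabbix's actual scheduler.

```python
def expected_checks(unreachable_period, unreachable_delay, unavailable_delay, horizon):
    """Return the seconds after the first failure at which a re-check is expected."""
    checks = []
    t = unreachable_delay
    while t <= horizon:
        checks.append(t)
        # Before UnreachablePeriod elapses the host counts as unreachable;
        # after that it counts as unavailable and is checked less often.
        t += unreachable_delay if t < unreachable_period else unavailable_delay
    return checks


# Values posted in the thread: UnreachablePeriod=60, UnreachableDelay=15, UnavailableDelay=60.
schedule = expected_checks(60, 15, 60, horizon=300)
longest_gap = max(b - a for a, b in zip(schedule, schedule[1:]))
print(schedule, longest_gap)
```

In this model the longest wait between re-checks is 60 seconds, so a pause of tens of minutes would have to come from somewhere other than these three settings.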

          • Hobbit
            Junior Member
            • Apr 2016
            • 7

            #6
            Hello,

            The problem affects all devices polled by the SNMP agent and is hardware independent.

            But I think there is some trouble in the Zabbix queuing system, because a device is NOT polled by the SNMP agent after it encounters two failures (as shown in the log).

            On the network monitor the SNMP polls disappear, so Zabbix is not polling the device.

            I have tried to show it in an image (in the attachment).
            The only traffic shown is SNMP polls from Zabbix to one UPS.
            Polls are shown as blue lines; the gap between them is 15 seconds (OK).
            Orange arrow: the UPS was disconnected from the network (so it is not reachable).
            Blue arrow: after the first error Zabbix tried a second query with a 5-second delay (OK). The next poll is the last one, and then polling stops for 20-40 minutes.
            Black arrow: at this moment the UPS was connected back to the network.

            • Hobbit
              Junior Member
              • Apr 2016
              • 7

              #7
              Any ideas?
              Thanks.
