Proxy - Server - Env Problem

  • mjpirez
    Junior Member
    • Dec 2014
    • 7

    #1

    Proxy - Server - Env Problem

    Hi everyone,

    I have a test environment with 50 devices being polled via SNMP, using only Template SNMP Generic and Template ICMP Ping. For 5 of the devices I'm also polling interface info (Template SNMP Interface).

    The environment is set up with a server that doesn't collect anything itself and 2 proxies: one idle (polling only itself) and the other polling everything mentioned above.

    - server: 4 cores, 4 GB of RAM, MySQL, Ubuntu 14.04
    - proxies: 4 cores, 4 GB of RAM, SQLite, CentOS 6.6
    - all of them VMs
    - server and proxies are 1900 miles apart, with an average ping of 54.148 ms

    Everything was installed via official packages. MySQL config was not changed.


    Here is the question: why is my queue not near zero with this setup?

    Server Stats:


    The proxy graph shows no issues in its queue:


    However, the server does:


    If we look at the Queue screen, it says the problem is in the proxy:


    Listing the queue details, we see:


    The latest data shows the item was collected, even though it was listed as scheduled:


    But if we look at some interface graphs, they show gaps:


    These gaps, from what I've read in the forum, indicate problems in the queue, i.e. performance problems.

    This is what is "grinding my gears": none of the machines are showing performance problems (not even 5% CPU utilization), memory is fine, and there are no obvious MySQL problems.

    Is there anything I'm missing that someone could help with?

    Thanks a lot, and sorry for the long post.
  • mjpirez
    Junior Member
    • Dec 2014
    • 7

    #2
    Sorry, forgot to mention: Zabbix 2.4.2


    • gleepwurp
      Senior Member
      • Mar 2014
      • 119

      #3
      Hi,

      I've had some similar experiences. Usually the problem is with the devices being polled, and not the proxies doing the polling.

      Devices that stop answering requests tend to drag up the queue dramatically. We have some huge switches with a huge number of interfaces, so when a device stalls or stops responding to SNMP requests, the queue jumps up dramatically...

      How often are your devices polled? It might be caused by an overly aggressive polling cycle, combined with slow device responsiveness and lots of items to collect...

      Check with an snmpwalk of the devices to see whether they dump info quickly or are really slow.
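
      For example (just a sketch; the community string, address and OID below are placeholders for whatever your devices actually use), timing a walk of the interface table gives a rough feel for how slowly a device answers:

      Code:
      # time how long one device takes to dump its interface table via SNMP v2c
      time snmpwalk -v2c -c public 192.0.2.10 IF-MIB::ifTable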

      Also, using the "bulk request" option for your SNMP items will help with performance.

      Good luck!

      G.


      • mjpirez
        Junior Member
        • Dec 2014
        • 7

        #4
        Hi gleepwurp,

        I had the same thoughts, and I agree with you. But it is such a small environment that this is
        really kind of a disappointment.

        The majority of the items are polled on 1-hour to 1-day cycles, the interfaces
        (about 500 items) on 5-minute cycles. Only the ICMP info is checked at 60-second intervals (and it is not showing in the 'queue overview').
        In the meantime, I've heard that such long intervals (86400 s, i.e. 1 day) are not good practice, and
        that it is better to use 'flexible intervals'. Can anyone confirm that?
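
        (By 'flexible intervals' I mean the per-item delay plus period entries, e.g. polling every 300 s only during
        business hours; the values below are just an illustration, not my actual config:)

        Code:
        Delay: 300        Period: 1-5,08:00-18:00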

        I have enterprise NMS systems polling the same devices on the same network and I am not
        having this issue. snmpwalk is also fine, and bulk request is enabled for every host.

        Although the proxy performance graph shows no issues, I'm trying to tweak zabbix_proxy.conf
        to see if I get some improvement.
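
        For reference, these are the kinds of parameters I'm planning to touch (the values below are just
        guesses for an environment this size, not recommendations; all of them are standard zabbix_proxy.conf
        parameters):

        Code:
        # zabbix_proxy.conf - example values only, restart the proxy after changing
        StartPollers=10            # more SNMP/agent pollers running in parallel
        StartPollersUnreachable=3  # extra pollers for hosts that time out
        StartPingers=3             # ICMP ping processes
        CacheSize=32M              # configuration cache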

        I'll post here if something changes.

        Thanks.


        • gleepwurp
          Senior Member
          • Mar 2014
          • 119

          #5
          Hi,

          There's a small caveat with using the "Bulk SNMP Request" option in that not all devices support it flawlessly.

          If there's a device you're having issues getting values from, then disabling bulk SNMP requests for it might fix that.

          About the only thing that has more than an hourly cycle in my environment is the discovery... everything else is pretty much hourly at the maximum, and 60 seconds at the minimum.

          I've had devices with long cycle times frequently ending up in my "10 minutes+" queue. I suspect that as soon as the snmpget is done, the item is put back in the queue, where it inflates your queue number... I haven't had the opportunity to actually dig in and do some extensive testing though, so my claim here might not be accurate at all...

          If you have a large number of devices, increasing the number of pollers will definitely help!
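
          If you want to confirm the pollers are actually the bottleneck before raising StartPollers, a Zabbix internal item on a host monitored by the proxy can report how busy they are (the key below is the standard internal check, shown here just as a hint):

          Code:
          zabbix[process,poller,avg,busy]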

          G.

