Reducing icmp pinger processes load

  • ILIV
    Junior Member
    • Oct 2012
    • 28

    #1


    Before I say anything, let me say that I've tried various forum search queries and found literally nothing on this topic (wrong choice of keywords?).

    Anyway, my question is about icmp pinger processes load.

    I have a bare metal server running Zabbix 2.0.7, with the Zabbix Server and Agents in their almost pristine default configuration, monitoring a total of 31 hosts on a LAN.

    There are two items that I use to simulate a more or less realistic load from an application's network communication, hence the 1024-byte packet size and the fairly aggressive schedule of 100 packets sent at 20 ms intervals:

    icmppingloss[,100,20,1024,1000]
    icmppingsec[,100,20,1024,1000,avg]
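
    To put those parameters in perspective, here's a back-of-envelope sketch in Python. It assumes the standard icmpping* key order of [target, packets, interval, size, timeout], where interval is the gap between packets in milliseconds and size is the payload in bytes:

    ```python
    # Traffic implied by icmppingloss[,100,20,1024,1000] for a single host:
    # 100 packets of 1024 bytes, one every 20 ms.
    packets = 100
    interval_ms = 20
    size_bytes = 1024

    burst_seconds = packets * interval_ms / 1000   # length of one check's burst
    payload_bytes = packets * size_bytes           # bytes sent in one direction
    rate_bps = payload_bytes * 8 / burst_seconds   # bit rate during the burst

    print(burst_seconds)    # 2.0 seconds per check
    print(payload_bytes)    # 102400 bytes
    print(round(rate_bps))  # 409600, i.e. ~410 kbit/s per host while pinging
    ```

    So each check saturates the wire with roughly 410 kbit/s per host in each direction for about two seconds, which helps explain why the pinger processes stay busy.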

    The problem is that with the default StartPingers setting of 1, the icmp pinger processes are at least 100% busy all the time.

    I set out to drive down this load as far as I can. I've achieved some level of success but ultimately I'm not happy about the results.

    What I tried to do was:

    * change interval from 20 to 100 ms
    * change packet size to default 56 bytes

    These seem like the most important parameters, but they had literally no effect on the icmp pinger processes whatsoever.

    What made a difference, though, was increasing StartPingers to 20 and higher -- 30 is very good and reduces the load from 100% to a mere ~12%.
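
    For anyone else trying this, the change is a single line in zabbix_server.conf (followed by a server restart). StartPingers is a standard server parameter; the value 30 here is just what worked for me:

    ```
    ### Option: StartPingers
    #   Number of pre-forked instances of icmp pingers.
    StartPingers=30
    ```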

    However, increasing the number of pingers causes a higher number of almost exclusively idle open PostgreSQL connections.

    Anyway, this is just to give you an idea of where I'm coming from with the following question.

    I can't really judge whether an average load of 8% (max 15%), produced by the two above-mentioned items at a 120-second interval with StartPingers set to 20, is a good one or a bad one.

    I mean, 8%-15% for 31 hosts seems like a big number. What kind of load is going to be generated on a huge setup with more than a thousand hosts?
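
    Just to make that worry concrete, here's a naive linear extrapolation. It's only a sketch -- it assumes pinger busy time scales linearly with host count, which ignores intervals, timeouts and scheduling:

    ```python
    # Naive linear scaling of icmp pinger "busy" % with host count.
    hosts_now, busy_now = 31, 12.0   # ~12% busy with StartPingers=30
    pingers_now = 30
    hosts_big = 1000

    # Total busy % if nothing else changes:
    busy_big = busy_now * hosts_big / hosts_now
    # Pingers needed to keep per-process load the same:
    pingers_needed = pingers_now * hosts_big / hosts_now

    print(round(busy_big))        # 387 -> way over 100%, so more pingers needed
    print(round(pingers_needed))  # 968 processes, which seems impractical
    ```

    If that linear assumption holds even roughly, a thousand-host setup with these item parameters would need an absurd number of pinger processes, which is why I suspect my configuration rather than Zabbix itself.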

    This thought keeps bothering me lol, because it most likely means that my configuration is far from perfect, but I haven't found any other way to reduce the load except by increasing StartPingers.

    So, can you guys give me examples of how high your icmp pinger load is and describe your Zabbix Server configuration?

    If you know what's wrong with my configuration, don't hesitate to point it out lol
  • Pada
    Senior Member
    • Apr 2012
    • 236

    #2
    We typically don't use ICMP monitoring inside our network, because we're monitoring the errors/loss reported by the routers.

    We do however use ICMP to monitor hosts outside our control to check our Internet connectivity, in which case it's typically 2-3 hosts on the Internet per data center.

    Our typical setting for monitoring latency is every 5 seconds, with the key: "icmppingsec[,3,100,,1500]"
    I did increase our Zabbix server's StartPingers to 5, because as soon as hosts go offline, it cannot accurately keep track of the online hosts. We have about 60 hosts that we monitor in total, across 3 data centers, and the icmp pinger busy value sits at 35%.

    Friends of mine who run a Wireless ISP use Smokeping to graph/monitor their network connectivity, which seems to work quite well.

    If you don't want to go that route and want to reduce the number of idle DB connections, you can always create extra Zabbix proxies with local DBs, because they'll then do the ICMP polling for you.


    • ILIV
      Junior Member
      • Oct 2012
      • 28

      #3
      Thanks for the input, Pada. It was quite interesting to read about your setup. I encourage more people to share this kind of information, so that we can all see what kind of load to expect in different types of configurations.


      • timbo
        Member
        Zabbix Certified Specialist, Zabbix Certified Professional
        • Sep 2013
        • 50

        #4
        Hi ILIV,

        Firstly I'd like to state that I am entirely unqualified to answer this, but I do have a couple of queries.

        Would it possibly be better to monitor the host NIC throughput (via net.if.in, net.if.out and net.if.total, or perf_counter on Windows), thus avoiding flooding the network with (potentially) unnecessary ICMP traffic?
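
        For illustration, the kind of item keys I mean (eth0 is a placeholder for your interface, and you'd typically store these as delta / speed-per-second to get a throughput figure):

        ```
        net.if.in[eth0,bytes]     # inbound traffic
        net.if.out[eth0,bytes]    # outbound traffic
        net.if.total[eth0,bytes]  # inbound + outbound
        ```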

        That is of course if you're looking to monitor. It sounds as if you're interested in load testing, though, for which ping may not be the best tool. Obviously ping is designed to test connectivity and round-trip time, but don't forget that hosts/routers will drop ICMP packets if they're experiencing high load, and the hosts are required to send back exactly the same data that was sent to them.

        Perhaps this is your problem -- you have one server punching out 100 x icmppingloss and 100 x icmppingsec packets to 31 hosts (potentially 6200 packets x 1024 bytes per cycle), then the server needs to receive (and process) all these replies back from the hosts (potentially DDoSing itself). As they all come back, the server needs to read 6200 packets x 1024 bytes to ensure they have arrived unaltered/uncorrupted. I have no idea whether the load created by this would be significant or not -- it may not be -- just a thought.
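
        The arithmetic behind those numbers, for anyone who wants to check it (Python, using the figures from the original post):

        ```python
        # Per-cycle ICMP traffic implied by the original post's two items.
        hosts = 31
        items = 2        # icmppingloss + icmppingsec
        packets = 100    # packets per item check
        size = 1024      # bytes per packet

        total_packets = hosts * items * packets
        total_bytes = total_packets * size

        print(total_packets)                  # 6200 packets each way per cycle
        print(total_bytes)                    # 6348800 bytes
        print(round(total_bytes / 2**20, 1))  # ~6.1 MiB out, same echoed back
        ```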

        I can't suggest the best (or even a better) tool/method to load test your hosts, but from experience I'd normally gather a baseline of acceptable (or current) throughput using a monitoring system such as Zabbix. Then perform a one-off (weekly/monthly/yearly) load test, using something appropriate to your situation/application. Using the results of the load test, you can then identify the loads/levels at which you would like to be notified, and perhaps create a trigger/action in Zabbix accordingly.

        Sorry if I missed the point entirely and ICMP is required in this situation, but personally I limit its use to testing for connectivity, packet loss and latency, not testing network load.

        -Timbo


        • ILIV
          Junior Member
          • Oct 2012
          • 28

          #5
          Simply put, I've seen time and again that a ping with default settings (one 56-byte packet per second) may fail to reliably demonstrate issues on a congested link or an unreliable connection.

          Running ping with a larger packet size is going to give one a better idea of how data-intensive applications are going to do when they're doing their typical business. In most cases, we're all interested in how the actual applications, increasingly data-intensive ones, are going to perform, not just the networking stack.

          In my experience, 100 packets sent out rapidly, at intervals anywhere from 10-100 ms, in adaptive mode and preferably with a large enough packet size, gives me at least an approximate idea of the real state of a link. Anything less may result in an incomplete picture.

          I've just learned from experience that this set of options for ping can make all the difference. Depending on what medium is used -- ethernet, ffx, dsl, wireless, etc., especially wireless -- it can be very important. And thus I use this approach intentionally to test link stability on the LAN.

          It is not about emulating an application workload.
