1.8.3 SNMP performance

  • Peteris
    Member
    • Feb 2010
    • 89

    #1

    1.8.3 SNMP performance

    Hi,

    Is there any way to boost SNMP performance on a Zabbix 1.8.3 server? I have about 2200 SNMP v1 items.

    Total info:
    Number of hosts (monitored/not monitored/templates): 247 (222 / 0 / 25)
    Number of items (monitored/disabled/not supported): 9980 (8762 / 881 / 337)
    Required server performance, new values per second: 138.52

    And the queue is pretty big at this point:



    DB writing seems to be OK:

    I'm planning to add some more, but I'm really concerned about performance!
    Last edited by Peteris; 07-09-2010, 08:28.
  • Alexei
    Founder, CEO
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Sep 2004
    • 5654

    #2
    It does look like an insufficient number of pollers.
    Alexei Vladishev
    Creator of Zabbix, Product manager
    New York | Tokyo | Riga

    • Peteris
      Member
      • Feb 2010
      • 89

      #3
      I could not find a poller variable that is responsible for SNMP items.

      Configuration looks like this at the moment:
      StartPollers=50
      StartTrappers=35


      What variable should I change to boost performance?


      • Peteris
        Member
        • Feb 2010
        • 89

        #4
        Increased StartPollers to 100, still no effect.


        • bashman
          Senior Member
          • Dec 2009
          • 432

          #5
          What is your polling interval? If you increase your minimum polling interval, you may notice an improvement in Zabbix queue performance.
          978 Hosts / 16.901 Items / 8.703 Triggers / 44 usr / 90,59 nvps / v1.8.15


          • Peteris
            Member
            • Feb 2010
            • 89

            #6
            For ~700 items the polling interval is 60 sec, and ~1400 items have a polling interval of 180 sec.

            That's for 15 devices. Is that too much? Please share your experience.
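As a rough back-of-the-envelope check on the numbers in this post, the polling load can be estimated as item count divided by interval. This is only a sketch: it assumes the items are spread evenly across the 15 devices and that each item check costs one SNMP request, neither of which is stated in the thread. The function name is made up for illustration.

```python
# Rough SNMP polling load from the figures quoted in the post.
# Assumptions (not from the thread): items are spread evenly over the
# devices, and every item check is a single SNMP request.

def polls_per_second(groups):
    """groups: iterable of (item_count, interval_seconds) pairs."""
    return sum(count / interval for count, interval in groups)

item_groups = [(700, 60), (1400, 180)]  # approximate counts from the post
devices = 15

total_rate = polls_per_second(item_groups)
per_device_rate = total_rate / devices

print(f"total: {total_rate:.2f} requests/s")
print(f"per device: {per_device_rate:.2f} requests/s")
```

Under those assumptions this comes to roughly 19 requests per second in total, or a bit over one request per second per device. Whether a given switch can sustain that depends on the device.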


            • bashman
              Senior Member
              • Dec 2009
              • 432

              #7
              Well, I would try to increase the 60 seconds interval to 90 seconds and see what happens.

              Do you see any timeout error in zabbix_server.log?

              I think your StartPollers and StartTrappers values are too high. I would try decreasing both to 30 at most.

              • Peteris
                Member
                • Feb 2010
                • 89

                #8
                Changed the interval from 60 to 90 sec.

                I was getting quite a lot of timeouts, so I increased the SNMP timeout from 5 to 15.

                Alexei said that the number of pollers seems to be insufficient. At that point I had 50 running; now I have set the variable to 75.

                Do all changes look ok?
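One way to reason about the timeout change: if a poller handles one check at a time and waits the full timeout on an unresponsive device (an assumption about poller behaviour, not something confirmed in this thread), then a longer timeout lowers the number of failing checks the pollers can get through per second. A minimal sketch with a hypothetical helper name:

```python
# Worst-case throughput when every check times out.
# Assumption: each poller process runs one check at a time and blocks
# for the full timeout when the device does not answer.

def worst_case_checks_per_second(pollers, timeout_seconds):
    return pollers / timeout_seconds

before = worst_case_checks_per_second(pollers=50, timeout_seconds=5)
after = worst_case_checks_per_second(pollers=75, timeout_seconds=15)

print(f"50 pollers, 5 s timeout:  {before:.1f} timed-out checks/s")
print(f"75 pollers, 15 s timeout: {after:.1f} timed-out checks/s")
```

If that assumption holds, tripling the timeout while adding only 50% more pollers halves the worst-case rate, so slow devices could still fill the queue.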


                • bashman
                  Senior Member
                  • Dec 2009
                  • 432

                  #9
                  Yeah, if you say that it's not an IO problem, the changes seem OK.

                  You can try an IO benchmark with: hdparm -t /dev/sda or iotop.

                  If you see high IO, you can try to tune MySQL.

                  If you have a high average number of online users you can tune Apache and Zabbix front-end.

                  • Peteris
                    Member
                    • Feb 2010
                    • 89

                    #10
                    I'm using an Oracle DB server, which is located on another physical machine.

                    Write/Read cache:


                    • bashman
                      Senior Member
                      • Dec 2009
                      • 432

                      #11
                      OK, it seems that you don't have an IO problem, but you could try running iotop on that host and tell me what you see.

                      • Peteris
                        Member
                        • Feb 2010
                        • 89

                        #12
                        I don't have the iotop utility on my server.

                        The queue is still big:


                        Zabbix queue on graph:


                        Alexei, what would be your suggestion?


                        • bashman
                          Senior Member
                          • Dec 2009
                          • 432

                          #13
                          I think you should check how high the IO is on the machine where the DB is located, because high IO can cause poor Zabbix queue performance.

                          • Peteris
                            Member
                            • Feb 2010
                            • 89

                            #14
                            The problem seems to be solved.

                            We went through all the items we were monitoring and deleted those that aren't so critical.

                            From ~150 items (for 1 device) we deleted ~100 (in this case error in/out, discards in/out) and changed the interval for the remaining 50 items (traffic in/out) from 1 min to 2 min. 1 min would probably work, but we took the safest route.

                            We added error in/out and discard items only for those ports that are used as uplinks, with the interval set to 5 min.

                            Conclusion: the problem was not Zabbix performance but SNMP device performance. The device simply can't respond to that many requests. Of course, this depends on the device and its performance. In our case the switches were not the latest, high-performance models.

                            P.S. SNMP v1 was used.
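To put the interval change in numbers, here is a small sketch of the per-device request rate for the 50 traffic items before and after. It assumes one SNMP request per item check. The ~100 deleted items added further load on top of this, but their intervals are not fully stated in the thread, so they are left out.

```python
# Per-device request rate for the 50 traffic items, before and after.
# Assumption: one SNMP request per item check. The deleted error and
# discard items are not included because their intervals are not
# fully stated in the thread.

def request_rate(items, interval_seconds):
    return items / interval_seconds

traffic_items = 50
rate_before = request_rate(traffic_items, 60)   # 1 min interval
rate_after = request_rate(traffic_items, 120)   # 2 min interval

print(f"before: {rate_before:.2f} requests/s per device")
print(f"after:  {rate_after:.2f} requests/s per device")
```

Doubling the interval halves the request rate for those items, and removing the other items reduces the load on each switch further.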


                            • bashman
                              Senior Member
                              • Dec 2009
                              • 432

                              #15
                              Great to hear that!
