1.8.3 SNMP performance


Peteris
06-09-2010, 14:26
Hi,

Is there any way to boost SNMP performance on a Zabbix 1.8.3 server? I have about 2200 SNMP v1 items.

Total info:
Number of hosts (monitored/not monitored/templates): 247 (222 / 0 / 25)
Number of items (monitored/disabled/not supported): 9980 (8762 / 881 / 337)
Required server performance, new values per second: 138.52

And queue is pretty big at this point:

http://i55.tinypic.com/25hh407.jpg

DB writing seems to be OK:

I'm planning to add some more, but I'm really concerned about performance!

Alexei
06-09-2010, 16:49
It does look like an insufficient number of pollers.

Peteris
06-09-2010, 16:56
I could not find a poller variable that is responsible for SNMP items.

Configuration looks like this at the moment:
StartPollers=50
StartTrappers=35


What variable should I change to boost performance?

Peteris
07-09-2010, 08:29
Increased StartPollers to 100, still no effect.

bashman
07-09-2010, 09:38
What is your polling interval? Maybe if you increase your minimum polling interval you'll notice an improvement in Zabbix queue performance.

Peteris
07-09-2010, 09:44
For ~700 items the polling interval is 60 sec, and ~1400 items have a polling interval of 180 sec.

That's for 15 devices. Is that too much? Please share your experience.
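
As a rough check of that load, the figures above can be turned into a request rate. This is only a sketch: it assumes each item costs one SNMP request per polling interval (no bulk requests) and that the items are spread evenly over the 15 devices, neither of which is confirmed in this thread.

```python
# Approximate figures quoted in the post above.
item_groups = [
    (700, 60),    # ~700 items polled every 60 s
    (1400, 180),  # ~1400 items polled every 180 s
]
devices = 15


def polls_per_second(groups):
    """Sum of count/interval, assuming one SNMP request per item per interval."""
    return sum(count / interval for count, interval in groups)


total_rate = polls_per_second(item_groups)
per_device_rate = total_rate / devices  # assumes an even spread across devices

print(f"total: {total_rate:.2f} SNMP requests/s")
print(f"per device: {per_device_rate:.2f} SNMP requests/s")
```

Under those assumptions this comes to roughly 19 requests per second in total, or a bit over one request per second per device.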

bashman
07-09-2010, 09:48
Well, I would try to increase the 60 seconds interval to 90 seconds and see what happens.

Do you see any timeout error in zabbix_server.log?

I think your StartPollers and StartTrappers settings are too high. I would try decreasing both to 30 at most.

Peteris
07-09-2010, 10:05
Changed time from 60 to 90 sec.

I was getting quite a lot of timeouts, so I increased the SNMP timeout from 5 to 15.

Alexei said that it looks like an insufficient number of pollers. At that point I had 50 running; now I have set the variable to 75.
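
To see how the timeout and the poller count interact, here is a back-of-the-envelope estimate using Little's law (busy pollers = request rate x time each request occupies a poller). It is a sketch, not a measurement: it assumes each poller handles one SNMP request at a time, ignores retries, reuses the original 60 s / 180 s intervals, treats the timeout values as seconds, and the 0.1 s "fast reply" figure is purely hypothetical.

```python
import math


def pollers_needed(requests_per_second, seconds_per_request):
    """Little's law: average number of pollers busy at the same time."""
    return math.ceil(requests_per_second * seconds_per_request)


# ~700 items at 60 s plus ~1400 items at 180 s, one request per item (assumed).
rate = 700 / 60 + 1400 / 180

scenarios = [
    ("fast reply (hypothetical 0.1 s)", 0.1),
    ("every request waits out a 5 s timeout", 5),
    ("every request waits out a 15 s timeout", 15),
]

for label, seconds in scenarios:
    print(f"{label}: {pollers_needed(rate, seconds)} pollers")
```

The worst cases are deliberately pessimistic (every request timing out), but they suggest why a slow device can keep the queue full however many pollers are started, and why a longer timeout ties each poller up for longer.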

Do all changes look ok?

bashman
07-09-2010, 10:11
Yeah, if you say that it's not an IO problem, the changes seem OK.

You can try an IO benchmark with: hdparm -t /dev/sda or iotop.

If you see high IO, you can try to tune MySQL.

If you have a high average number of online users you can tune Apache and Zabbix front-end.

Peteris
07-09-2010, 10:19
I'm using an Oracle DB server, which is located on another physical machine.

Write/Read cache:
http://i54.tinypic.com/1zgxlvr.jpg

bashman
07-09-2010, 10:21
OK, it seems that you don't have an IO problem, but you could try running iotop on that host and tell me what you see.

Peteris
07-09-2010, 13:54
I don't have the iotop utility on my server.

Queue is still big:
http://i55.tinypic.com/16ksu1k.jpg

Zabbix queue on graph:
http://img825.imageshack.us/img825/5815/qgraph.png

Alexei, what would be your suggestion?

bashman
08-09-2010, 13:25
I think you should check how high the IO is on the host where the DB is located, because high IO can cause bad Zabbix queue performance.

Peteris
08-09-2010, 13:41
Problem seems to be solved.

We went through all the items we had monitored and deleted those that aren't so critical.

Of the ~150 items (per device) we deleted ~100 (in this case errors in/out and discards in/out) and changed the interval for the remaining 50 items (traffic in/out) from 1 min to 2 min. 1 min would probably work, but we went the safest way.

We added error in/out and discard items only for those ports that are used as uplinks, with the interval set to 5 min.
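
For a sense of scale, the per-device request rate before and after the cleanup can be estimated as below. Several inputs are assumptions rather than figures from this thread: the ~100 deleted items are taken to have been polled every 180 s, one SNMP request per item is assumed, and the count of 8 re-added uplink items is purely hypothetical.

```python
def device_rate(groups):
    """Requests per second for one device, given (item_count, interval_s) pairs."""
    return sum(count / interval for count, interval in groups)


# Before: ~50 traffic items at 60 s plus ~100 error/discard items
# (180 s interval assumed, not stated in the post).
before = device_rate([(50, 60), (100, 180)])

# After: 50 traffic items at 120 s plus a hypothetical 8 uplink
# error/discard items at 300 s.
after = device_rate([(50, 120), (8, 300)])

print(f"before: {before:.2f} requests/s per device")
print(f"after:  {after:.2f} requests/s per device")
print(f"reduction: {100 * (1 - after / before):.0f}%")
```

With these inputs the load on each switch drops to about a third of what it was; the exact ratio depends on the assumed intervals and uplink count.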

Conclusion: the problem was not Zabbix performance; it was an SNMP device performance issue. The device simply can't respond to that many requests. Of course it depends on the device and its performance. In our case the switches were not the latest, high-performance models.

P.S. SNMP v1 was used.

bashman
09-09-2010, 08:47
Great to hear that!