I am using Zabbix 2.2 and would like to set up SNMP monitoring for all servers. Is there an easy way to aggregate hrProcessorLoad into something similar to system.cpu.load[percpu,avg1], which is the CPU load average across all CPUs? I really need this in order to set up a generic alert on all SNMP-monitored hosts that warns the administrator of exceptionally high aggregate CPU usage.
SNMP CPU hrProcessorLoad aggregation
-
Yes please. I spent time trying to find a way to do this, looking at the group aggregation functions, etc., without luck. Much of the Zabbix documentation and forum posts relate to the agent, but in some environments putting an agent on a server is not acceptable. And in others it would still be helpful (if I recall correctly, some of Cisco's WAAS devices are multi-CPU, for example).
Practically no Windows software uses CPU affinity, so average CPU across cores is much more useful.
I'm off to look at scripting it manually, but if there's a point-and-click solution, please!
-
Loadable modules to access existing data collected?
I've been playing with this. I've written a loadable module which, when given a base OID (e.g. HOST-RESOURCES-MIB::hrProcessorLoad), walks that part of the MIB to the end of the indexed section and averages the values it finds. It's pretty basic (at the moment it only handles integers, and it is light on error handling and timeouts), but it works and provides average CPU.
But it occurs to me that this might not be the best approach, and wonder if others have another approach based on LLD data collected.
For example, if you use the default SNMP templates, LLD will find all the processors and collect performance data for them. Maybe what this loadable module should do, instead of querying for new SNMP data, is run a database operation on the existing data to produce a new (aggregated) key. That way it could do rolling averages (etc.) based on prior data, not just the latest value, and it wouldn't need to re-query the SNMP data.
Has anyone run across modules like that, maybe an example? I'm about to give it a try, but would rather not reinvent the wheel. In particular, is there any way to capitalize on Zabbix's database connections so as to write a module that is reasonably database independent, i.e. let the existing routines do all the data access work that's DB specific?
PS. Happy to share the nascent SNMP routine if anyone wants to critique or use it, but it's not nearly production-ready; it was just a proof of concept.
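For anyone who wants to experiment with the walk-and-average idea without writing a loadable module, here is a minimal Python sketch. It is not the module described above. It assumes you already have the text output of net-snmp's `snmpwalk` for the base OID, in the usual `NAME.index = INTEGER: value` form, and like the proof of concept it only handles integers. The names `average_walk`, `LINE_RE` and `SAMPLE` are made up for illustration.

```python
import re

# One row of snmpwalk text output with an integer value, for example:
#   HOST-RESOURCES-MIB::hrProcessorLoad.196608 = INTEGER: 12
# Assumption: a single numeric index follows the base OID name.
LINE_RE = re.compile(
    r"^(?P<oid>\S+)\.(?P<index>\d+)\s*=\s*INTEGER:\s*(?P<value>-?\d+)\s*$"
)


def average_walk(walk_output, base_oid):
    """Average the integer values found under base_oid.

    Returns None when no matching rows are present, so the caller can
    report "no data" rather than a misleading 0.
    """
    values = []
    for line in walk_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("oid") == base_oid:
            values.append(int(match.group("value")))
    if not values:
        return None
    return sum(values) / len(values)


# Hypothetical walk of a four-processor host.
SAMPLE = """\
HOST-RESOURCES-MIB::hrProcessorLoad.196608 = INTEGER: 10
HOST-RESOURCES-MIB::hrProcessorLoad.196609 = INTEGER: 20
HOST-RESOURCES-MIB::hrProcessorLoad.196610 = INTEGER: 30
HOST-RESOURCES-MIB::hrProcessorLoad.196611 = INTEGER: 40
"""

if __name__ == "__main__":
    print(average_walk(SAMPLE, "HOST-RESOURCES-MIB::hrProcessorLoad"))
```

Because the number of rows is counted at parse time, nothing is hard-coded about how many processors the host has. Something like this could probably be fed from `snmpwalk` in a wrapper script, but I have not tested that wiring.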
-
The problem is knowing how many to use. The above works fine for a single host that you hard-code (well, until someone flips the hyper-threading switch in the BIOS, or adds CPUs), but it doesn't work well for a template, especially if what you want to do is divide. I'm trying very hard to approach Zabbix as "if I can't do it in a template that is generally useful, I'm doing it wrong".
I want an alarm for CPU only if overall the load is excessive for a period of time. So I need to know load averaged over both all processors and for some rolling time period.
If calculations could take wildcards that would work fine, but I don't think they can. Can they?
I spent some time looking at database operations, e.g. in the ODBC component, and while I can make it work, I worry about the whole issue of database dependency. Same if you do the connection yourself in a loadable module, though there (maybe) you could cache them. Plus the database overhead, possible locking conflicts.
I think I'm going to go back to making the SNMP query more production-ready so it can do averages and sums. I left it running for about 24 hours to see if it's stable, and so far nothing bad has happened. What I'd do then is remove the processor template entirely and use this instead, since I really don't care about individual processor performance.
Disks, memory, and ports really are different. I suspect that for those, aggregates are relatively unimportant and the individual items matter. It's really CPU, I think, that is a special case, though I guess there may be other indexed SNMP structures that behave similarly.
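To make the requirement concrete (alarm only when the load, averaged over all processors, stays excessive over a rolling period), here is a small Python sketch of the arithmetic. It is only an illustration of the calculation, not something Zabbix provides. The class name `RollingCpuAverage`, the window size and the thresholds are all made up.

```python
from collections import deque


class RollingCpuAverage:
    """Keep the last `window` polls, each reduced to one cross-CPU average."""

    def __init__(self, window):
        # deque drops the oldest sample automatically once it is full
        self.samples = deque(maxlen=window)

    def add(self, per_cpu_loads):
        """Record one poll: a list with one load percentage per processor."""
        if not per_cpu_loads:
            return  # skip an empty poll instead of dividing by zero
        # The divisor is whatever the poll returned, so a changed CPU count
        # (hyper-threading toggled, CPUs added) needs no reconfiguration.
        self.samples.append(sum(per_cpu_loads) / len(per_cpu_loads))

    def average(self):
        """Mean of the stored cross-CPU averages, or None with no data."""
        if not self.samples:
            return None
        return sum(self.samples) / len(self.samples)

    def is_overloaded(self, threshold):
        """True only when the window is full and its mean exceeds threshold."""
        if len(self.samples) < self.samples.maxlen:
            return False
        return self.average() > threshold


if __name__ == "__main__":
    rolling = RollingCpuAverage(window=3)
    for poll in ([100, 0], [80, 60], [90, 90]):
        rolling.add(poll)
    print(rolling.average(), rolling.is_overloaded(60))
```

Refusing to alarm until the window is full is a design choice: it keeps a single busy poll right after startup from triggering on its own.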