I just upgraded to 1.8.5 and love the new process monitoring. I immediately set about creating some monitoring items to see whether I could adjust my number of Pollers, Pingers, Trappers, Syncers, etc. to streamline the install and hopefully reduce unnecessary resource consumption.
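For anyone who wants to set up the same kind of monitoring: the process statistics are exposed as internal items with keys of the form zabbix[process,<type>,<mode>,<state>]. Here is a minimal sketch that builds the "average busy" key for the process types mentioned above. The exact spelling of the type names is my assumption and should be checked against the documentation for your version, and the helper names are made up for illustration.

```python
# Process types mentioned in this post. The spellings are an assumption;
# verify them against the internal-items documentation for your version.
PROCESS_TYPES = ["poller", "icmp pinger", "trapper", "history syncer", "timer"]


def busy_item_key(process_type, mode="avg", state="busy"):
    """Build an internal item key: zabbix[process,<type>,<mode>,<state>]."""
    return "zabbix[process,{0},{1},{2}]".format(process_type, mode, state)


def busy_item_keys(process_types=PROCESS_TYPES):
    """Return one 'average busy' key per process type."""
    return [busy_item_key(p) for p in process_types]


if __name__ == "__main__":
    for key in busy_item_keys():
        print(key)
```

Each key can then be added as an item of type "Zabbix internal" and graphed per process type.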
I found something unusual very quickly. In the 12 hours since I got everything running, my timer process has been hovering around 50% busy, with some spikes to 70%, and one spike to 100%!
The graph for 12 hours reveals this:
Code:
Last  Min  Avg.   Max
72    42   49.01  100
The documentation says this:
timer - process for evaluation of time-related trigger functions and maintenances
Our current setup is:
Code:
Zabbix server is running                                   Yes     localhost:10051
Number of hosts (monitored/not monitored/templates)        740     642 / 0 / 98
Number of items (monitored/disabled/not supported)         26807   26712 / 45 / 50
Number of triggers (enabled/disabled)[problem/unknown/ok]  7187    7166 / 21 [2 / 197 / 6967]
Number of users (online)                                   60      4
Required server performance, new values per second         434.64  -
Note: Most of the unknown triggers are from checks which only run daily and got missed by the Zabbix server restart. They're trivial (What version of zabbix_agentd, etc).
I created some scripts which automate the creation of maintenance windows. I built this based around the php code in the UI. It's worked well for us.
At any rate, I cleaned out the old crufty maintenances and it didn't reduce the timer usage at all (which was not unexpected). That leads me to think that it's all about the triggers with time functions.
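To get a feel for how many triggers are involved, something like the following could count the expressions that use a time-based function. This is only a rough sketch: it assumes the expressions are available as text in the form the frontend displays them ({host:key.function(param)}), the set of time-based function names is my reading of the documentation rather than something taken from the server code, and the sample expressions are made up.

```python
import re

# Function names assumed to be time-based (my reading of the docs; adjust as needed).
TIME_FUNCTIONS = frozenset(["nodata", "date", "dayofweek", "time", "now"])

# Matches the ".function(param)}" tail of each {host:key.function(param)} term.
FUNCTION_RE = re.compile(r"\.([a-z]+)\([^)]*\)\}")


def functions_used(expression):
    """Return the set of trigger function names found in an expression."""
    return set(FUNCTION_RE.findall(expression))


def uses_time_function(expression):
    """True if any function in the expression is in TIME_FUNCTIONS."""
    return bool(functions_used(expression) & TIME_FUNCTIONS)


def count_time_triggers(expressions):
    """Count the expressions that use at least one time-based function."""
    return sum(1 for e in expressions if uses_time_function(e))


if __name__ == "__main__":
    # Hypothetical expressions, for illustration only.
    sample = [
        "{web01:agent.ping.nodata(300)}=1",
        "{db01:system.cpu.load[,avg1].last(0)}>5",
        "{web01:system.localtime.time(0)}>230000",
    ]
    print(count_time_triggers(sample), "of", len(sample), "use a time function")
```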
Question: What happens to trigger checks when the timer process is maxed out? There's no option to add timers without hacking the code (which I wouldn't do), so I would like to know how such a high amount of load would be handled.
It is quite possible that doubling our current workload would put much more strain on the timer process. Is anyone else operating under a similar strain on their timer process? I'm very curious now that 1.8.5 allows for trending and monitoring of this.
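As a back-of-the-envelope check on the doubling scenario, here is a small sketch that projects the busy percentage from the graph values above. It assumes that timer busy time scales linearly with workload, which I have not verified, so treat the result as a rough guide only. The function names are made up for illustration.

```python
def projected_busy(current_busy_pct, workload_factor):
    """Linearly scale a busy percentage, capped at 100 (fully busy)."""
    return min(100.0, current_busy_pct * workload_factor)


def headroom_factor(current_busy_pct):
    """How many times the workload could grow before hitting 100% busy,
    under the same linear-scaling assumption."""
    return 100.0 / current_busy_pct


if __name__ == "__main__":
    avg_busy = 49.01   # "Avg." value from the 12-hour graph
    last_busy = 72.0   # "Last" value from the 12-hour graph
    print("avg at 2x workload:  %.2f%%" % projected_busy(avg_busy, 2))
    print("last at 2x workload: %.2f%%" % projected_busy(last_busy, 2))
    print("headroom from avg:   %.2fx" % headroom_factor(avg_busy))
```

Under that assumption, doubling the workload would push the average to roughly 98% busy, leaving essentially no room for the spikes already visible in the graph.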