Hello,
I have been using Zabbix for a few months now (keeping up with the latest releases along the way) and have always found it to have a glitch where items go into unsupported mode. Graphs are also occasionally broken for no apparent reason (especially ping graphs).
I could always set up a daily script to reset them (and I know that this is on the developers' to-do list), but I cannot figure out why there would be a problem in the first place.
- Ensured the network was always OK; I had some ping test issues but they have since gone.
- Checked the zabbix_agentd log files for any signs, but nothing comes up in "show all warnings" mode. Only debug mode gives any decent detail, and even then I never find anything useful beyond the item being returned as unsupported.
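For anyone wanting to sift a debug-level log for these entries, here is a minimal sketch in Python. It assumes nothing about the log format beyond the word "unsupported" appearing on the relevant lines, and the function name is made up for illustration.

```python
def find_unsupported(lines, needle="unsupported"):
    """Return (line_number, text) pairs for lines that mention the needle."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Case-insensitive match, since the exact log wording is not known.
        if needle.lower() in line.lower():
            hits.append((number, line.rstrip("\n")))
    return hits
```

It takes any iterable of lines, so an open log file can be passed in directly; the surrounding line numbers then show what the agent was doing just before an item was flagged.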
Steps taken to improve reliability:
> Enabled NoTimeWait on Client and Server. Not sure what this does but it seems to help with FreeBSD open connections, network traffic now "feels" healthier.
> Slowly raised timeouts until I hit 30 secs (50% improvement)
> Optimised any custom scripts so they output as fast as possible (<1s).
> Staggered Zabbix items so they don't all hit on 5, 10 or 30 second increments (40% improvement). Why is this needed, though? Wouldn't Zabbix do this automatically?
> Generally lowered frequency checks - 35% improvement
> Optimised system (make world on BSD) - 10% improvement
> Increased StartSuckers to 25
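To illustrate why staggering the intervals might help, here is a rough sketch in Python. It is a deliberately simplified model, not how Zabbix actually schedules checks: it assumes every item starts at t=0 and is due at exact multiples of its interval, and the function name and the example intervals 7, 11 and 31 are made up for illustration.

```python
from collections import Counter


def peak_load(intervals, horizon):
    """Return the largest number of checks that fall due in the same second."""
    due = Counter()
    for interval in intervals:
        # A check with this interval is due at interval, 2*interval, ...
        for t in range(interval, horizon + 1, interval):
            due[t] += 1
    return max(due.values(), default=0)


# Intervals that are multiples of each other pile up on the same second.
print(peak_load([5, 10, 30], 60))  # 3
# Intervals with no common factor rarely coincide within a minute.
print(peak_load([7, 11, 31], 60))  # 1
```

Under this model, intervals of 5, 10 and 30 seconds all fall due together every 30 seconds, while 7, 11 and 31 never coincide within the first minute.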
All of the server items are configured to poll the clients. System load is generally very good (~0.30), with no flakiness anywhere. Running FreeBSD 5.4-REL and MySQL 5 (same problems with 4.x). I am only monitoring 5-6 servers at the moment (until I can get stability up).
Can anyone offer ideas to make it "just work", so that when I need some stats or triggers I don't have to worry about whether they will be OK?
Thanks!