I've been encountering some hurdles while managing a large-scale Zabbix environment with thousands of devices and significant data throughput. The specific issues revolve around scaling, performance, high availability (HA), and maintenance. Despite going through the documentation, I'm seeking insights and practical advice from the experienced members of this community.
Here are some specific details on the challenges I'm facing:
Scalability: As the number of devices increases, I'm noticing potential scalability issues. What are the best practices for scaling Zabbix in such environments?
Performance: With a high volume of data (1K+ values per second), I'm observing performance bottlenecks. Any tips on optimizing performance for large-scale deployments?
High availability: Ensuring uninterrupted monitoring is crucial. What strategies do you recommend for achieving high availability in Zabbix?
Maintenance: Routine maintenance tasks become more complex in large environments. How do you manage maintenance activities efficiently without impacting monitoring?
I'm eager to hear from those who have successfully managed Zabbix in similar large-scale setups. Your experiences, suggestions, and best practices would be incredibly valuable.
Please describe what you already have: hardware setup, versions, etc. It would be easier to give suggestions if any obvious bottlenecks are visible.