Hi everyone, I have a question I hope you can help me with. We currently run a node-based distributed Zabbix configuration with 8 regional child nodes and a single central master. We monitor about 1,500 machines and 70,000 items, and the master processes about 800 values per second.
We are looking to move to a proxy-based configuration due to some parent-child replication problems we've run into, and also because nodes are unsupported and are going away in the near future. However, our biggest concern is that a proxy-based configuration seems a bit less 'survivable' than nodes. We have all email actions configured on the child nodes so that if we lose the master, we still have visibility and emails on all nodes. We just lose the single pane of glass (if that makes any sense).
I was hoping to hear some 'success stories' from other Zabbix users that run a proxy-based distributed config in an environment our size or larger. Specifically, how is everything from an availability perspective? Do you find that a 2-node HA master is 'enough' redundancy? Does replication between the two nodes keep up with your workload? Anything else to add?
It would be great to hear about other people's experiences before we take the plunge ourselves. Let me know what you think. Thanks for the help!