Hello,
I am looking for advice and best practices on Zabbix regarding HA, scalability and security.
I'm working on a project in which we will automatically deploy VMs on an IaaS provider. At first we will only have to manage hundreds of virtual servers, but I hope we can grow to thousands of hosts over the next few years.
I plan to install the infrastructure services (monitoring, config management, databases) on dedicated servers:
- 2 Zabbix servers (active/passive)
  Intel Xeon W3520
  24 GB DDR3
  Software RAID1 on SATA2 disks
- 2 DB servers, MySQL master/slave replication
  Intel Xeon E5-1620
  64 GB DDR3 ECC
  Software RAID1 on 2x120 GB SSD
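To make the sizing question concrete, here is a rough Python sketch of the load I expect, expressed as new values per second. The host counts (300 and 3000), the 100 items per host and the 60-second interval are only my assumptions to illustrate "hundreds" and "thousands" of hosts, not measurements.

```python
def new_values_per_second(hosts: int, items_per_host: int, interval_s: float) -> float:
    """Estimate how many values the server has to store per second."""
    return hosts * items_per_host / interval_s


if __name__ == "__main__":
    # Hypothetical figures: 100 items per host, each polled every 60 seconds.
    for hosts in (300, 3000):
        nvps = new_values_per_second(hosts, items_per_host=100, interval_s=60)
        print(f"{hosts} hosts: about {nvps:.0f} new values per second")
```

If the real item count or interval is different, the estimate scales linearly with both.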
I've read docs about MySQL replication based on DRBD, but I think that running backups on the database would then impact production performance. That would not be the case if we took the dumps on a slave. How do people handle this in large environments?
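What I have in mind for the dumps is roughly the following Python sketch. It assumes the tables are InnoDB (so that --single-transaction gives a consistent snapshot without locking), that the credentials come from the usual MySQL option file, and that the database is named "zabbix". The host name and output path are placeholders.

```python
import subprocess
from typing import List


def build_dump_command(slave_host: str, database: str) -> List[str]:
    """Build a mysqldump command that targets the slave, not the master."""
    return [
        "mysqldump",
        f"--host={slave_host}",
        "--single-transaction",  # consistent snapshot, assumes InnoDB tables
        "--quick",               # stream rows instead of buffering whole tables
        database,
    ]


def dump_to_file(slave_host: str, database: str, path: str) -> None:
    """Run the dump and write it to a file; raises if mysqldump fails."""
    with open(path, "wb") as out:
        subprocess.run(build_dump_command(slave_host, database), stdout=out, check=True)


if __name__ == "__main__":
    # Placeholder host name and path, not my real setup.
    dump_to_file("db-slave.example.org", "zabbix", "/var/backups/zabbix.sql")
```

This way the master only pays for the replication traffic, and the dump load stays on the slave.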
About security: at first, the VMs will be in a public cloud and the dedicated servers won't be hosted by the same provider, so I need to secure the communication between the Zabbix server and the VMs. I know 3 ways to do this:
- patch Zabbix
- proxy + stunnel
- ssh checks
I won't patch Zabbix, as I want to use the packages (1.8.11) provided by the Linux distribution we will use, in order to benefit from security fixes. I also don't have the skills to patch it myself.
The proxy could also be a VM hosted by the same provider as the VMs to monitor. Depending on the provider (Rackspace, for example), we could use a private network for the communication between the proxy and the VMs, but the traffic would still be in clear text. So could we also run SSH checks from a proxy?
Or the SSH checks could be launched directly by the server.
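To show what I mean by an SSH check, here is a small Python sketch of a poll started from the server (or proxy) side over SSH. This is not how Zabbix implements its SSH checks, it only illustrates the idea. The host, user, key path and remote command are hypothetical.

```python
import subprocess
from typing import List


def build_ssh_check(host: str, user: str, key_path: str, remote_command: str) -> List[str]:
    """Build an ssh command that runs one check on the monitored VM."""
    return [
        "ssh",
        "-i", key_path,
        "-o", "BatchMode=yes",       # never prompt for a password
        "-o", "ConnectTimeout=10",   # give up quickly if the VM is unreachable
        f"{user}@{host}",
        remote_command,
    ]


def run_ssh_check(host: str, user: str, key_path: str, remote_command: str) -> str:
    """Run the check and return its output; raises on failure or timeout."""
    cmd = build_ssh_check(host, user, key_path, remote_command)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=30)
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical VM, account and key, only for illustration.
    print(run_ssh_check("vm1.example.org", "monitor", "/etc/zabbix/id_rsa", "cat /proc/loadavg"))
```

The point is that the connection is always opened by the polling side, so the VM cannot schedule the check itself.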
Do you have any advice about security?
About scalability: it seems better to let the VMs schedule the checks themselves, but in that case we can't encrypt the communication (we can't use SSH checks). So do you have any advice for dealing with both security and scalability? Is it possible to use proxies with SSH active checks?
About HA, see https://www.zabbix.com/forum/showthread.php?t=39058. I'm also looking for advice on how to avoid a SPOF with proxies.
I would be really glad if you could share your experience. I hope I was clear enough, as English is not my native language.
Thanks in advance for your help.
Regards,
Jérémy