Hi,
We've deployed Zabbix to monitor our ~400 servers (mainly Solaris, Linux and Windows hosts, many of them Oracle servers).
We are pretty satisfied with it and plan to use it for many more things (not only basic OS/middleware checks but also many application checks).
The configuration currently uses the Zabbix 3.4 appliance with a little tuning (we plan to install version 4 or later on RHEL 7 soon). The agents are 3.4, except on Solaris, which runs 3.2.
The agent configurations all have Timeout set to 30.
At the moment we are only using passive checks (as far as I know).
We regularly have trouble with our SAN and some databases (slow hosts, slow queries, etc.), which generates a lot of annoying "host unreachable" alerts. Zabbix is also regularly missing data for a lot of hosts (we see large holes of 10 to 15 minutes in the graphs).
I was wondering whether switching to active checks would work better than passive checks?
If the agents themselves are sending the data, I guess we might at least get rid of the long timeouts and that type of "host unreachable" alert?
Also, is it good practice in general for that number of hosts (about 400)?
How will the Zabbix server react, and how should we tune it to receive all of this data from the agents? What happens if an agent tries to send data to the Zabbix server but the server cannot accept it because it does not have enough "receivers" (or whatever you call them)?
Thank you.
Regards