I would like to check which parameter is the reliable one to use for monitoring a server's availability. Is it "Host status" or "Agent Ping"?
Reliable parameter to monitor server availability
-
I'd say it depends on your environment and requirements.
I'm not voting, because I believe this might differ depending on these factors. One would have to understand what both do and decide based on that.
-
This is a quote from Alexei regarding the use of Host Status:
The "Host status" is just a special item, which DOES NOT represent host status, it provide us with status of a passive agent (ZABBIX and SNMP). This is really important to understand!
Because of this, use of the status for a host availability trigger is not a good idea. Consider using of a combination on function nodata() for a reliable item with a TCP or ICMP ping, if possible. This is much more efficient and bullet proof method with "built-in" flap detection adjusted by nodata's period.
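To make the quoted advice a little more concrete, here is a rough Python sketch of the logic, not Zabbix's own code. It assumes nodata(period) returns 1 when no value arrived within the last `period` seconds, and it reads "combination" as: the agent item has gone silent AND the last ping failed. The names `host_unavailable`, `agent_timestamps` and `last_ping` are made up for illustration.

```python
from typing import Sequence


def nodata(timestamps: Sequence[float], now: float, period: float) -> int:
    """Return 1 if no sample arrived in the window (now - period, now], else 0."""
    return 0 if any(now - period < t <= now for t in timestamps) else 1


def host_unavailable(agent_timestamps: Sequence[float], last_ping: int,
                     now: float, period: float = 300.0) -> bool:
    """Hypothetical combined check: agent item silent AND last ping failed.

    last_ping follows the icmpping convention assumed here: 1 = reply, 0 = no reply.
    """
    return nodata(agent_timestamps, now, period) == 1 and last_ping == 0


# Agent last reported at t=100, ping fails, checked at t=500 with a 300 s period.
print(host_unavailable([100.0], last_ping=0, now=500.0))
# Agent reported at t=450, so a single failed ping at t=500 does not fire.
print(host_unavailable([450.0], last_ping=0, now=500.0))
```

The period plays the role of the flap filter: a short outage that ends before the period runs out never satisfies the nodata half of the condition.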
-
How?
This sounds good, but how are you doing this?
Something like:
{template_sm:icmppingsec.nodata(0)}=0 ?
-
I use:
Code:
({Template_Windows:icmppingsec.avg(360)}>2)|({Template_Windows:icmppingsec.avg(360)}=0)
Basically, it will tell me if the ping replies average higher than 2 seconds over 6 minutes (slow server response) OR if it replies with 0s for 6 minutes (you will always get at least 1ms from a live host, and Zabbix reports "0" instead of "nodata", go figure!). It does have a little room for error, e.g. when a network interface is going up and down, but it seems to work for me. You can play with the formula a bit to fix that if you want.
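For anyone who wants to see what that expression does with actual numbers, here is a small Python sketch of the same logic. It is only an illustration: `ping_trigger` is a made-up name, the samples stand for icmppingsec values in seconds collected over the 360 second window, and returning False for an empty window is an assumption of the sketch, not documented Zabbix behaviour.

```python
from typing import Sequence


def ping_trigger(samples: Sequence[float], slow_threshold: float = 2.0) -> bool:
    """Mimic (avg(360) > 2) | (avg(360) = 0) over one window of samples."""
    if not samples:
        # Assumption for this sketch only: an empty window does not fire.
        return False
    avg = sum(samples) / len(samples)
    return avg > slow_threshold or avg == 0


print(ping_trigger([0, 0, 0, 0, 0, 0]))           # every ping failed in the window
print(ping_trigger([2.5, 3.0, 2.8]))              # consistently slow replies
print(ping_trigger([0.01, 0.02, 0.015]))          # healthy host
print(ping_trigger([0, 0.01, 0, 0.01, 0, 0.01]))  # flapping interface
```

The last case is the "room for error" mentioned above: with a flapping interface the average is small but not zero, so neither half of the expression is true.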
-
ICMPPING vs Agent.Ping
Everyone seems to be using ICMPPING, what happened to the Agent Ping?
I would assume that the agent ping uses fewer resources on the Zabbix server, as it is a built-in ping (is it?), whereas ICMPPING is an external command.
How many ICMPPING checks can I realistically do on a 2 CPU server? I have ~1000 devices to monitor and I was hoping for a 1 minute interval. The reason is that rebooting a VM session often doesn't take more than 1 or 2 minutes, so my current 5 minute interval often doesn't pick up the restart.
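As a back-of-the-envelope figure for the numbers in this post, the sketch below works out the average check rate the server would have to sustain. It assumes one ping check per device per interval, spread evenly; `required_checks_per_second` is just an illustrative helper, and the result says nothing about what a particular 2 CPU box can actually handle.

```python
def required_checks_per_second(devices: int, interval_seconds: float) -> float:
    """Average number of checks per second for one check per device per interval."""
    return devices / interval_seconds


# ~1000 devices at the hoped-for 1 minute interval vs. the current 5 minutes.
print(round(required_checks_per_second(1000, 60), 1))
print(round(required_checks_per_second(1000, 300), 1))
```

So going from 5 minutes to 1 minute means roughly five times the ping load, from about 3 to about 17 checks per second on average.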
-
Well, you could really use anything to monitor the "status" of a server. Any value returned from a Zabbix host through the agent would suggest a "live" server. The real problem comes when you have varying levels of failure, e.g. the server is up and responding to pings, but a service has died. A ping simply checks whether the server is reachable on the network. Any real measure of the "availability" of a "server" has to monitor the particular service that server provides, e.g. ftp, web, terminal, smb, mail, dns, etc. So ultimately, it comes down to what YOU find most important to monitor, and then you base your items and ultimately your triggers on that.
PS: There is an option in the zabbix_server.conf file called "StartPingers". Your realistic limit for it depends on your network and server specs. You'll just have to experiment.
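To illustrate the difference between "answers pings" and "the service is up", here is a minimal Python sketch of a TCP-level service check. This is not how Zabbix implements its service checks; `tcp_service_up` is a hypothetical helper, and a successful connect only shows that something is listening on the port, not that the application behind it is healthy.

```python
import socket


def tcp_service_up(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Example: is anything listening on the local web port?
    print(tcp_service_up("127.0.0.1", 80, timeout=1.0))
```

A host can pass an ICMP check and still fail this one, which is exactly the "server is up but the service has died" case described above.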
-
Picking up restarts reliably shouldn't be done by *ping checks anyway. See the default templates for 'just restarted' check examples.
