This article assumes that you are using a single *nix Zabbix server to monitor distributed DNS and/or NTP services running on, or accessible from, your network. Since I have not figured out a way to run external scripts tied to Items and Triggers from the server, I use zabbix_agentd to run these external checks. The scripts that execute the checks are PHP scripts built around the host (for DNS) and ntpq (for NTP) commands.
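You may place these scripts wherever it is easiest for you. I placed them in /usr/local/www/data/zabbix/scripts, as dnschk.php and ntpchk.php (the names used in the agent configuration further down). The DNS check comes first.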
#!/usr/local/bin/php
<?php
// Prints 1 if every lookup succeeds and 0 otherwise.
// The triggers later in this article fire on 0.
if (empty($_SERVER["argv"][1])) {
    echo "You need to supply a DNS server to check. Quitting.\n";
    exit(1);
}
$ns_server = $_SERVER["argv"][1];
$hosts = array("helpdesk", "ns1.nmsu.edu");
$failures = 0;

// Do query
foreach ($hosts as $host) {
    $cmd = "host " . escapeshellarg($host) . " " . escapeshellarg($ns_server)
         . " | grep -c 'has address'";
    if ((int) shell_exec($cmd) == 0) {
        $failures++; // no address came back: this lookup failed
    }
}

echo ($failures > 0) ? 0 : 1;
?>
The same check in bash. This version runs a single lookup and takes the name to query as an optional second argument.
#!/bin/bash
timeout=2
host="/usr/bin/host"

if test -z "$1" ; then
    echo "You need to supply a DNS server to check. Quitting"
    exit 1
fi
SERVER=$1

# Optional second argument: the name to look up
if test -n "$2" ; then
    Q=$2
else
    Q="yandex.ru"
fi

# Print 1 if the lookup succeeds, 0 otherwise
if "$host" -s -W "$timeout" "$Q" "$SERVER" > /dev/null 2>&1 ; then
    echo 1
else
    echo 0
fi
(Two or more lookups may be used to test for various DNS lookup scenarios, e.g. referrals, reverse lookups.)
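The multi-lookup idea can be sketched in shell as well. The function below is only an illustration, not part of the original scripts: dns_check and the HOST_CMD variable are made-up names, and it assumes that host exits non-zero when a lookup fails, which is also what the bash script above relies on. Giving host an IP address makes it perform a reverse lookup, so forward and reverse tests can share one call.

```shell
# Sketch only: HOST_CMD and dns_check are illustrative names.
HOST_CMD="${HOST_CMD:-/usr/bin/host}"

# Usage: dns_check SERVER NAME [NAME...]
# Prints 1 if every NAME resolves through SERVER, otherwise 0.
dns_check() (
    server=$1
    shift
    for name in "$@"; do
        # -W 2: wait at most two seconds for each reply
        if ! "$HOST_CMD" -W 2 "$name" "$server" > /dev/null 2>&1; then
            echo 0
            exit 0
        fi
    done
    echo 1
)
```

For example, dns_check 192.168.1.10 ns1.nmsu.edu 192.168.1.10 would test one forward and one reverse lookup against the same server.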
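The NTP check works the same way. It asks the NTP server for its peer list and looks for a line starting with *, which ntpq uses to mark the peer the server is synchronized to:
#!/usr/local/bin/php
<?php
if (empty($_SERVER["argv"][1])) {
    echo "You need to supply an NTP server to check. Quitting.\n";
    exit(1);
}
$ntp_server = $_SERVER["argv"][1];

// Do query: count the peer lines that begin with '*'
$cmd = "ntpq -pn " . escapeshellarg($ntp_server) . " | grep -E -c '^\*'";
if ((int) shell_exec($cmd) == 1) {
    $result = 1; // success
} else {
    $result = 0; // failure
}
echo $result;
?>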
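For anyone who would rather not depend on PHP, the NTP check can be approximated in shell. This is a sketch under the same assumption as the PHP version, namely that exactly one peer line beginning with * means the server is synchronized. ntp_check and NTPQ_CMD are hypothetical names introduced here.

```shell
# Sketch only: NTPQ_CMD and ntp_check are illustrative names.
NTPQ_CMD="${NTPQ_CMD:-ntpq}"

# Usage: ntp_check SERVER
# Prints 1 if SERVER reports exactly one selected peer, otherwise 0.
ntp_check() (
    # grep -c prints the count but exits non-zero when it is 0, hence "|| true"
    count=$("$NTPQ_CMD" -pn "$1" 2>/dev/null | grep -c '^\*' || true)
    if [ "$count" = "1" ]; then
        echo 1
    else
        echo 0
    fi
)
```

Called as ntp_check 192.168.1.68, it prints the same 1 or 0 that the agent would hand to Zabbix.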
Be sure to set your scripts to the proper owner (e.g. www) and use chmod +x to make them executable.
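A small sketch of the permission step, in case you keep several check scripts in one directory. make_checks_executable is a made-up helper name, and the directory in the comment is the one used in the agent configuration in this article. Adjust both to your setup.

```shell
# Usage: make_checks_executable DIR
# Marks every .php and .sh file directly inside DIR as executable.
make_checks_executable() (
    for f in "$1"/*.php "$1"/*.sh; do
        # Skip patterns that matched nothing
        [ -f "$f" ] || continue
        chmod +x "$f"
    done
)

# Changing the owner usually needs root, so it is left as a manual step:
# chown www /usr/local/www/data/zabbix/scripts/*.php
```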
These two lookups may take many seconds to complete and return a value. While they generally respond in less than a second for a successful query, a timeout response may take more than 6 seconds. Before we complete the setup we will extend the timeout period for both zabbix_server and zabbix_agentd so that we will always get some sort of response under normal circumstances.
With the default timeout, slow checks produced timeout errors in the /var/log/zabbix_agentd.log file. Increasing the agent timeout resolved this problem.
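A complementary idea, offered only as a sketch, is to cap how long a check may run so that it always prints a value before the agent gives up. This assumes the timeout utility (GNU coreutils, also shipped with recent FreeBSD) is installed and that it exits with status 124 when the time limit is hit. check_with_limit is a hypothetical name.

```shell
# Usage: check_with_limit SECONDS COMMAND [ARG...]
# COMMAND must be an executable on disk, not a shell function.
# Prints the command's output, or 0 if it timed out or printed nothing.
check_with_limit() (
    limit=$1
    shift
    status=0
    # timeout stops the command after $limit seconds and then exits with 124
    out=$(timeout "$limit" "$@" 2>/dev/null) || status=$?
    if [ "$status" -eq 124 ] || [ -z "$out" ]; then
        echo 0
    else
        echo "$out"
    fi
)
```

A call such as check_with_limit 10 php /usr/local/www/data/zabbix/scripts/dnschk.php 192.168.1.10 would then report a hung lookup as a failure (0) rather than as no data.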
Edit /etc/zabbix/zabbix_agentd.conf. Set Timeout=20.
Add the following DNS tests, adjusting the path to match where you put your scripts:
UserParameter=DNSbr1,php /usr/local/www/data/zabbix/scripts/dnschk.php 192.168.1.10
UserParameter=DNSbr6,php /usr/local/www/data/zabbix/scripts/dnschk.php 192.168.6.10
UserParameter=DNSbr4,php /usr/local/www/data/zabbix/scripts/dnschk.php 192.168.4.10
And for NTP:
UserParameter=NTPs1,php /usr/local/www/data/zabbix/scripts/ntpchk.php 192.168.1.68
You may have as many such tests as you want. Just keep track of the names for when you set up your Triggers in Zabbix.
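One way to keep track of the names is to pull them straight out of the agent configuration. The sketch below assumes each test sits on its own line in the UserParameter=key,command form shown above. list_userparameter_keys is a made-up helper name.

```shell
# Usage: list_userparameter_keys CONFIG_FILE
# Prints the key of every active (uncommented) UserParameter line.
list_userparameter_keys() {
    # Keep only the text between "UserParameter=" and the first comma
    sed -n 's/^[[:space:]]*UserParameter=\([^,]*\),.*$/\1/p' "$1"
}

# Example:
#   list_userparameter_keys /etc/zabbix/zabbix_agentd.conf
# With the agent running, a single key can also be queried by hand:
#   zabbix_get -s 127.0.0.1 -p 10050 -k DNSbr1
```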
In order to load your new agent configuration, use ps aux to find the PID of your zabbix_agentd: main process and kill it. Then start the agent again:
>cd /usr/local/bin
>./zabbix_agentd
(If you need to troubleshoot the agent process for any reason, you should take care to set the log path, owner and permissions to write to the /var/log/zabbix_agent.log.)
Assuming that you are about ready to set up triggers, you must now change the default timeout for zabbix_server. It was set to 3 (seconds) here, so when lookups failed I was getting nothing (instead of 0) in my triggers. Edit /etc/zabbix/zabbix_server.conf to set Timeout=20.
Kill zabbix_server (sleeping…|main…) and then use ./zabbix_server to start it again with your new values.
Time to switch to the Zabbix web interface. Log in as an administrator, then go to Configuration → Hosts and create a host. I suggest the name “ExternalTests”. I typed in a new group, “External”, selected Use IP address, and typed in localhost, with port 10050 (or your configured port for the agent).
Having created a host entry for your client, go to the Configuration → Hosts screen and click the Items link next to your new host. Click Create Item and give your test a name (e.g. DNS Branch 1). Select the type as Zabbix Agent. This is where you use the Name, or Key, that you configured in your agent for your tests. The first item in your UserParameter= statement is the “Key” used to query your agent. Type DNSbr1 (my example from above) as your key. Set the type of information to Numeric (Integer), and set your preferences for the other parameters.
Your Trigger(s) will be based on values recorded by this Item.
Go to (Configuration) Triggers and Create Trigger. Give your new Trigger a name like BR1 DNS Server and an expression like
{ExternalTests:DNSbr1.last(0)}=0
I set the Severity to Warning.
Repeat this for as many checks as you set up in the agentd configuration.
Now set up an Action to warn you in case one of your services goes down. Go to (Configuration) Actions and Create Action. Select the Action type you want (I use Send message, and the media is a cell phone SMS service). The Source must be a Trigger. I set two Conditions:
Host group = External
Trigger name like "DNS Server"
Set your other options as desired. Since I am sending an email or SMS message I set the subject to “{TRIGGER.NAME} Problem”, and the message to “{TRIGGER.NAME} may be down as of {DATE} at {TIME}”.
Once created, this action will trip when any of your monitored services returns a “0”.
You may wish to make your triggers a bit smarter or a bit less sensitive depending on your environment or the load on the servers. Because the checks return 1 on success and 0 on failure, a low sum over recent values means failures. E.g. a trigger of:
{ExternalTests:DNSbr1.sum(#3)}<2
will trip once at least two of the last three tests have failed.
{ExternalTests:DNSbr1.min(120)}=0
will trip if any test in the last 120 seconds failed. I think that last one will only trip every 120 seconds in case of an on-going failure.
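Thresholds like these are easy to get backwards, so it can help to try the arithmetic on a made-up series of results first. The sketch below assumes the convention the check scripts use (1 for success, 0 for failure), under which "at least two of the last three checks failed" means the last three values sum to less than 2. It only mimics the comparison. It is not how Zabbix evaluates triggers, and sum_last and should_trip are hypothetical names.

```shell
# Usage: sum_last N VALUE [VALUE...]   (values listed oldest first)
# Prints the sum of the last N values.
sum_last() (
    n=$1
    shift
    # Drop the oldest values until only the last N remain
    while [ "$#" -gt "$n" ]; do
        shift
    done
    total=0
    for v in "$@"; do
        total=$((total + v))
    done
    echo "$total"
)

# Usage: should_trip VALUE [VALUE...]
# Prints "trip" when fewer than two of the last three checks returned 1.
should_trip() {
    if [ "$(sum_last 3 "$@")" -lt 2 ]; then
        echo trip
    else
        echo ok
    fi
}
```

Feeding it a history such as should_trip 1 1 0 0 shows whether two consecutive failures would be enough to raise the alarm.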