Greetings,
I recently began working on a massive NTP issue on the servers that I monitor, and one of the thoughts that I had was 'How would I use Zabbix to monitor time sync?' Looking through the trigger functions, I saw the time() function, but noticed that it is fairly rigid in its timing. Due to things like network latency and, more importantly in my case, CPU utilization, it is very possible for the time on the Zabbix server and the time reported by the monitored clients to be off by as much as 60 seconds.
In my situation, I have network latency to worry about, but I also have 150 virtual machines running on 22 real machines. Unfortunately, the time sync on these 150 virtual machines is the main issue. In my case the virtual servers can be no more than 45 seconds apart. To that end, I have built a fairly large peered NTP cluster. So far this is keeping the virtual servers in sync fairly well; however, I still need the ability to monitor it.
I created the following UserParam entry:
Code:
UserParameter=custom[timecheck],/bin/date +%H%M%S
I then compare that value with the value of the time() function on the Zabbix server, and I need to alert if the difference goes beyond, say, 30 seconds. I know how to do this with two different trigger functions, but instead I would like to request a pretty simple feature: FuzzyTime.
FuzzyTime would test the returned time against the server's time with an allowed variance, say up to 30 seconds. This would allow people to monitor time sync without triggering on variations below the 30-second threshold.
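To make the request concrete, here is a rough sketch in Python of the comparison I have in mind. This is not Zabbix code; the names fuzzy_time, clock_offset and hhmmss_to_seconds are made up for illustration, and it assumes both values arrive in the HHMMSS format that date +%H%M%S produces.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def hhmmss_to_seconds(value: str) -> int:
    """Convert an HHMMSS string (e.g. '235950') to seconds since midnight."""
    # Pad with leading zeros in case the value was stored as a number
    # and lost them (e.g. '5' instead of '000005').
    value = value.strip().zfill(6)
    hours, minutes, seconds = int(value[0:2]), int(value[2:4]), int(value[4:6])
    return hours * 3600 + minutes * 60 + seconds


def clock_offset(client: str, server: str) -> int:
    """Absolute difference in seconds between two HHMMSS readings."""
    diff = abs(hhmmss_to_seconds(client) - hhmmss_to_seconds(server))
    # Readings taken on either side of midnight are close together,
    # so use the shorter way around the 24-hour clock.
    return min(diff, SECONDS_PER_DAY - diff)


def fuzzy_time(client: str, server: str, tolerance: int = 30) -> bool:
    """Return True when the two clocks agree to within `tolerance` seconds."""
    return clock_offset(client, server) <= tolerance


if __name__ == "__main__":
    print(fuzzy_time("120010", "120000"))  # 10 seconds apart
    print(fuzzy_time("120100", "120000"))  # 60 seconds apart
    print(fuzzy_time("235950", "000010"))  # 20 seconds apart, across midnight
```

The conversion step matters because the raw HHMMSS values cannot simply be subtracted: 120000 and 115959 are one second apart, but their numeric difference is 4041.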