Hi there,
I’ve got a pair of Zabbix servers in a Master/Child node configuration, monitoring 11 host servers. The hosts have been monitored for several months, and I’ve had SLAs set up since the back end of last year. The problem is that a few weeks ago first one and then another of the servers started to report 100% problem time, even though there wasn’t a problem and the Status showed as OK.
I’ve tried everything short of deleting the SLA and starting over. We want to use this information for business planning, and I can’t see deleting the SLA fixing the problem in the long term, as it may recur.
Here’s a picture showing a Status of OK and no reported problems, but 100% problem time:

Here’s a picture of where 100% problem time started: