Hi guys,
I have a medium-sized Zabbix setup. We are an IT company with a few hundred managed servers. We'd like to start implementing web checks on these servers, but we're seeing a lot of flapping, with the web checks reporting false positives.
Graphing of the web check is fine, but the actual content check is firing a trigger. We are looking for a string, with a timeout of 30 seconds, over both HTTP and HTTPS.
The trigger is:
{hostname:web.test.fail[check-name].avg(#3,180)}=0
The check runs every 60 seconds, so we hoped the above would help, but the flapping is still occurring.
Here's our original trigger:
{hostname:web.test.fail[check-name].last(#3)}=1
According to the trigger stats, the content check is hitting a timeout of 3000ms, which is what causes the trigger to alert us.
Curling the sites works absolutely fine, and I am not seeing any drops when curling every 10 seconds. Response time and ping on the sites are perfectly fine, but as soon as we enable the content check, things go bad.
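To make the manual test more comparable with the content check, each request can be timed and searched for the same string. Below is a minimal sketch in Python, not a reproduction of how Zabbix runs its web scenario steps. The names `probe`, `fetch` and `contains_required` are hypothetical helpers, and the URL and string in the usage note are placeholders.

```python
import time
import urllib.request


def contains_required(body: str, required: str) -> bool:
    """Content check: pass only if the required string appears in the body."""
    return required in body


def fetch(url: str, timeout: float) -> str:
    """Fetch a URL and return the decoded body.

    Raises on timeout, connection failure, or an HTTP error status.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def probe(url, required, timeout=30.0, fetcher=fetch):
    """Run one check and return (ok, elapsed_seconds, error_or_None).

    `fetcher` is injectable so the logic can be exercised without a network.
    """
    start = time.monotonic()
    try:
        body = fetcher(url, timeout)
    except Exception as exc:  # timeouts, DNS failures, HTTP/TLS errors
        return False, time.monotonic() - start, repr(exc)
    elapsed = time.monotonic() - start
    if not contains_required(body, required):
        return False, elapsed, "required string not found"
    return True, elapsed, None
```

Calling something like `probe("https://example.com/", "Example Domain")` in a loop every 10 seconds and logging `elapsed` should show whether any single request gets close to the timeout. It should also show whether a failure is a timeout or a missing string.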
We're seeing a constant stream of emails: resolved, problem and so on.
Is anyone able to offer advice on what to check, how to improve this, etc.?
I'm not seeing anything in the Zabbix server log at debug level 5, except for the content check timeout.
Thanks in advance