Hello all...
We've had Zabbix running a number of days now, and have moved from test box to production, and we are VERY impressed with it. So respect to all the developers that have contributed to the project. Your work is appreciated. (Credit where credit is due)
The setup is roughly like this. We have created a template that is all simple checks for the likes of plesk, cpanel, http, https, pop3, smtp, ftp, etc, etc. The triggers notify us if any of those services goes down on any of the hosts.
Now, the problem is mainly with smtp and pop. We were using a trigger like this:
{Linux_Simple:pop.prev(0)}#1
(NB: the item was a 64-bit numeric integer.)
So many people are checking their mail, and so much spam is coming in via smtp, that Zabbix times out the connection and gives a non-1 value, resulting in an alert... so I tried this:
{Linux_Simple:pop_perf.prev(0)}>7
(NB: the item was a numeric float.)
This did help... it says that if the host has not responded after 7 seconds, trigger an alert. But sometimes it still triggered.
Ideally, we need to say this:
"Is the previous value > 7 and the current value > 7 ?? If so, trigger an alert"
This way we can base the alert on the host having timed out after 7 seconds twice in a row.
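To make the logic we are after concrete, here is a minimal Python sketch. It is not Zabbix trigger syntax, only an illustration of the condition; the function name `should_alert` and the `threshold` parameter are made up for this example.

```python
def should_alert(response_times, threshold=7.0):
    """Return True only if the two most recent values both exceed threshold.

    response_times is a list of check durations in seconds, oldest first.
    (Hypothetical helper for illustration; not part of Zabbix.)
    """
    if len(response_times) < 2:
        # Not enough history yet to compare previous and current values.
        return False
    previous, current = response_times[-2], response_times[-1]
    return previous > threshold and current > threshold


# A single slow check followed by a fast one should not alert.
print(should_alert([0.4, 9.2, 0.5]))  # False
# Two slow checks in a row should alert.
print(should_alert([0.4, 9.2, 8.1]))  # True
```

So one slow response on a busy mail server would be ignored, and only two slow responses in a row would raise the alert.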
What we would like to know is: what is the default timeout of a simple check, and is it editable? In other words, if we just use pop and not pop_perf, how does Zabbix decide that the host's service has not responded?
OK, that's it... I await your answers.
Regards to you all...
Entorno Digital España