Hello,
I installed Zabbix 1.1beta5 yesterday after using Zabbix 1.0 for a long time. I configured it to check two Windows hosts and some network connections via simple checks (icmpping and icmppingsec). This is working fine.
I also added triggers named "Ping to server {HOSTNAME} is down" with the expression {HOST:icmpping.avg(60)}=0, set Severity to Average and Status to Enabled (using a template for the 10 network connections I want to check, so all are the same).
Zabbix generates events when a network connection is down, so the trigger is working. But I want to be informed by mail whenever a connection is down. So I added a new action with one condition Trigger = Host: Ping to server {HOSTNAME} is down and a second condition Trigger value = "ON".
I set my mail address and the mail server in Zabbix and have already tested the mail transfer successfully via telnet. However, Monitoring => Actions shows no alerts.
Can anybody tell me where I made a mistake?
Another thing:
I'm running Zabbix on an FC4 system (PIV without HT) with all updates installed. configure and make install worked fine. The zabbix_server process ran fine for 10 hours but crashed this morning:
001994:20060111:073910 Timeout while connecting to [ws-2]
001994:20060111:073910 Host [ws-2] will be checked after [60] seconds
001994:20060111:074013 No route to host [ws-1]
001994:20060111:074013 Host [ws-1] will be checked after [60] seconds
001994:20060111:074018 Timeout while connecting to [ws-2]
001994:20060111:074018 Host [ws-2] will be checked after [60] seconds
001994:20060111:074121 No route to host [ws-1]
001994:20060111:074121 Host [ws-1] will be checked after [60] seconds
001994:20060111:074126 Timeout while connecting to [ws-2]
001994:20060111:074126 Host [ws-2] will be checked after [60] seconds
001999:20060111:074230 Got QUIT or INT or TERM or PIPE signal. Exiting...
001999:20060111:074230 Got QUIT or INT or TERM or PIPE signal. Exiting...
001988:20060111:074230 Got QUIT or INT or TERM or PIPE signal. Exiting...
001989:20060111:074230 Got QUIT or INT or TERM or PIPE signal. Exiting...
001990:20060111:074230 Got QUIT or INT or TERM or PIPE signal. Exiting...
001993:20060111:074230 Got QUIT or INT or TERM or PIPE signal. Exiting...
001994:20060111:074230 Got QUIT or INT or TERM or PIPE signal. Exiting...
001967:20060111:074230 One child process died. Exiting ...
001967:20060111:074230 Cannot remove PID file [/var/tmp/zabbix_server.pid] [No such file or directory]
001995:20060111:074230 Got QUIT or INT or TERM or PIPE signal. Exiting...
001995:20060111:074230 Cannot remove PID file [/var/tmp/zabbix_server.pid] [No such file or directory]
001996:20060111:074230 Got QUIT or INT or TERM or PIPE signal. Exiting...
001996:20060111:074230 Cannot remove PID file [/var/tmp/zabbix_server.pid] [No such file or directory]
001997:20060111:074230 Got QUIT or INT or TERM or PIPE signal. Exiting...
001997:20060111:074230 Cannot remove PID file [/var/tmp/zabbix_server.pid] [No such file or directory]
001998:20060111:074230 Got QUIT or INT or TERM or PIPE signal. Exiting...
001998:20060111:074230 Cannot remove PID file [/var/tmp/zabbix_server.pid] [No such file or directory]
002731:20060111:083022 Starting zabbix_server. ZABBIX 1.1beta5.
Before the crash there were some timeouts when checking the two Windows hosts because they were shut down. I don't know whether that caused the problem.
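To see which process went away first, it may help to group the log by PID. This is only a minimal sketch, assuming the log keeps the "PID:YYYYMMDD:HHMMSS message" layout shown above; the function name summarize_by_pid is made up for illustration.

```shell
#!/bin/bash
# Print one line per PID: the PID, how many lines it logged,
# and the last message it wrote.
# Assumes each log line looks like "PID:YYYYMMDD:HHMMSS message".
summarize_by_pid() {
    awk '
        /^[0-9]+:[0-9]+:[0-9]+ / {
            split($1, head, ":")      # head[1] = PID, head[2] = date, head[3] = time
            pid = head[1] ""          # force a string so leading zeros are kept
            count[pid]++
            msg = $0
            sub(/^[^ ]+ /, "", msg)   # drop the "PID:date:time " prefix
            last[pid] = msg
        }
        END {
            for (pid in count)
                printf "%s %d %s\n", pid, count[pid], last[pid]
        }
    ' "$@" | sort
}
```

It reads the files given as arguments, or standard input if none are given, for example summarize_by_pid /tmp/zabbix_server.log (the path is an assumption; use whatever LogFile is set to in zabbix_server.conf). A PID whose last message is not the "Got QUIT or INT or TERM or PIPE signal" line would be a candidate for the child that died.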
I created a script for restarting zabbix_server whenever it dies and added it to crontab:
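#!/bin/bash
# Restart zabbix_server if no matching process is running.
# pgrep never matches itself, so there is no need to allow for the grep process.
if ! pgrep -f /usr/local/bin/zabbix_server > /dev/null; then
    # Remove a stale PID file; -f avoids an error if it is already gone.
    rm -f /var/tmp/zabbix_server.pid
    /etc/init.d/zabbixd start
fi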
I will keep an eye on the server to see whether it dies several times a day or whether this was a one-time crash.
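One way to keep an eye on it is to count the "Starting zabbix_server" lines per day in the server log. This is a rough sketch, assuming the start-up line keeps the format shown in the log above; the function name count_starts_per_day is made up for illustration.

```shell
#!/bin/bash
# Print "YYYYMMDD count" for every day on which zabbix_server was started.
# Assumes start-up lines look like
# "PID:YYYYMMDD:HHMMSS Starting zabbix_server. ...".
count_starts_per_day() {
    awk '
        /^[0-9]+:[0-9]+:[0-9]+ Starting zabbix_server/ {
            split($1, head, ":")      # head[2] is the date part
            starts[head[2] ""]++
        }
        END {
            for (day in starts)
                printf "%s %d\n", day, starts[day]
        }
    ' "$@" | sort
}
```

The count includes manual starts as well as restarts by the cron script, so it only gives an upper bound on the number of crashes per day.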
Thank you