Hiya, I hope someone can shed some light on this mysterious Zabbix behaviour.
Our Zabbix server is killing itself every now and again for no good reason. This of course makes the application useless as an uptime monitoring tool. If I can't trust Zabbix to be running when I am sleeping, then I can't trust it to tell me when my servers go down when I'm not looking. I turned the logging up to debug level and this is what I got:
031322:20060821:174250 Query::select i.itemid,i.key_,h.host,h.port,i.delay,... etc
031322:20060821:174250 Query failed:Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2) [2002]
031312:20060821:174250 One server process died. Shutting down...
031312:20060821:174250 0. Killing PID=[31316]
031312:20060821:174250 1. Killing PID=[31317]
Firstly, why does it have to shut down over one failed query? After it has been running fine for a few days, one or two failed queries should not mean that it has to die.
Secondly, and this might be due to my own ignorance, I am not using the "# Connect to MySQL using Unix socket?" option in the zabbix_server config, so why is it complaining about a socket connection failure?
What measures do people take to prevent this from happening? I noticed an earlier thread discussing crons and greps to check whether the system is running, but monitoring the monitoring tool sounds a tad silly to me. Is there any way to make sure Zabbix stays up, even when it keeps shutting itself down?
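For reference, the cron-and-grep idea from that earlier thread could look roughly like the sketch below. It is only an illustration, not anything Zabbix ships: the function names are made up, and the start command path /etc/init.d/zabbix_server is an assumption that will differ between installs. It lists running processes with ps, looks for zabbix_server, and runs the start command if it is missing.

```python
import subprocess


def parse_process_names(ps_output):
    """Return the set of command names from `ps -e -o comm=` output."""
    return {line.strip() for line in ps_output.splitlines() if line.strip()}


def is_running(name, ps_output):
    """True if a process with exactly this command name appears in the ps output."""
    return name in parse_process_names(ps_output)


def check_and_restart(
    name="zabbix_server",
    # Assumption: path of the start script; adjust to your install.
    start_cmd=("/etc/init.d/zabbix_server", "start"),
    run=subprocess.run,
):
    """Start the server if it is not running. Returns True if a start was attempted."""
    ps = run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    if is_running(name, ps):
        return False
    # check=False: a failed start should not crash the watchdog itself.
    run(list(start_cmd), check=False)
    return True
```

Calling check_and_restart() from a small script run by cron every few minutes would bring the server back after a shutdown like the one in the log, though it does nothing about the underlying MySQL connection failure.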