Over the weekend, I got an alert from my Zabbix server that the database was down. When I looked at the server, MySQL was still running fine, but I found this in my zabbix_server.log file:
6952:20090411:073148 [Z3001] Connection to database 'zabbix' failed: [2013] Lost connection to MySQL server at 'sending password information', system error: 32
6801:20090411:073142 [Z3001] Connection to database 'zabbix' failed: [2013] Lost connection to MySQL server at 'sending authentication information', system error: 32
6801:20090411:073227 Watchdog: Database is down
6952:20090411:073511 [Z3001] Connection to database 'zabbix' failed: [2013] Lost connection to MySQL server at 'sending authentication information', system error: 32
6942:20090411:073732 Executing housekeeper
6952:20090411:074430 [Z3001] Connection to database 'zabbix' failed: [2013] Lost connection to MySQL server at 'reading final connect information', system error: 104
6801:20090411:074712 [Z3001] Connection to database 'zabbix' failed: [2013] Lost connection to MySQL server at 'sending authentication information', system error: 32
6801:20090411:075119 Watchdog: Database is down
6952:20090411:075914 [Z3001] Connection to database 'zabbix' failed: [2013] Lost connection to MySQL server at 'reading authorization packet', system error: 104
6801:20090411:080733 One child process died. Exiting ...
6801:20090411:080743 ZABBIX Server stopped. ZABBIX 1.6.2.
The last entry in the log before this was that one of the parameters looking at a router didn't work, but I've got a LOT of these in the log:
6887:20090411:072610 Expression [{12944}>150000] for item [24301][xxxx xxx router:ifInOctets1] cannot be evaluated: unable to get function value: lastvalue IS NULL for function [12944][xxxx xxxx router:ifInOctets1.delta(0)]
I believe these are due to applying templates to a router that isn't using that particular interface (I put in X's instead of the router name), so I'm going through the log and disabling those tests in Zabbix so it will stop complaining about them. However, I don't have a clue whether this is related to the Zabbix server shutting down.
When I came in this morning, MySQL was running, so I started zabbix_server and everything seems to be fine now. So the database wasn't down; according to the log, maybe Zabbix lost contact with the database and then shut down. I'm not sure what would cause the authentication problem with the database. I didn't change anything when I restarted it, and everything started just fine.
I'm only using SNMP and ping, no agents anywhere.
Any ideas why this would happen? I hate to resort to it, but do I need a cron job to restart Zabbix periodically?
Thanks,
Kerry
PS: Running Zabbix 1.6.2 on CentOS 5.2 on a dual Xeon with 4 GB of RAM.
Required server performance, new values per second 17.1938
load average: 0.64, 0.61, 0.51
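
In case it helps with reading the log above: on Linux, system error 32 is EPIPE (broken pipe) and 104 is ECONNRESET (connection reset by peer). Below is a rough Python sketch for tallying the [Z3001] lines by connection stage and system error, which might show whether the failures cluster in one place. The regular expression, the `tally` function, and the `ERRNO_NAMES` table are my own illustration and not part of Zabbix; the log format is assumed to match the lines quoted above.

```python
import re
from collections import Counter

# Linux errno values that appear in the log (assumes the server runs on Linux).
ERRNO_NAMES = {
    32: "EPIPE (broken pipe)",
    104: "ECONNRESET (connection reset by peer)",
}

# Matches lines like:
# 6952:20090411:073148 [Z3001] Connection to database 'zabbix' failed: [2013]
#   Lost connection to MySQL server at '<stage>', system error: <n>
Z3001 = re.compile(
    r"^\d+:\d{8}:\d{6} \[Z3001\].*"
    r"at '(?P<stage>[^']+)', system error: (?P<errno>\d+)"
)


def tally(lines):
    """Count [Z3001] failures per (connection stage, system error) pair."""
    counts = Counter()
    for line in lines:
        match = Z3001.match(line)
        if match:
            counts[(match.group("stage"), int(match.group("errno")))] += 1
    return counts


if __name__ == "__main__":
    # Path is an assumption; point it at wherever zabbix_server.log lives.
    with open("zabbix_server.log") as log:
        for (stage, err), n in tally(log).most_common():
            name = ERRNO_NAMES.get(err, "unknown")
            print(f"{n:4d}  {stage}  system error {err}: {name}")
```

Lines that are not [Z3001] entries (the Watchdog and housekeeper messages, for example) are simply skipped.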