mysql goes crazy


donavan
08-10-2004, 08:47
About once every 30 hours, mysql goes crazy on the box running mysql and all 3 zabbix tools (the server itself is monitored).

This mysql is only used by zabbix. A 'service mysqld restart' clears the problem and things return to normal.

There doesn't seem to be a pattern, or anything else to really go on.

This is a CentOS 3.3 box (a Red Hat Enterprise Linux 3 Update 3 clone) that does little except run zabbix. It's a dual-processor Athlon MP box with 1 GB of memory and 1.5 GB of swap.

I've debugged this with the CentOS people in the past and looked at output from numerous tools. Nothing points to a problem.

Any suggestions?

For example (I caught this one at the start):

[root@cumulus home]# w
00:15:06 up 13 days, 1:15, 3 users, load average: 1.86, 2.12, 2.01

[root@cumulus home]# service mysqld restart
Stopping MySQL: [ OK ]
Starting MySQL: [ OK ]

[root@cumulus home]# w
00:25:46 up 13 days, 1:26, 3 users, load average: 0.14, 1.17, 1.65
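
Since the problem only shows up about once every 30 hours, one option is to check the load automatically and record diagnostics while it is happening. The following is a minimal sketch, not something from the thread: the function names and the threshold of 2.0 are assumptions chosen for illustration. It reads the first line printed by `w` or `uptime` and decides whether the 1-minute load is high enough to be worth capturing.

```python
import re

# Matches the "load average: 1.86, 2.12, 2.01" tail printed by w and uptime.
LOAD_RE = re.compile(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)")


def parse_load(line):
    """Return the (1 min, 5 min, 15 min) load averages, or None if absent."""
    match = LOAD_RE.search(line)
    if match is None:
        return None
    return tuple(float(value) for value in match.groups())


def should_capture(line, threshold=2.0):
    """True when the 1-minute load is at or above the (assumed) threshold."""
    load = parse_load(line)
    return load is not None and load[0] >= threshold
```

A small wrapper run from cron could feed the output of `uptime` to `should_capture` and, when it returns true, save the output of `ps -ef` and `top -b -n 1` to a timestamped file for later comparison.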

Alexei
08-10-2004, 09:16
Why do you think the MySQL server goes crazy? Try to run mysqladmin processlist to see what exactly MySQL is doing.
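
To make the processlist output easier to scan, here is a rough sketch that parses the ASCII table printed by `mysqladmin processlist` and keeps only the threads that are not idle. The function names are hypothetical, and the parser assumes that no query text in the Info column contains a `|` character.

```python
def parse_processlist(text):
    """Parse the table printed by `mysqladmin processlist` into a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the +-----+ border lines.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells  # first table row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def busy_threads(rows):
    """Return the rows whose Command is anything other than Sleep."""
    return [row for row in rows if row.get("Command") != "Sleep"]
```

Note that the `show processlist` query issued by mysqladmin itself will always appear as a busy thread, so a list with only that entry means nothing else was running at that moment.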

donavan
08-10-2004, 20:05
Because restarting mysql fixes the problem? I'll try the mysqladmin command next time it occurs.

donavan
15-10-2004, 07:25
Here it is. I don't know mysql well enough to know what these mean.

[root@cumulus root]# mysqladmin processlist
+-------+--------+-----------+--------+---------+------+-------+------------------+
| Id    | User   | Host      | db     | Command | Time | State | Info             |
+-------+--------+-----------+--------+---------+------+-------+------------------+
| 12    | zabbix | localhost | zabbix | Sleep   | 1    |       |                  |
| 34343 | zabbix | localhost | zabbix | Sleep   | 191  |       |                  |
| 34364 | zabbix | localhost | zabbix | Sleep   | 182  |       |                  |
| 34539 | zabbix | localhost | zabbix | Sleep   | 174  |       |                  |
| 35033 | zabbix | localhost | zabbix | Sleep   | 183  |       |                  |
| 45582 | zabbix | localhost | zabbix | Sleep   | 191  |       |                  |
| 45636 | zabbix | localhost | zabbix | Sleep   | 190  |       |                  |
| 45638 | zabbix | localhost | zabbix | Sleep   | 176  |       |                  |
| 45684 | root   | localhost |        | Query   | 0    |       | show processlist |
+-------+--------+-----------+--------+---------+------+-------+------------------+


[root@cumulus root]# ps -ef |grep zabbix
zabbix 1102 1 0 Sep24 ? 00:18:48 /home/zabbix/bin/zabbix_agentd
zabbix 1104 1102 0 Sep24 ? 00:07:51 /home/zabbix/bin/zabbix_agentd
zabbix 1105 1102 0 Sep24 ? 00:07:50 /home/zabbix/bin/zabbix_agentd
zabbix 1106 1102 0 Sep24 ? 00:07:48 /home/zabbix/bin/zabbix_agentd
zabbix 1107 1102 0 Sep24 ? 00:07:51 /home/zabbix/bin/zabbix_agentd
zabbix 1108 1102 0 Sep24 ? 00:07:49 /home/zabbix/bin/zabbix_agentd
zabbix 1238 1 0 Sep24 ? 00:00:00 /home/zabbix/bin/zabbix_trapperd
zabbix 1240 1238 0 Sep24 ? 00:00:00 /home/zabbix/bin/zabbix_trapperd
zabbix 1241 1238 0 Sep24 ? 00:00:00 /home/zabbix/bin/zabbix_trapperd
zabbix 1242 1238 0 Sep24 ? 00:00:00 /home/zabbix/bin/zabbix_trapperd
zabbix 1243 1238 0 Sep24 ? 00:00:00 /home/zabbix/bin/zabbix_trapperd
zabbix 1244 1238 0 Sep24 ? 00:00:00 /home/zabbix/bin/zabbix_trapperd
zabbix 16904 1 0 Oct08 ? 00:00:11 /home/zabbix/bin/zabbix_suckerd
zabbix 16910 16904 0 Oct08 ? 00:00:04 /home/zabbix/bin/zabbix_suckerd
zabbix 16911 16904 0 Oct08 ? 00:00:04 /home/zabbix/bin/zabbix_suckerd
zabbix 16912 16904 0 Oct08 ? 00:00:07 /home/zabbix/bin/zabbix_suckerd
zabbix 16913 16904 0 Oct08 ? 01:07:12 /home/zabbix/bin/zabbix_suckerd

[root@cumulus root]# ps -ef |grep mysql
root 16481 1 0 Oct08 ? 00:00:00 /bin/sh /usr/bin/safe_mysqld --defaults-file=/etc/my.cnf
mysql 16506 16481 2 Oct08 ? 04:26:34 /usr/libexec/mysqld --defaults-file=/etc/my.cnf --basedir=/usr --datadir=/var/lib/mysql/data --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-locking

[root@cumulus root]# uptime
23:00:31 up 20 days, 0 min, 1 user, load average: 2.70, 3.13, 2.96
[root@cumulus root]# uptime
23:00:33 up 20 days, 0 min, 1 user, load average: 2.88, 3.16, 2.97
[root@cumulus root]# uptime
23:00:39 up 20 days, 0 min, 1 user, load average: 3.05, 3.19, 2.98


This is top, followed by u (user mysql) and then H:

23:03:04 up 20 days, 3 min, 1 user, load average: 3.91, 3.44, 3.10
100 processes: 99 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 2.6% 0.0% 2.4% 0.0% 0.4% 13.4% 180.2%
cpu00 2.5% 0.0% 1.3% 0.1% 0.5% 7.1% 88.0%
cpu01 0.1% 0.0% 1.1% 0.0% 0.0% 6.3% 92.2%
Mem: 1026712k av, 1009032k used, 17680k free, 0k shrd, 385272k buff
776004k actv, 132044k in_d, 19708k in_c
Swap: 1445808k av, 75080k used, 1370728k free 500256k cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
16915 mysql 15 0 11528 11M 1536 S 4.1 1.0 240:16 0 mysqld
16506 mysql 15 0 11528 11M 1536 S 0.0 1.0 0:05 0 mysqld
16507 mysql 15 0 11528 11M 1536 S 0.0 1.0 0:01 0 mysqld
9302 mysql 16 0 11528 11M 1536 S 0.0 1.0 0:35 1 mysqld
9405 mysql 16 0 11528 11M 1536 S 0.0 1.0 0:36 1 mysqld
10215 mysql 16 0 11528 11M 1536 S 0.0 1.0 0:36 0 mysqld
12536 mysql 15 0 11528 11M 1536 S 0.0 1.0 0:34 1 mysqld
725 mysql 15 0 11528 11M 1536 S 0.0 1.0 0:00 1 mysqld
980 mysql 21 0 11528 11M 1536 S 0.0 1.0 0:00 0 mysqld
982 mysql 16 0 11528 11M 1536 S 0.0 1.0 0:00 1 mysqld


I will post a link to the zabbix graph that results when I get home (need my zabbix password :) ).
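
On Linux, the load average counts tasks that are runnable as well as tasks in uninterruptible sleep (state D, usually waiting on disk I/O). A load above 3 with both CPUs around 90% idle, as in the top output above, may therefore point at I/O waits rather than CPU use. The sketch below is only an illustration under that assumption: the function name is hypothetical, and it expects the text produced by `ps -eo stat,pid,comm`. It lists which tasks are currently in state R or D.

```python
def load_contributors(ps_output):
    """Group tasks by state for the two states that count toward load.

    Expects the output of `ps -eo stat,pid,comm`, header line included.
    Returns {"R": [(pid, command), ...], "D": [(pid, command), ...]}.
    """
    result = {"R": [], "D": []}
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        # Skip blank lines and the header row (its PID column is not numeric).
        if len(parts) < 3 or not parts[1].isdigit():
            continue
        stat, pid, command = parts
        state = stat[0]  # first letter is the state; the rest are flags
        if state in result:
            result[state].append((int(pid), command.strip()))
    return result
```

Running this a few times while the load is high would show whether the same process keeps turning up in state D, which is a hint about where to look next rather than a diagnosis.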

Alexei
15-10-2004, 08:34
I see absolutely nothing wrong. Please answer my previous question. I still do not understand what problem you're talking about. :confused:

donavan
07-11-2004, 08:57
"Why do you think the MySQL server goes crazy?"

Because CPU usage on the box goes crazy and restarting MySQL makes the problem go away.

Since the issue still happens, I'm back on it and have done the following:

box A is a mysql server hosting several mysql databases including zabbix's
box B is now running the web part of zabbix as well as all 3 zabbix_ binaries.

When the CPU consumption issue returns, I will post an update here on whether it occurs on box A or box B.

FYI, box A has been running without issue for several weeks.

donavan
08-11-2004, 05:40
Alexei,

The following graph shows what happened when I moved my zabbix database to a different server (7 day graph).

http://4wx.net/zab/alto_load_post_zabbix.png

The spike at 11.07 00:20 is when I ran tar xzvf to unpack the database file from the old server onto the new server. This was on box A from my previous email.

The following is the CPU load over the last 24 hours.

http://4wx.net/zab/alto_last24hours.png

Any new thoughts?