MySQL server has gone away

  • befortin
    Member
    • Jul 2005
    • 48

    #1

    MySQL server has gone away

    Hi,

    My Zabbix server service dies sometimes (maybe once or twice a day, I'm not sure). I get the following error in the Zabbix server's log:

    004881:20060629:074714 Query::select hostid from hosts where host='swinfs'
    004881:20060629:074714 Query failed:MySQL server has gone away [2006]
    004589:20060629:074714 One server process died. Shutting down...
    004589:20060629:074714 ZABBIX server is down.
  • dantheman
    Senior Member
    • May 2006
    • 209

    #2
    I had this problem, and someone suggested running myisamchk to see whether the database was corrupt. In my case it was, and after repairing it I no longer have the problem.

    • befortin
      Member
      • Jul 2005
      • 48

      #3
      I've set up a cron job to restart the service if it's down. I just found out that the process dies every day at exactly the same time:

      Wed Jul 12 07:48:01 EDT 2006
      Wed Jul 12 07:49:01 EDT 2006
      Thu Jul 13 07:48:01 EDT 2006
      Fri Jul 14 07:48:01 EDT 2006

      Could it be because of the housekeeper, since I think it runs once a day?
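
One quick way to check whether restarts like the ones listed above really follow a daily schedule is to group them by time of day. The sketch below is only an illustration: the timestamps are copied from this post with the "EDT" label dropped (an assumption made to keep parsing simple), and `restarts_by_minute` is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime

# Restart times copied from the post above (time zone label dropped).
RESTARTS = [
    "Wed Jul 12 07:48:01 2006",
    "Wed Jul 12 07:49:01 2006",
    "Thu Jul 13 07:48:01 2006",
    "Fri Jul 14 07:48:01 2006",
]


def restarts_by_minute(lines):
    """Count restarts per time of day (HH:MM), ignoring the date."""
    counts = Counter()
    for line in lines:
        stamp = datetime.strptime(line, "%a %b %d %H:%M:%S %Y")
        counts[stamp.strftime("%H:%M")] += 1
    return counts


if __name__ == "__main__":
    # Most frequent time of day first.
    for minute, count in restarts_by_minute(RESTARTS).most_common():
        print(minute, count)
```

A tight cluster around a single minute points toward something scheduled (a cron job, a daily maintenance task) rather than a random failure.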

      dantheman: myisamchk wasn't working with the database files, so I ran "check table blablabla extended" at the MySQL prompt for each table instead.

      • befortin
        Member
        • Jul 2005
        • 48

        #4
        Never mind, this isn't caused by the housekeeper, since it runs every 8 hours on my Zabbix server...

        So any guess what could cause zabbix_server to crash precisely every 24 hours?

        • dantheman
          Senior Member
          • May 2006
          • 209

          #5
          Do you have any checks that are set to run once every 24 hours?

          • schneck
            Member
            • May 2006
            • 62

            #6
            Database!

            Originally posted by befortin
            So any guess what could cause zabbix_server to crash precisely every 24 hours?
            It's (probably) not ZABBIX that is crashing, but MySQL, which is usually restarted automatically by the mysqld_* shell scripts. Use ps(1) to check the startup time of your mysqld.

            What YOU can do:
            * check why MySQL dies and fix that

            What the ZABBIX developers can do:
            * make zabbix_server recover gracefully from database failures (i.e., wait a few seconds and reconnect after failure)
            * have zabbix_server support other databases (PostgreSQL will be supported again in 1.1.1, Alexei wrote in another thread)
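
The "wait a few seconds and reconnect" idea can be sketched roughly as follows. This is not how zabbix_server is implemented; `call_with_reconnect`, `connect` and `run_query` are hypothetical names, and `ConnectionError` stands in for whatever exception a real database driver raises for error 2006.

```python
import time


def call_with_reconnect(connect, run_query, retries=5, delay=2.0,
                        errors=(ConnectionError,), sleep=time.sleep):
    """Run run_query(conn); on a connection error, wait, reconnect and retry.

    connect   -- hypothetical callable returning a fresh connection
    run_query -- hypothetical callable that takes that connection
    errors    -- exception types treated as "server has gone away"
    """
    conn = None
    for attempt in range(retries + 1):
        try:
            if conn is None:
                conn = connect()  # may itself fail while the server is down
            return run_query(conn)
        except errors:
            conn = None  # drop the dead connection
            if attempt == retries:
                raise  # give up after the last attempt
            sleep(delay)
```

With this shape, a short database restart costs a few seconds of delay instead of shutting down the whole server process.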

            • befortin
              Member
              • Jul 2005
              • 48

              #7
              schneck :

              You are right: I just checked with ps and MySQL restarted this morning, so it's MySQL that's crashing, not Zabbix.

              • kurt
                Junior Member
                • Aug 2005
                • 21

                #8
                I'm having a similar problem where Zabbix says it lost the connection to the database, and then my zabbix_server processes die.

                I have run mysqlcheck on the database and every table reports OK, but it still dies at random times.

                It must have something to do with the MySQL server, but I can't figure it out.

                Have you had any luck troubleshooting your MySQL server?

                • befortin
                  Member
                  • Jul 2005
                  • 48

                  #9
                  kurt :

                  I have done an extended check on each table from the MySQL prompt, and each table looks consistent.

                  Since my MySQL logs were empty, I have just enabled logging in the MySQL configuration file. Let's wait and see what's in the MySQL log file the next time MySQL crashes...

                  • xming
                    Junior Member
                    • Jul 2006
                    • 3

                    #10
                    Originally posted by schneck
                    What the ZABBIX developers can do:
                    * make zabbix_server recover gracefully from database failures (i.e., wait a few seconds and reconnect after failure)

                    Hi developers,

                    Is it possible to have this feature, maybe even buffering the INSERTs until the DB is back?

                    I am running the Zabbix server and MySQL on different hosts. Every time I restart mysqld, I have to remember to restart zabbix-server on the other host, otherwise there will be no monitoring.

                    xming
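
A minimal sketch of the buffering idea, assuming the write itself is done by some callable that raises an exception while the database is unreachable. `BufferedWriter` and `write_row` are hypothetical names, not part of Zabbix, and the in-memory queue here would lose its contents if the process itself died.

```python
from collections import deque


class BufferedWriter:
    """Queue rows while the database is unreachable and replay them later."""

    def __init__(self, write_row, errors=(ConnectionError,), max_rows=10000):
        self.write_row = write_row  # hypothetical callable doing the INSERT
        self.errors = errors        # exceptions that mean "DB is down"
        # Bounded queue: the oldest rows are dropped once max_rows is reached.
        self.pending = deque(maxlen=max_rows)

    def add(self, row):
        """Queue a row, then try to write everything queued so far."""
        self.pending.append(row)
        return self.flush()

    def flush(self):
        """Write queued rows in order; stop at the first failure."""
        written = 0
        while self.pending:
            try:
                self.write_row(self.pending[0])
            except self.errors:
                break  # keep the row queued for the next attempt
            self.pending.popleft()
            written += 1
        return written
```

Rows are only removed from the queue after a successful write, so values collected during a mysqld restart are inserted, in order, once the database answers again.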

                    • befortin
                      Member
                      • Jul 2005
                      • 48

                      #11
                      My error log...

                      I now have the MySQL logs and I'm trying to find out why it's crashing. Maybe someone here will be able to help me...

                      Here's my MySQL error log :

                      Code:
                      Jul 25 07:48:17 localhost mysqld[4661]: mysqld got signal 11;
                      Jul 25 07:48:17 localhost mysqld[4661]: This could be because you hit a bug. It is also possible that this binary
                      Jul 25 07:48:17 localhost mysqld[4661]: or one of the libraries it was linked against is corrupt, improperly built,
                      Jul 25 07:48:17 localhost mysqld[4661]: or misconfigured. This error can also be caused by malfunctioning hardware.
                      Jul 25 07:48:17 localhost mysqld[4661]: We will try our best to scrape up some info that will hopefully help diagnose
                      Jul 25 07:48:17 localhost mysqld[4661]: the problem, but since we have already crashed, something is definitely wrong
                      Jul 25 07:48:17 localhost mysqld[4661]: and this may fail.
                      Jul 25 07:48:17 localhost mysqld[4661]: 
                      Jul 25 07:48:17 localhost mysqld[4661]: key_buffer_size=16777216
                      Jul 25 07:48:17 localhost mysqld[4661]: read_buffer_size=131072
                      Jul 25 07:48:17 localhost mysqld[4661]: max_used_connections=20
                      Jul 25 07:48:17 localhost mysqld[4661]: max_connections=100
                      Jul 25 07:48:17 localhost mysqld[4661]: threads_connected=16
                      Jul 25 07:48:17 localhost mysqld[4661]: It is possible that mysqld could use up to 
                      Jul 25 07:48:17 localhost mysqld[4661]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 233983 K
                      Jul 25 07:48:17 localhost mysqld[4661]: bytes of memory
                      Jul 25 07:48:17 localhost mysqld[4661]: Hope that's ok; if not, decrease some variables in the equation.
                      Jul 25 07:48:17 localhost mysqld[4661]: 
                      Jul 25 07:48:17 localhost mysqld[4661]: thd=0xb0b01dd0
                      Jul 25 07:48:17 localhost mysqld[4661]: Attempting backtrace. You can use the following information to find out
                      Jul 25 07:48:17 localhost mysqld[4661]: where mysqld died. If you see no messages after this, something went
                      Jul 25 07:48:17 localhost mysqld[4661]: terribly wrong...
                      Jul 25 07:48:17 localhost mysqld[4661]: Cannot determine thread, fp=0xb0d89728, backtrace may not be correct.
                      Jul 25 07:48:17 localhost mysqld[4661]: Stack range sanity check OK, backtrace follows:
                      Jul 25 07:48:17 localhost mysqld[4661]: 0x8189e29
                      Jul 25 07:48:17 localhost mysqld[4661]: 0xffffe420
                      Jul 25 07:48:17 localhost mysqld[4661]: 0x2
                      Jul 25 07:48:17 localhost mysqld[4661]: 0x81fa679
                      Jul 25 07:48:17 localhost mysqld[4661]: 0x81fb7d9
                      Jul 25 07:48:17 localhost mysqld[4661]: 0x8197e0b
                      Jul 25 07:48:17 localhost mysqld[4661]: 0x81a3324
                      Jul 25 07:48:17 localhost mysqld[4661]: 0x81a3fc2
                      Jul 25 07:48:17 localhost mysqld[4661]: 0x81a4970
                      Jul 25 07:48:17 localhost mysqld[4661]: 0xb7eda341
                      Jul 25 07:48:17 localhost mysqld[4661]: 0xb7d2b4ee
                      Jul 25 07:48:17 localhost mysqld[4661]: New value of fp=(nil) failed sanity check, terminating stack trace!
                      Jul 25 07:48:17 localhost mysqld[4661]: Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
                      Jul 25 07:48:17 localhost mysqld[4661]: stack trace is much more helpful in diagnosing the problem, so please do 
                      Jul 25 07:48:17 localhost mysqld[4661]: resolve it
                      Jul 25 07:48:17 localhost mysqld[4661]: Trying to get some variables.
                      Jul 25 07:48:17 localhost mysqld[4661]: Some pointers may be invalid and cause the dump to abort...
                      Jul 25 07:48:17 localhost mysqld[4661]: thd->query at (nil)  is invalid pointer
                      Jul 25 07:48:17 localhost mysqld[4661]: thd->thread_id=35997
                      Jul 25 07:48:17 localhost mysqld[4661]: The manual page at http://www.mysql.com/doc/en/Crashing.html contains
                      Jul 25 07:48:17 localhost mysqld[4661]: information that should help you find out what is causing the crash.
                      Jul 25 07:48:17 localhost mysqld[4661]: pure virtual method called
                      Jul 25 07:48:17 localhost mysqld[4661]: terminate called without an active exception
                      Jul 25 07:48:17 localhost mysqld[4661]: Fatal signal 6 while backtracing
                      Jul 25 07:48:17 localhost mysqld_safe[31493]: Number of processes running now: 0
                      Jul 25 07:48:17 localhost mysqld_safe[31495]: restarted
                      I saw on http://dev.mysql.com/doc/refman/5.0/...ack-trace.html how to use the stack trace. Here's the output of resolve_stack_dump for that stack trace:

                      Code:
                      0x8189e29 handle_segfault + 639
                      0xffffe420 _end + -140736912
                      0x2 (?)
                      0x81fa679 _ZN9MYSQL_LOG22purge_logs_before_dateEl + 95
                      0x81fb7d9 _ZN9MYSQL_LOG16rotate_and_purgeEj + 203
                      0x8197e0b _Z20reload_acl_and_cacheP3THDmP13st_table_listPb + 465
                      0x81a3324 _Z16dispatch_command19enum_server_commandP3THDPcj + 2436
                      0x81a3fc2 _Z10do_commandP3THD + 134
                      0x81a4970 handle_one_connection + 2240
                      0xb7eda341 _end + -1349892719
                      0xb7d2b4ee _end + -1351657666
                      I have no clue how to work out where the problem lies...
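
The `_Z...` entries in the resolved trace are C++ mangled names, and they become easier to read once the length-prefixed parts are pulled out. The sketch below is a deliberately tiny decoder that handles only the plain and nested name forms seen in this trace and ignores argument types; `demangle_name` is a hypothetical helper, not a replacement for a real demangler such as c++filt.

```python
def demangle_name(symbol):
    """Return the qualified name from a simple Itanium-ABI mangled symbol.

    Handles only plain names (_Z<len><name>...) and nested names
    (_ZN<len><name>...E). Argument types are ignored, and anything
    that does not start with _Z is returned unchanged.
    """
    if not symbol.startswith("_Z"):
        return symbol
    pos = 2
    nested = symbol[pos:pos + 1] == "N"
    if nested:
        pos += 1
    parts = []
    while pos < len(symbol) and symbol[pos].isdigit():
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])
        parts.append(symbol[end:end + length])
        pos = end + length
        if not nested:
            break  # a plain name has a single component; the rest is types
    return "::".join(parts) if parts else symbol


if __name__ == "__main__":
    # Symbols copied from the resolved trace, innermost frame first.
    for sym in [
        "_ZN9MYSQL_LOG22purge_logs_before_dateEl",
        "_ZN9MYSQL_LOG16rotate_and_purgeEj",
        "_Z20reload_acl_and_cacheP3THDmP13st_table_listPb",
        "_Z16dispatch_command19enum_server_commandP3THDPcj",
        "_Z10do_commandP3THD",
    ]:
        print(demangle_name(sym))
```

Read from the bottom of the trace upward, the call chain is handle_one_connection, do_command, dispatch_command, reload_acl_and_cache, MYSQL_LOG::rotate_and_purge, MYSQL_LOG::purge_logs_before_date, and then the segfault handler. Going by the function names alone, the crash seems to happen while the server rotates and purges its logs in response to a command from a client connection. That reading is an inference, not something confirmed in this thread.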

                      • befortin
                        Member
                        • Jul 2005
                        • 48

                        #12
                        Forgot another log...

                        Here's the content of my mysql.log file, just before mysqld dies :

                        Code:
                        060725  7:48:17      19 Query       select distinct t.triggerid,t.expression,t.status,t.dep_level,t.priority,t.value,t.description from triggers t,functions f,items i where i.status<>3 and i.itemid=f.itemid and t.status=0 and f.triggerid=t.triggerid and f.itemid=17789
                                             19 Query       select 0,lastvalue from functions where functionid=11702
                                             19 Query       insert into history (clock,itemid,value) values (1153828097,17741,282.620030)
                                             19 Query       select num,value_min,value_avg,value_max from trends where itemid=17741 and clock=1153825200
                                             19 Query       update trends set num=574, value_min=0.000000, value_avg=154.841399, value_max=875.552100 where itemid=17741 and clock=1153825200
                                             19 Query       update items set nextcheck=1153828100,prevvalue=lastvalue,lastvalue='282.620030',lastclock=1153828097 where itemid=17741
                                             19 Query       select distinct function,parameter,itemid,lastvalue from functions where itemid=17741
                                             19 Query       select distinct t.triggerid,t.expression,t.status,t.dep_level,t.priority,t.value,t.description from triggers t,functions f,items i where i.status<>3 and i.itemid=f.itemid and t.status=0 and f.triggerid=t.triggerid and f.itemid=17741
                                             19 Query       insert into history (clock,itemid,value) values (1153828097,18190,0.183333)
                                             19 Query       select num,value_min,value_avg,value_max from trends where itemid=18190 and clock=1153825200
                                             19 Query       update trends set num=194, value_min=0.133300, value_avg=0.627797, value_max=4.566700 where itemid=18190 and clock=1153825200
                                             19 Query       update items set nextcheck=1153828110,prevvalue=lastvalue,lastvalue='0.183333',lastclock=1153828097 where itemid=18190
                                             19 Query       select distinct function,parameter,itemid,lastvalue from functions where itemid=18190
                                             19 Query       select min(value) from history where clock>1153817297 and itemid=18190
                                             19 Query       select distinct t.triggerid,t.expression,t.status,t.dep_level,t.priority,t.value,t.description from triggers t,functions f,items i where i.status<>3 and i.itemid=f.itemid and t.status=0 and f.triggerid=t.triggerid and f.itemid=18190
                                             19 Query       select 0,lastvalue from functions where functionid=11711
                                             19 Query       insert into history (clock,itemid,value) values (1153828097,17744,0.000000)
                                             19 Query       select num,value_min,value_avg,value_max from trends where itemid=17744 and clock=1153825200
                                             19 Query       update trends set num=575, value_min=0.000000, value_avg=1.763827, value_max=278.042100 where itemid=17744 and clock=1153825200
                                             19 Query       update items set nextcheck=1153828100,lastclock=1153828097 where itemid=17744
                                             19 Query       select distinct function,parameter,itemid,lastvalue from functions where itemid=17744
                                             19 Query       select distinct t.triggerid,t.expression,t.status,t.dep_level,t.priority,t.value,t.description from triggers t,functions f,items i where i.status<>3 and i.itemid=f.itemid and t.status=0 and f.triggerid=t.triggerid and f.itemid=17744
                                             19 Query       insert into history (clock,itemid,value) values (1153828097,18123,0.000000)
                                             19 Query       select num,value_min,value_avg,value_max from trends where itemid=18123 and clock=1153825200
                                             19 Query       update trends set num=565, value_min=0.000000, value_avg=0.029947, value_max=3.770000 where itemid=18123 and clock=1153825200
                                             19 Query       update items set nextcheck=1153828100,lastclock=1153828097 where itemid=18123
                                             19 Query       select distinct function,parameter,itemid,lastvalue from functions where itemid=18123
                                             19 Query       select distinct t.triggerid,t.expression,t.status,t.dep_level,t.priority,t.value,t.description from triggers t,functions f,items i where i.status<>3 and i.itemid=f.itemid and t.status=0 and f.triggerid=t.triggerid and f.itemid=18123
                                             19 Query       insert into history (clock,itemid,value) values (1153828097,17895,37495.784866)
                                             19 Query       select num,value_min,value_avg,value_max from trends where itemid=17895 and clock=1153825200
                                             19 Query       update trends set num=566, value_min=11903.237900, value_avg=226938.273660, value_max=2978391.622700 where itemid=17895 and clock=1153825200
                                             19 Query       update items set nextcheck=1153828100,prevvalue=lastvalue,lastvalue='37495.784866',lastclock=1153828097 where itemid=17895
                                             19 Query       select distinct function,parameter,itemid,lastvalue from functions where itemid=17895
                                             19 Query       select distinct t.triggerid,t.expression,t.status,t.dep_level,t.priority,t.value,t.description from triggers t,functions f,items i where i.status<>3 and i.itemid=f.itemid and t.status=0 and f.triggerid=t.triggerid and f.itemid=17895
                                             19 Query       insert into history (clock,itemid,value) values (1153828097,17715,0.633333)
                                             19 Query       select num,value_min,value_avg,value_max from trends where itemid=17715 and clock=1153825200
                                             19 Query       update trends set num=194, value_min=0.200000, value_avg=0.461291, value_max=1.100000 where itemid=17715 and clock=1153825200
                                             19 Query       update items set nextcheck=1153828110,prevvalue=lastvalue,lastvalue='0.633333',lastclock=1153828097 where itemid=17715
                                             19 Query       select distinct function,parameter,itemid,lastvalue from functions where itemid=17715
                                             19 Query       select min(value) from history where clock>1153817297 and itemid=17715
                                             19 Query       select distinct t.triggerid,t.expression,t.status,t.dep_level,t.priority,t.value,t.description from triggers t,functions f,items i where i.status<>3 and i.itemid=f.itemid and t.status=0 and f.triggerid=t.triggerid and f.itemid=17715
                                             19 Query       select 0,lastvalue from functions where functionid=11700
                                             19 Query       insert into history (clock,itemid,value) values (1153828097,17734,0.000000)
                                             19 Query       select num,value_min,value_avg,value_max from trends where itemid=17734 and clock=1153825200
                                             19 Query       update trends set num=564, value_min=0.000000, value_avg=314.518353, value_max=18749.514200 where itemid=17734 and clock=1153825200

                        • befortin
                          Member
                          • Jul 2005
                          • 48

                          #13
                          I tried running all of the last queries that MySQL received (those in my previous post) and they ran successfully. So I guess the crash isn't related to the queries that MySQL receives?

                          • xming
                            Junior Member
                            • Jul 2006
                            • 3

                            #14
                            Originally posted by befortin
                            Code:
                            Jul 25 07:48:17 localhost mysqld[4661]: mysqld got signal 11;
                            Jul 25 07:48:17 localhost mysqld[4661]: This could be because you hit a bug. It is also possible that this binary
                            Jul 25 07:48:17 localhost mysqld[4661]: or one of the libraries it was linked against is corrupt, improperly built,
                            Jul 25 07:48:17 localhost mysqld[4661]: or misconfigured. This error can also be caused by malfunctioning hardware.
                            Jul 25 07:48:17 localhost mysqld[4661]: We will try our best to scrape up some info that will hopefully help diagnose
                            Jul 25 07:48:17 localhost mysqld[4661]: the problem, but since we have already crashed, something is definitely wrong
                            This could be a MySQL bug, but that's very unlikely. Which version are you running? What distro are you using? Is it a self-compiled MySQL?

                            The more likely reason for the crash, IMO, is hardware related: since you cannot reproduce the crash, it points away from a software bug.

                            Do you overclock? If so, try running everything at normal speed, and check your memory with memtest and your CPU with a CPU burn-in test.

                            just my 2 cents

                            xming

                            • befortin
                              Member
                              • Jul 2005
                              • 48

                              #15
                              xming :
                              Code:
                              $ mysql -V
                              mysql  Ver 14.12 Distrib 5.0.22, for pc-linux-gnu (i486) using readline 5.1
                              Distro : Ubuntu LTS

                              Self-compiled ? : No

                              I just looked in my /var/log/messages and saw that, a few seconds after the time MySQL crashes, something else seems to have stopped and restarted:
                              Code:
                              [...]
                              Jul 26 07:48:24 localhost exiting on signal 15
                              Jul 26 07:48:25 localhost syslogd 1.4.1#17ubuntu7: restart.
                              [...]
                              I don't really understand what those messages mean for now, but they appear every day at the same time as the MySQL crashes.

                              I'll have to test the hardware when I get some time to run the diagnostics, I guess...
                              Last edited by befortin; 26-07-2006, 22:04.
