MariaDB crashing occasionally during partition rotation

  • guzzijason
    Senior Member
    • Dec 2015
    • 106

    #1

    MariaDB crashing occasionally during partition rotation

    This has happened twice now in the past week or so.

    CentOS 7.2.1511
    MariaDB-server-10.1.16-1.el7
    zabbix-server-mysql-3.0.4-1.el7

    The zabbix server handles ~425 new values per second.

    This is part of a 2-node galera cluster. When the problem happens, I can only restart the database node after it has been re-synced from the other node (hooray for cluster replication, right?!).

    At any rate, the timing and query statement in the log below seem to indicate that this happens during the nightly table partition rotation job.
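For reference, the statement that shows up in the crash log below can be reproduced with a small helper. This is only a sketch of what a nightly rotation job might generate, not the actual job from this setup: the function name `add_partition_sql` is hypothetical, and the naming rule (the partition named after day D is bounded by midnight of day D+1) is inferred from the single logged statement.

```python
from datetime import date, timedelta


def add_partition_sql(day: date, table: str = "zabbix.history") -> str:
    """Build an ADD PARTITION statement for one daily partition.

    Hypothetical helper: the format mirrors the statement in the crash log.
    """
    # The partition named for `day` holds values below midnight of the
    # following day, as in the logged p2016_10_02 / "2016-10-03" pair.
    upper_bound = day + timedelta(days=1)
    return (
        f"ALTER TABLE {table} ADD PARTITION "
        f"(PARTITION p{day:%Y_%m_%d} VALUES less than "
        f'(UNIX_TIMESTAMP("{upper_bound:%Y-%m-%d}") div 1))'
    )


print(add_partition_sql(date(2016, 10, 2)))
```

Called with 2016-10-02, this prints the same ALTER TABLE text that the log reports as the query running when mysqld received signal 11.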

    Any DB experts out there have any insight as to what could be the problem here?

    Code:
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: 160922  3:00:04 [ERROR] mysqld got signal 11 ;
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: This could be because you hit a bug. It is also possible that this binary
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: or one of the libraries it was linked against is corrupt, improperly built,
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: We will try our best to scrape up some info that will hopefully help
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: diagnose the problem, but since we have already crashed,
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: something is definitely wrong and this may fail.
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: Server version: 10.1.16-MariaDB
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: key_buffer_size=134217728
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: read_buffer_size=131072
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: max_used_connections=31
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: max_threads=153
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: thread_count=22
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: It is possible that mysqld could use up to
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467113 K  bytes of memory
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: Hope that's ok; if not, decrease some variables in the equation.
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: Thread pointer: 0x0x7f0b547f2008
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: Attempting backtrace. You can use the following information to find out
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: where mysqld died. If you see no messages after this, something went
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: terribly wrong...
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: stack_bottom = 0x7f0c7dd63130 thread_stack 0x48400
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7f0c8de86eee]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x2d5)[0x7f0c8d9ae265]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /lib64/libpthread.so.0(+0xf100)[0x7f0c8cfcc100]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_Z20ha_abort_transactionP3THDS0_c+0x9f)[0x7f0c8d9b8d4f]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_Z15wsrep_abort_thdPvS_c+0x139)[0x7f0c8d95f8d9]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_Z25wsrep_grant_mdl_exceptionP11MDL_contextP10MDL_ticketPK7MDL_key+0x2a7)[0x7f0c8d9503e7]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_ZNK8MDL_lock14can_grant_lockE13enum_mdl_typeP11MDL_contextb+0x11f)[0x7f0c8d904a8f]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_ZN11MDL_context21try_acquire_lock_implEP11MDL_requestPP10MDL_ticket+0xf1)[0x7f0c8d905711]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_ZN11MDL_context12acquire_lockEP11MDL_requestd+0x2e)[0x7f0c8d905c7e]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_ZN11MDL_context19upgrade_shared_lockEP10MDL_ticket13enum_mdl_typed+0xae)[0x7f0c8d90696e]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_Z17mysql_alter_tableP3THDPcS1_P14HA_CREATE_INFOP10TABLE_LISTP10Alter_infojP8st_orderb+0x12fd)[0x7f0c8d8b882d]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_ZN19Sql_cmd_alter_table7executeEP3THD+0x61a)[0x7f0c8d8ff94a]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1206)[0x7f0c8d82c276]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x28e)[0x7f0c8d8348de]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(+0x4380e9)[0x7f0c8d8350e9]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1fb0)[0x7f0c8d837770]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_Z10do_commandP3THD+0x169)[0x7f0c8d838619]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x18a)[0x7f0c8d8fcc1a]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /usr/sbin/mysqld(handle_one_connection+0x40)[0x7f0c8d8fcdf0]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /lib64/libpthread.so.0(+0x7dc5)[0x7f0c8cfc4dc5]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: /lib64/libc.so.6(clone+0x6d)[0x7f0c8b3e6ced]
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: Trying to get some variables.
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: Some pointers may be invalid and cause the dump to abort.
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: Query (0x7f0c877e4020): ALTER TABLE zabbix.history ADD PARTITION (PARTITION p2016_10_02 VALUES less than (UNIX_TIMESTAMP("2016-10-03") div 1))
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: Connection ID (thread ID): 142346
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: Status: NOT_KILLED
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
    Sep 22 03:00:04 zsrv-c3-c00001-g mysqld: information that should help you find out what is causing the crash.
    So far, it has just been a minor annoyance that we can recover from, but I'm worried that we won't always be so lucky.

    Thanks!

    __Jason
    Last edited by guzzijason; 22-09-2016, 16:36. Reason: Fixed mariadb version info
  • shinguz
    Junior Member
    • Apr 2012
    • 4

    #2
    Use stable MySQL

    It looks like your MariaDB has some quality problems (aka crashing). Use stable and reliable MySQL and your problems are gone...

    A stack trace is never the user's problem. It is always a sign that something happened in the software that the developer did not account for.


    • mbsit
      Senior Member
      • Sep 2012
      • 130

      #3
      shinguz: I'm sorry but ... bullshit.

      We use the same version of MariaDB (stable) with Galera, installed from the official repo.
      The same problem, the same crash, the same reason, the same solution.

      After we disabled wsrep replication, the problem disappeared.

      Bests,
      Grzegorz
      Regards,
      Grzegorz Grabowski
      ____
      Deployments, training, service contracts
      Warsaw - Poland


      • guzzijason
        Senior Member
        • Dec 2015
        • 106

        #4
        Yeah, I'm of the opinion that if mariadb is good enough to become the standard DB package for Redhat, then it's probably good enough for me. I have no desire to go back to MySQL, for a variety of reasons.

        Good to know about the wsrep thing - will keep that in mind. I've updated to the most recent version (MariaDB-server-10.1.17-1.el7) and will keep an eye on it. So far, the problem has only happened 2 times since the cluster was built and is easy to recover from, so has just been a relatively minor inconvenience. If it happens again, I'll probably look into a bug report with MariaDB.

        Thanks,

        __Jason


        • guzzijason
          Senior Member
          • Dec 2015
          • 106

          #5
          I've opened a bug report with mariadb here: https://jira.mariadb.org/browse/MDEV-10958

          We'll see if this goes anywhere.

          __Jason


          • guzzijason
            Senior Member
            • Dec 2015
            • 106

            #6
            I also noticed that mariadb-server-10.1.18 was recently released. One of the fixes in it is somewhat similar to the problem I'm seeing, but I'm not 100% convinced they are the same. At any rate, I've updated to 10.1.18 and will see how it goes.

            __Jason


            • guzzijason
              Senior Member
              • Dec 2015
              • 106

              #7
              FYI, the update to 10.1.18 seems to have resolved this issue. I've not experienced a crash since moving to that version about a month ago.
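For anyone checking their own nodes against this thread, here is a rough sketch of comparing a version string with the release reported above as crash-free. The names `FIXED_IN`, `parse_version`, and `has_reported_fix` are made up for illustration, and the comparison is only meaningful within the 10.1.x series discussed here; it says nothing about other release series.

```python
import re

# Version this thread reports as crash-free (10.1.x series only).
FIXED_IN = (10, 1, 18)


def parse_version(text: str) -> tuple:
    """Extract (major, minor, patch) from strings like '10.1.16-MariaDB'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def has_reported_fix(version_text: str) -> bool:
    """True if the version is at or above the release reported as fixed."""
    return parse_version(version_text) >= FIXED_IN


for candidate in ("10.1.16-MariaDB", "MariaDB-server-10.1.17-1.el7", "10.1.18"):
    print(candidate, has_reported_fix(candidate))
```

The version string can come from the package name or from the "Server version" line in the crash log; both forms parse the same way.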

              __Jason


              • mschedrin
                Senior Member
                • Jun 2009
                • 179

                #8
                Guys, I still don't dare to use a Galera cluster with a partitioned Zabbix database. The main reason is that Galera doesn't work properly with tables that have no primary key (https://mariadb.com/kb/en/mariadb/ma...n-limitations/), yet the Zabbix DB partitioning guide sets up exactly that (https://www.zabbix.org/wiki/Docs/how...#Index_changes).
                As far as I understand, such a setup is not supposed to work properly. Am I wrong about this?
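One way to see which tables the primary-key concern applies to is to ask `information_schema` directly. The sketch below is an assumption-laden example rather than anything from the linked guides: `tables_without_primary_key` is a hypothetical helper, it expects a DB-API cursor that returns rows as tuples, and the `%s` placeholder assumes a driver with that parameter style (as PyMySQL and mysqlclient use). The schema name `zabbix` is taken from the statement in the crash log.

```python
# Lists base tables in one schema that have no PRIMARY KEY constraint.
NO_PRIMARY_KEY_SQL = """
SELECT t.TABLE_NAME
FROM information_schema.TABLES AS t
LEFT JOIN information_schema.TABLE_CONSTRAINTS AS c
       ON c.TABLE_SCHEMA = t.TABLE_SCHEMA
      AND c.TABLE_NAME = t.TABLE_NAME
      AND c.CONSTRAINT_TYPE = 'PRIMARY KEY'
WHERE t.TABLE_SCHEMA = %s
  AND t.TABLE_TYPE = 'BASE TABLE'
  AND c.CONSTRAINT_NAME IS NULL
ORDER BY t.TABLE_NAME
"""


def tables_without_primary_key(cursor, schema="zabbix"):
    """Return the names of tables in `schema` that lack a primary key.

    `cursor` is any DB-API cursor yielding tuple rows (an assumption).
    """
    cursor.execute(NO_PRIMARY_KEY_SQL, (schema,))
    return [row[0] for row in cursor.fetchall()]
```

With a live connection, something like `tables_without_primary_key(conn.cursor())` would return the list to review against the Galera limitations page.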


                • guzzijason
                  Senior Member
                  • Dec 2015
                  • 106

                  #9
                  So far, I've been running with galera for about 6 months with no particular issues noticed (except for the one that caused me to start this thread, which has since been resolved).

                  __Jason
