[crash] z-server 1.8.8 with sigsegv after migration to partitioned scheme

  • wax
    Junior Member
    • Sep 2011
    • 16

    #1

    Hi all!

    Preamble:
    Recently I migrated my Zabbix DB from one MySQL server to another. The new server has a partitioned data scheme (I used this manual: http://zabbixzone.com/zabbix/partitioning-tables/).
    After migrating to the new server everything was fine, until I started adding new machines and metrics (items) to monitoring.
    Adding any metric through the web interface causes zabbix-server to crash with SIGSEGV.

    Main:
    Both servers have the same hardware and software configuration (sysinfo_report.log). The MySQL parameters are the same too (my.cnf.txt).
    zabbix_server.trouble.log ( http://rghost.net/25423551 ) shows the crashing process.

    Can somebody help me?

    PS: For now zabbix-server runs normally with the old metrics.
    PPS: Sorry for my terrible English.
    Last edited by wax; 29-11-2011, 13:54.
  • wax
    Junior Member
    • Sep 2011
    • 16

    #2
    sysinfo lies

    sysinfo shows an incorrect version of zabbix-server because I installed the new one bypassing ports.

    • LPby
      Junior Member
      • Aug 2008
      • 21

      #3
      Check Trigger with triggerid:29428.

      MySQL:
      SELECT * FROM `triggers` WHERE `triggerid` = 29428;

      Are there any items?
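
The check above can be taken one step further by listing which items the trigger's functions point at. This is only a minimal sketch: it assumes the `triggers`, `functions` and `items` tables are linked through `triggerid` and `itemid` (the same join the server logs later in this thread), it runs against an in-memory SQLite database with made-up rows rather than a real Zabbix database, and `items_for_trigger` is a hypothetical helper name.

```python
import sqlite3


def items_for_trigger(conn, triggerid):
    """Return (functionid, itemid, key_) rows for one trigger.

    key_ is None when a function references an item row that no longer exists.
    """
    sql = """
        SELECT f.functionid, f.itemid, i.key_
        FROM functions f
        LEFT JOIN items i ON i.itemid = f.itemid
        WHERE f.triggerid = ?
        ORDER BY f.functionid
    """
    return conn.execute(sql, (triggerid,)).fetchall()


# Toy schema and rows, invented for illustration only (not real Zabbix data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE triggers (triggerid INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE items (itemid INTEGER PRIMARY KEY, key_ TEXT);
    CREATE TABLE functions (functionid INTEGER PRIMARY KEY,
                            itemid INTEGER, triggerid INTEGER);
    INSERT INTO triggers VALUES (1, 'example trigger');
    INSERT INTO items VALUES (10, 'example.key');
    INSERT INTO functions VALUES (1, 10, 1);  -- function with an existing item
    INSERT INTO functions VALUES (2, 11, 1);  -- function whose item is missing
""")

rows = items_for_trigger(conn, 1)
for functionid, itemid, key in rows:
    status = key if key is not None else "MISSING ITEM"
    print(functionid, itemid, status)
```

A row reported as missing would mean the trigger still references an item that is gone, which is one possible reading of an empty result in the web interface.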

      • wax
        Junior Member
        • Sep 2011
        • 16

        #4
        Originally posted by LPby
        Check Trigger with triggerid:29428.

        MySQL:
        SELECT * FROM `triggers` WHERE `triggerid` = 29428;

        Are there any items?
        Nope. =(
        How can I fix it?

        • wax
          Junior Member
          • Sep 2011
          • 16

          #5
          Originally posted by LPby
          Check Trigger with triggerid:29428.

          MySQL:
          SELECT * FROM `triggers` WHERE `triggerid` = 29428;

          Are there any items?
          I'll try to fix it with:
          1. SELECT concat('delete from ', `TABLE_SCHEMA`, '.' ,`TABLE_NAME`, ' where ', `COLUMN_NAME`, ' = 29428;') FROM `information_schema`.`COLUMNS` WHERE lower(`COLUMN_NAME`) like lower('%triggerid%');
          2. Execute the result.

          Am I right?
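
Before running generated deletes, the same generator idea can emit counts first, so it is visible what each statement would touch. A rough sketch, assuming the (schema, table, column) triples come from an `information_schema`.`COLUMNS` query like the one above; `build_count_queries` is a hypothetical helper and the two sample triples are illustrative, not a complete list.

```python
def build_count_queries(columns, value):
    """Build one SELECT COUNT(*) statement per (schema, table, column) triple."""
    if not isinstance(value, int):
        # Only integer ids are interpolated, so no quoting is needed.
        raise TypeError("value must be an integer id")
    queries = []
    for schema, table, column in columns:
        queries.append(
            f"SELECT COUNT(*) FROM `{schema}`.`{table}` "
            f"WHERE `{column}` = {value};"
        )
    return queries


# Illustrative input; in practice the triples would be read from the database.
sample = [
    ("zabbix", "triggers", "triggerid"),
    ("zabbix", "functions", "triggerid"),
]
queries = build_count_queries(sample, 29428)
for q in queries:
    print(q)
```

If every count is zero, the generated deletes would remove nothing; if some are not, those tables are worth inspecting by hand and backing up before anything is deleted.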

          • LPby
            Junior Member
            • Aug 2008
            • 21

            #6
            Try the following:

            1. Go to http://<YOUR_HOST>/zabbix/triggers.php?form=update&triggerid=29428 (your sid may also be needed)
            2. Delete it and then create the same one.
            Last edited by LPby; 13-10-2011, 17:29.

            • wax
              Junior Member
              • Sep 2011
              • 16

              #7
              Originally posted by LPby
              Try the following:

              1. Go to http://<YOUR_HOST>/zabbix/triggers.php?form=update&triggerid=29428
              2. Delete it and then create the same one.
              Got this message:
              ERROR: No permissions !
              Could it be due to the fact that I already deleted the newly created hosts?

              • LPby
                Junior Member
                • Aug 2008
                • 21

                #8
                Give me a recent zabbix_server log.

                • wax
                  Junior Member
                  • Sep 2011
                  • 16

                  #9
                  Originally posted by LPby
                  Give me a recent zabbix_server log.
                  Here is the log ( http://rghost.net/25457171 ).
                  There are 3 crashes in it.
                  The first crash occurred immediately after adding (cloning) a host that has multiple templates.
                  The second and third occurred when I tried to restart the server...
                  Then I deleted the problematic host and zabbix-server was able to start.

                  • LPby
                    Junior Member
                    • Aug 2008
                    • 21

                    #10
                    I need a log with loglevel=4 (DebugLevel=4 in zabbix_server.conf) captured when the server crashes.

                    • wax
                      Junior Member
                      • Sep 2011
                      • 16

                      #11
                      Originally posted by LPby
                      I need a log with loglevel=4 (DebugLevel=4 in zabbix_server.conf) captured when the server crashes.
                      Here it is ( http://rghost.net/25460461 ).

                      • LPby
                        Junior Member
                        • Aug 2008
                        • 21

                        #12
                        OK. There is another problem.

                        Code:
                         92130:20111013:205922.144 query [txnlev:1] [select h.host,i.key_ from hosts h,items i where h.hostid=i.hostid and i.itemid=86416]
                         92130:20111013:205922.145 Item [WEB147:pg[postgres,eff]] became not supported: Cannot evaluate function [last()]
                         92130:20111013:205922.145 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
                         92130:20111013:205922.145 ====== Fatal information: ======
                         92130:20111013:205922.145 program counter not available for this architecture
                        Check itemid=86416.

                        • wax
                          Junior Member
                          • Sep 2011
                          • 16

                          #13
                          Originally posted by LPby
                          Check itemid=86416.
                          Got the same error:
                          ERROR: No permissions !
                          while trying to open http://zabbix-webui/items.php?form=update&itemid=86416

                          • wax
                            Junior Member
                            • Sep 2011
                            • 16

                            #14
                            Code:
                            select count(1) from zabbix.functions where itemid = 86416
                             union all
                            select count(1) from zabbix.graphs where ymin_itemid = 86416
                             union all
                            select count(1) from zabbix.graphs where ymax_itemid = 86416
                             union all
                            select count(1) from zabbix.graphs_items where gitemid = 86416
                             union all
                            select count(1) from zabbix.graphs_items where itemid = 86416
                             union all
                            select count(1) from zabbix.history where itemid = 86416
                             union all
                            select count(1) from zabbix.history_log where itemid = 86416
                             union all
                            select count(1) from zabbix.history_str where itemid = 86416
                             union all
                            select count(1) from zabbix.history_str_sync where itemid = 86416
                             union all
                            select count(1) from zabbix.history_sync where itemid = 86416
                             union all
                            select count(1) from zabbix.history_text where itemid = 86416
                             union all
                            select count(1) from zabbix.history_uint where itemid = 86416
                             union all
                            select count(1) from zabbix.history_uint_sync where itemid = 86416
                             union all
                            select count(1) from zabbix.httpstepitem where httpstepitemid = 86416
                             union all
                            select count(1) from zabbix.httpstepitem where itemid = 86416
                             union all
                            select count(1) from zabbix.httptestitem where httptestitemid = 86416
                             union all
                            select count(1) from zabbix.httptestitem where itemid = 86416
                             union all
                            select count(1) from zabbix.items where itemid = 86416
                             union all
                            select count(1) from zabbix.items_applications where itemid = 86416
                             union all
                            select count(1) from zabbix.proxy_history where itemid = 86416
                             union all
                            select count(1) from zabbix.screens_items where screenitemid = 86416
                             union all
                            select count(1) from zabbix.trends where itemid = 86416
                             union all
                            select count(1) from zabbix.trends_uint where itemid = 86416;

                            It shows me this:
                            count(1)
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            0
                            Last edited by wax; 14-10-2011, 10:57.

                            • wax
                              Junior Member
                              • Sep 2011
                              • 16

                              #15
                              Bad news, everyone:
                              Code:
                               51357:20111014:143205.400 query [txnlev:1] [select distinct i.itemid,i.type,i.lastclock,i.delay,i.delay_flex from items i,functions f,triggers t where i.itemid=f.itemid and f.triggerid=t.triggerid and i.type not in (2) and t.triggerid=29294]
                               51357:20111014:143205.400 In calculate_item_nextcheck() itemid:85656 delay:180 flex_intervals:'' now:1318587725
                               51357:20111014:143205.400 End of calculate_item_nextcheck() nextcheck:1318587816 delay:180
                               51357:20111014:143205.400 In DBupdate_trigger_value()
                               51357:20111014:143205.400 In DBget_trigger_update_sql() triggerid:29294 old:2 new:2 now:1318587816
                               51357:20111014:143205.400 End of DBget_trigger_update_sql():FAIL
                               51357:20111014:143205.400 End of DBupdate_trigger_value()
                               51357:20111014:143205.400 query [txnlev:1] [select distinct i.itemid,i.type,i.lastclock,i.delay,i.delay_flex from items i,functions f,triggers t where i.itemid=f.itemid and f.triggerid=t.triggerid and i.type not in (2) and t.triggerid=29295]
                               51357:20111014:143205.401 In calculate_item_nextcheck() itemid:85660 delay:180 flex_intervals:'' now:1318587725
                               51357:20111014:143205.401 End of calculate_item_nextcheck() nextcheck:1318587820 delay:180
                               51357:20111014:143205.401 In DBupdate_trigger_value()
                               51357:20111014:143205.401 In DBget_trigger_update_sql() triggerid:29295 old:2 new:2 now:1318587820
                               51357:20111014:143205.401 End of DBget_trigger_update_sql():FAIL
                               51357:20111014:143205.401 End of DBupdate_trigger_value()
                               51357:20111014:143205.401 query [txnlev:1] [select distinct i.itemid,i.type,i.lastclock,i.delay,i.delay_flex from items i,functions f,triggers t where i.itemid=f.itemid and f.triggerid=t.triggerid and i.type not in (2) and t.triggerid=29694]
                               51357:20111014:143205.401 In calculate_item_nextcheck() itemid:86779 delay:1800 flex_intervals:'' now:0
                               51357:20111014:143205.401 End of calculate_item_nextcheck() nextcheck:379 delay:1800
                               51357:20111014:143205.401 In DBupdate_trigger_value()
                               51357:20111014:143205.401 In DBget_trigger_update_sql() triggerid:29694 old:2 new:2 now:379
                               51357:20111014:143205.401 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
                               51357:20111014:143205.401 ====== Fatal information: ======
                               51357:20111014:143205.401 program counter not available for this architecture
                               51357:20111014:143205.401 === Registers: ===
                               51357:20111014:143205.401 register dump not available for this architecture
                               51357:20111014:143205.402 === Backtrace: ===
                               51357:20111014:143205.402 7: 0x44c4a6 <zbx_vector_ptr_clear+342> at /usr/local/sbin/zabbix_server
                               51357:20111014:143205.402 6: 0x7fffffffffc4
                               51357:20111014:143205.402 5: 0x47374b <DBget_trigger_update_sql+987> at /usr/local/sbin/zabbix_server
                               51357:20111014:143205.402 4: 0x473b4f <DBupdate_triggers_status_after_restart+527> at /usr/local/sbin/zabbix_server
                               51357:20111014:143205.402 3: 0x4117f9 <MAIN_ZABBIX_ENTRY+617> at /usr/local/sbin/zabbix_server
                               51357:20111014:143205.402 2: 0x44cbe9 <daemon_start+889> at /usr/local/sbin/zabbix_server
                               51357:20111014:143205.402 1: 0x41158d <main+493> at /usr/local/sbin/zabbix_server
                               51357:20111014:143205.402 0: 0x40d4ee <_start+142> at /usr/local/sbin/zabbix_server
                               51357:20111014:143205.402 === Memory map: ===
                               51357:20111014:143205.402 memory map not available for this platform
                               51357:20111014:143205.402 ================================
                              Before deleting the problematic hosts, I checked these triggers. They were inherited from one template. What should I do to fix this?
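
To see which trigger the server was processing at each crash, the debug log can be scanned for the last `triggerid:` logged before every `Got signal` line. A small sketch, assuming the log line format shown in the excerpt above; `triggers_before_crashes` is a hypothetical helper, and the sample lines are copied from that excerpt.

```python
import re

# Matches the "triggerid:NNN" form used by the In DBget_trigger_update_sql() lines.
TRIGGER_RE = re.compile(r"triggerid:(\d+)")


def triggers_before_crashes(lines):
    """Return the last triggerid logged before each 'Got signal' line."""
    last_seen = None
    crashes = []
    for line in lines:
        match = TRIGGER_RE.search(line)
        if match:
            last_seen = int(match.group(1))
        if "Got signal" in line:
            crashes.append(last_seen)
            last_seen = None  # start fresh for the next crash in the same file
    return crashes


sample_log = [
    " 51357:20111014:143205.401 In DBget_trigger_update_sql() triggerid:29295 old:2 new:2 now:1318587820",
    " 51357:20111014:143205.401 End of DBget_trigger_update_sql():FAIL",
    " 51357:20111014:143205.401 In DBget_trigger_update_sql() triggerid:29694 old:2 new:2 now:379",
    " 51357:20111014:143205.401 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...",
]
crashes = triggers_before_crashes(sample_log)
print(crashes)
```

For the excerpt above this points at triggerid 29694, the one whose preceding calculate_item_nextcheck() line shows now:0, so that trigger and its item look like the first things to inspect.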
