DM Master is missing events, triggers, and hosts

  • steev
    Member
    • Aug 2010
    • 38

    #16
    no replication

    No, I'm not doing anything fancy with MySQL.

    I can't blame Zabbix for all this. I think I've probably mangled the items in the master node's table that represent the hosts that the child node monitors. Grrr. There are no tools to force a sync, so I'm probably going to have to manually import items into the master, SQL to SQL.

    • steev
      Member
      • Aug 2010
      • 38

      #17
      thanks...

      ... do you have the urls for those sites?

      Seriously though... I did find that the source of my deadlocks was setting dbsyncers to 12 in my z_server.conf. I removed it and haven't had the DB locking problem since. I then manually cleaned up the DB and it was looking good until we had some network trouble and the DBs got out of sync again.

      DM just doesn't look like it's going to work for me. It's too touchy. The databases seem to get out of sync too easily, and then the master node no longer displays all of the child node's alerts. The only fix I've found for that is to clone the affected host, thereby losing all its historical data.

      Only then can I see the missing alerts on the master again, but I seem to have to do this again quite often, even when we're not having any network trouble. Sometimes the master Zabbix node just 'forgets' triggers. They're 'unknown' on the master node until you create the host again.

      • steev
        Member
        • Aug 2010
        • 38

        #18
        I fixed it...

        ... I think. I ran the following sql:

        update triggers set value=0 where status=0 and value=2;

        which, I *think*, sets the value back from unknown to good on any trigger on the master node that's not in some kind of error state.

        This seems to suddenly inspire the DB sync process to start syncing again.

        I tested this by sending localtime every 30 seconds. For some reason this trigger, on nodata, made the sync fail quite often, as was evident on the overview page. The triggers were grey and stayed that way on the master despite being green and happy on the node where the checks are actually processed.

        I did the above 'update', forcing them 'green' again, and the next thing you know, the 'localtime' variable is being updated again on the overview page, data view. It had been 'stuck' before my SQL run.

        I'll keep an eye on things, obviously, but this seems to have fixed things for now... at least it's a way to fix things without completely removing and re-adding the host from the secondary node.

        • qix
          Senior Member
          Zabbix Certified Specialist, Zabbix Certified Professional
          • Oct 2006
          • 423

          #19
          I'm happy that you found a workaround. I still don't understand why it doesn't work.
          With kind regards,

          Raymond

          • themons
            Senior Member
            • Feb 2005
            • 110

            #20
            Same issue

            I know it's an old post, but I have the same design as qix and the same issue.

            Two physical machines, SERVER1 and SERVER2
            MySQL master-master replication
            Apache and zabbix_server always run on the same server.

            Normally everything runs fine on SERVER1. If I switch Apache and zabbix_server over to SERVER2, I get messages like these:

            Code:
             14505:20120828:094712.259 [Z3005] Query failed: [1062] Duplicate entry '6414724' for key 1 [insert into events (eventid,source,object,objectid,clock,value) values (6414724,2,3,10,1346140032,1)]
             14517:20120828:094719.445 [Z3005] Query failed: [1062] Duplicate entry '6414725' for key 1 [insert into events (eventid,source,object,objectid,clock,value) values (6414725,2,3,7,1346140039,1)]
             14502:20120828:094740.997 [Z3005] Query failed: [1062] Duplicate entry '6414726' for key 1 [insert into events (eventid,source,object,objectid,clock,value) values (6414726,2,3,9,1346140060,1)]
             14508:20120828:094911.161 [Z3005] Query failed: [1062] Duplicate entry '6414727' for key 1 [insert into events (eventid,source,object,objectid,clock,value) values (6414727,2,3,8,1346140151,1)]
             14519:20120828:094912.285 [Z3005] Query failed: [1062] Duplicate entry '6414728' for key 1 [insert into events (eventid,source,object,objectid,clock,value) values (6414728,2,3,10,1346140152,1)]
             14508:20120828:094941.007 [Z3005] Query failed: [1062] Duplicate entry '6414729' for key 1 [insert into events (eventid,source,object,objectid,clock,value) values (6414729,2,3,9,1346140181,1)]
             14516:20120828:094948.740 [Z3005] Query failed: [1062] Duplicate entry '6414730' for key 1 [insert into events (eventid,source,object,objectid,clock,value) values (6414730,2,3,7,1346140188,1)]
            
            and 
            
             14497:20120828:062856.684 [Z3005] Query failed: [1062] Duplicate entry '6413945' for key 1 [update items set status=3,lastclock=1346128136,error='Not supported by Zabbix Agent' where itemid=36506;
             14486:20120828:071557.875 [Z3005] Query failed: [1062] Duplicate entry '6414108' for key 1 [update items set status=3,lastclock=1346130957,error='Not supported by Zabbix Agent' where itemid=33477;
             14485:20120828:071559.258 [Z3005] Query failed: [1062] Duplicate entry '6414109' for key 1 [update items set status=3,lastclock=1346130958,error='Not supported by Zabbix Agent' where itemid=33476;
             14496:20120828:084149.376 [Z3005] Query failed: [1062] Duplicate entry '6414499' for key 1 [update items set status=3,lastclock=1346136109,error='Type of received value [Collector is not started!] is not suitable for value type [Numeric (float)]' where itemid=33829;
             14540:20120828:085515.855 [Z3005] Query failed: [1062] Duplicate entry '16781' for key 1 [insert into service_alarms (servicealarmid,serviceid,clock,value) values(16781,81,1346136915,2)]
             14540:20120828:085515.870 [Z3005] Query failed: [1062] Duplicate entry '16782' for key 1 [insert into service_alarms (servicealarmid,serviceid,clock,value) values(16782,96,1346136915,2)]
             14540:20120828:085515.872 [Z3005] Query failed: [1062] Duplicate entry '16783' for key 1 [insert into service_alarms (servicealarmid,serviceid,clock,value) values(16783,76,1346136915,2)]
            When I switch back to SERVER1, everything works fine. Does anybody have an idea, or does qix have the solution?

            Thanks everybody
            Zabbix 1.8.3
            SLES 11 x64

            French Zabbix user

            • qix
              Senior Member
              Zabbix Certified Specialist, Zabbix Certified Professional
              • Oct 2006
              • 423

              #21
              Wait, you are doing MySQL master-master and Zabbix is normally only active on server 1?

              When you say switching, you mean killing the procs on server 1 and starting them up on server 2?

              Did you set up MySQL replication properly? The my.cnf has to state unique server IDs to prevent collisions on the tables.

              • themons
                Senior Member
                • Feb 2005
                • 110

                #22
                Yes, exactly: the Zabbix service is only ever running on one server.

                When I say switch, I mean that I use linux-ha (heartbeat): all the resources are stopped on the primary and started on the secondary.

                For MySQL I put this on the primary server

                Code:
                auto_increment_increment                                = 2
                auto_increment_offset                                   = 1
                and on the secondary server

                Code:
                auto_increment_increment                                = 2
                auto_increment_offset                                   = 22
                It's strange, I remember putting 2 instead of 22.
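A minimal sketch of what these two settings are meant to achieve, assuming the usual rule that AUTO_INCREMENT values follow offset + N * increment. The helper name `auto_increment_values` is made up for illustration, and the secondary offset of 2 is the value themons says he intended. Two caveats: the MySQL manual states that an auto_increment_offset larger than auto_increment_increment is ignored, so 22 with an increment of 2 would not act as a distinct offset, and these settings only affect columns declared AUTO_INCREMENT.

```python
from itertools import count, islice


def auto_increment_values(offset, increment, n=5):
    """Return the first n values of the series offset, offset+increment, ..."""
    return list(islice(count(offset, increment), n))


# Primary: increment 2, offset 1. Secondary: increment 2, offset 2 (intended value).
primary = auto_increment_values(offset=1, increment=2)
secondary = auto_increment_values(offset=2, increment=2)

print(primary)    # [1, 3, 5, 7, 9]
print(secondary)  # [2, 4, 6, 8, 10]

# The two servers never hand out the same AUTO_INCREMENT value.
overlap = set(primary) & set(secondary)
print(overlap)    # set()
```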

                • qix
                  Senior Member
                  Zabbix Certified Specialist, Zabbix Certified Professional
                  • Oct 2006
                  • 423

                  #23
                  Originally posted by themons
                  It's strange, I remember putting 2 instead of 22.
                  Did you use vi/vim?

                  Code:
                  <2><i><2><esc>
                  Anyway, I no longer have access to the MySQL master-master setup, so I can't find out what the exact settings were on my end.
                  In addition to the auto-increment settings, I believe you also need a server-id parameter.

                  Is replication running OK, or is it hanging on one of the servers?
                  See http://dev.mysql.com/doc/refman/5.0/...ve-status.html
                  It should list:
                  Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes

                  • themons
                    Senior Member
                    • Feb 2005
                    • 110

                    #24
                    Now I remember: I already had this problem before and I tried changing the value from 2 to 22, but, like today, the only way to correct the issue was to fail back to the primary node.

                    Replication is running fine. On both nodes I have:

                    Code:
                    Slave_IO_Running: Yes
                    Slave_SQL_Running: Yes
                    and the server_id values are set correctly.

                    Every night a script stops replication on the secondary node and does a full backup of the Zabbix DB.

                    There is just one strange thing. The error message is

                    Code:
                    [Z3005] Query failed: [1062] Duplicate entry '6414730' for key 1 [insert into events (eventid,source,object,objectid,clock,value) values (6414730,2,3,7,1346140188,1)]
                    I am not really good at MySQL, but if a table has an auto-increment column, you don't have to include it in the insert statement. Am I right?

                    But I think I found something.

                    The events table is not auto_increment:

                    Code:
                    CREATE TABLE `events` (
                    	`eventid`                bigint unsigned                           NOT NULL,
                    	`source`                 integer         DEFAULT '0'               NOT NULL,
                    	`object`                 integer         DEFAULT '0'               NOT NULL,
                    	`objectid`               bigint unsigned DEFAULT '0'               NOT NULL,
                    	`clock`                  integer         DEFAULT '0'               NOT NULL,
                    	`value`                  integer         DEFAULT '0'               NOT NULL,
                    	`acknowledged`           integer         DEFAULT '0'               NOT NULL,
                    	`ns`                     integer         DEFAULT '0'               NOT NULL,
                    	`value_changed`          integer         DEFAULT '0'               NOT NULL,
                    	PRIMARY KEY (eventid)
                    ) ENGINE=InnoDB;
                    So somewhere in the server source code there must be something that defines the value of new eventids. There is a table called ids. Here I compare the results of the following queries on SERVER1 and SERVER2 (run at the same time):

                    SERVER1

                    Code:
                    mysql> select * from ids where table_name="events";
                    +--------+------------+------------+---------+
                    | nodeid | table_name | field_name | nextid  |
                    +--------+------------+------------+---------+
                    |      0 | events     | eventid    | 6515336 |
                    +--------+------------+------------+---------+
                    1 row in set (0.00 sec)
                    
                    mysql> select max(eventid) from events;
                    +--------------+
                    | max(eventid) |
                    +--------------+
                    |      6515237 |
                    +--------------+
                    1 row in set (0.00 sec)
                    
                    mysql> select count(eventid) from events;
                    +----------------+
                    | count(eventid) |
                    +----------------+
                    |        5636757 |
                    +----------------+
                    1 row in set (1.12 sec)

                    SERVER2

                    Code:
                    mysql> select * from ids where table_name="events";
                    +--------+------------+------------+---------+
                    | nodeid | table_name | field_name | nextid  |
                    +--------+------------+------------+---------+
                    |      0 | events     | eventid    | 6483817 |
                    +--------+------------+------------+---------+
                    1 row in set (0.02 sec)
                    
                    mysql> select max(eventid) from events;
                    +--------------+
                    | max(eventid) |
                    +--------------+
                    |      6515237 |
                    +--------------+
                    1 row in set (0.00 sec)
                    
                    mysql> select count(eventid) from events;
                    +----------------+
                    | count(eventid) |
                    +----------------+
                    |        5636757 |
                    +----------------+
                    1 row in set (7.18 sec)

                    It looks like the ids table is not synced, and I don't know why, since the other tables look to be synced correctly.
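To illustrate why a stale ids row would produce exactly the "Duplicate entry" errors shown earlier, here is a small simulation that uses an in-memory SQLite database as a stand-in for MySQL. It assumes that new eventids are taken from ids.nextid by a simple counter; `allocate_id` is a hypothetical helper, not Zabbix code, and the existing row 6483818 is an assumed value chosen to sit just above SERVER2's nextid.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL schema, reduced to the relevant columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (eventid INTEGER PRIMARY KEY);
    CREATE TABLE ids (table_name TEXT, field_name TEXT, nextid INTEGER);
""")

# SERVER2 figures from the post: nextid is far below max(eventid).
conn.execute("INSERT INTO ids VALUES ('events', 'eventid', 6483817)")
conn.execute("INSERT INTO events VALUES (6515237)")  # max(eventid) on both servers
conn.execute("INSERT INTO events VALUES (6483818)")  # assumed: an older event holds this id


def allocate_id(conn, table, field):
    """Hypothetical allocator: bump the counter, then return the new value."""
    conn.execute(
        "UPDATE ids SET nextid = nextid + 1 WHERE table_name = ? AND field_name = ?",
        (table, field),
    )
    row = conn.execute(
        "SELECT nextid FROM ids WHERE table_name = ? AND field_name = ?",
        (table, field),
    ).fetchone()
    return row[0]


new_id = allocate_id(conn, "events", "eventid")
try:
    conn.execute("INSERT INTO events VALUES (?)", (new_id,))
    collided = False
except sqlite3.IntegrityError:
    collided = True  # the SQLite counterpart of MySQL error 1062

print(new_id, collided)  # 6483818 True
```

If the counter is behind max(eventid), every id it hands out may already be taken, which matches the run of consecutive duplicate eventids in the log.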

                    Any ideas?

                    • qix
                      Senior Member
                      Zabbix Certified Specialist, Zabbix Certified Professional
                      • Oct 2006
                      • 423

                      #25
                      Can you post output for 'SHOW SLAVE STATUS' for both servers?

                      • themons
                        Senior Member
                        • Feb 2005
                        • 110

                        #26
                        You're right

                        Server1

                        Code:
                        mysql> show slave status \G
                        *************************** 1. row ***************************
                                     Slave_IO_State: Waiting for master to send event
                                        Master_Host: xxx.xxx.xxx.112
                                        Master_User: replic_slave
                                        Master_Port: 3306
                                      Connect_Retry: 60
                                    Master_Log_File: mysql-bin.001596
                                Read_Master_Log_Pos: 373306530
                                     Relay_Log_File: mysqld-relay-bin.032716
                                      Relay_Log_Pos: 235
                              Relay_Master_Log_File: mysql-bin.001596
                                   Slave_IO_Running: Yes
                                  Slave_SQL_Running: Yes
                                    Replicate_Do_DB:
                                Replicate_Ignore_DB:
                                 Replicate_Do_Table:
                             Replicate_Ignore_Table:
                            Replicate_Wild_Do_Table:
                        Replicate_Wild_Ignore_Table:
                                         Last_Errno: 0
                                         Last_Error:
                                       Skip_Counter: 0
                                Exec_Master_Log_Pos: 373306530
                                    Relay_Log_Space: 235
                                    Until_Condition: None
                                     Until_Log_File:
                                      Until_Log_Pos: 0
                                 Master_SSL_Allowed: No
                                 Master_SSL_CA_File:
                                 Master_SSL_CA_Path:
                                    Master_SSL_Cert:
                                  Master_SSL_Cipher:
                                     Master_SSL_Key:
                              Seconds_Behind_Master: 0
                        1 row in set (0.00 sec)
                        Server2

                        Code:
                        mysql> show slave status \G
                        *************************** 1. row ***************************
                                     Slave_IO_State: Waiting for master to send event
                                        Master_Host: xxx.xxx.xxx.111
                                        Master_User: replic_slave
                                        Master_Port: 3306
                                      Connect_Retry: 60
                                    Master_Log_File: mysql-bin.000005
                                Read_Master_Log_Pos: 241314594
                                     Relay_Log_File: mysqld-relay-bin.000014
                                      Relay_Log_Pos: 241314731
                              Relay_Master_Log_File: mysql-bin.000005
                                   Slave_IO_Running: Yes
                                  Slave_SQL_Running: Yes
                                    Replicate_Do_DB: zabbix
                                Replicate_Ignore_DB:
                                 Replicate_Do_Table:
                             Replicate_Ignore_Table:
                            Replicate_Wild_Do_Table: zabbix.%
                        Replicate_Wild_Ignore_Table:
                                         Last_Errno: 0
                                         Last_Error:
                                       Skip_Counter: 0
                                Exec_Master_Log_Pos: 241314594
                                    Relay_Log_Space: 241314731
                                    Until_Condition: None
                                     Until_Log_File:
                                      Until_Log_Pos: 0
                                 Master_SSL_Allowed: No
                                 Master_SSL_CA_File:
                                 Master_SSL_CA_Path:
                                    Master_SSL_Cert:
                                  Master_SSL_Cipher:
                                     Master_SSL_Key:
                              Seconds_Behind_Master: 0
                        1 row in set (0.00 sec)
                        OK, there is a problem: Replicate_Do_DB and Replicate_Wild_Do_Table are empty on Server1. But as my issue is with replication from SRV1 to SRV2, I don't really understand...

                        Anyway the master configuration looks good.

                        SERVER1

                        Code:
                        mysql> show master status;
                        +------------------+-----------+--------------+------------------+
                        | File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB |
                        +------------------+-----------+--------------+------------------+
                        | mysql-bin.000005 | 259933286 | zabbix       | mysql,test       |
                        +------------------+-----------+--------------+------------------+
                        1 row in set (0.00 sec)
                        SERVER2

                        Code:
                        mysql> show master status;
                        +------------------+-----------+--------------+------------------+
                        | File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB |
                        +------------------+-----------+--------------+------------------+
                        | mysql-bin.001596 | 373306530 | zabbix       | mysql,test       |
                        +------------------+-----------+--------------+------------------+
                        1 row in set (0.00 sec)
                        One other thing.

                        If I now recheck the ids table:

                        SERVER1

                        Code:
                        mysql> select * from ids where table_name="events";
                        +--------+------------+------------+---------+
                        | nodeid | table_name | field_name | nextid  |
                        +--------+------------+------------+---------+
                        |      0 | events     | eventid    | 6520456 |
                        +--------+------------+------------+---------+
                        1 row in set (0.00 sec)
                        SERVER2

                        Code:
                        mysql> select * from ids where table_name="events";
                        +--------+------------+------------+---------+
                        | nodeid | table_name | field_name | nextid  |
                        +--------+------------+------------+---------+
                        |      0 | events     | eventid    | 6488937 |
                        +--------+------------+------------+---------+
                        1 row in set (0.00 sec)
                        The value of nextid has been updated on both servers. Is it possible that this value is updated by a statement like nextid+1?
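The four nextid values posted in this thread can be checked with a little arithmetic. The sketch below only restates the posted numbers; the interpretation in the comments is a hedged reading, assuming the update is replicated as a statement rather than as a final value.

```python
# nextid for table "events" as posted: first check, then the later recheck.
first = {"SERVER1": 6515336, "SERVER2": 6483817}
second = {"SERVER1": 6520456, "SERVER2": 6488937}

# How far each server's counter moved between the two checks.
advance = {server: second[server] - first[server] for server in first}

# Distance between the two servers' counters at each check.
gap_before = first["SERVER1"] - first["SERVER2"]
gap_after = second["SERVER1"] - second["SERVER2"]

print(advance)                # {'SERVER1': 5120, 'SERVER2': 5120}
print(gap_before, gap_after)  # 31519 31519

# Both counters advanced by the same amount and the gap stayed constant.
# That is what a relative update (nextid = nextid + 1) applied on both sides
# would produce; an absolute write would have made the values converge.
```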

                        • qix
                          Senior Member
                          Zabbix Certified Specialist, Zabbix Certified Professional
                          • Oct 2006
                          • 423

                          #27
                          Hi,

                          You need to recheck all the settings you made with regard to the synced tables and databases.

                          Your server 1 master process is reporting 259933286 as the position and your server 2 slave process is stuck at position 241314594. My guess is that it never did replicate properly, as there is a huge difference there.
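The comparison above can be written down as a small check. This is only a sketch: `parse_vertical` and `bytes_behind` are illustrative helpers, the input is a trimmed copy of the SERVER2 output posted earlier, and the result is only meaningful if the master and slave figures were sampled at about the same moment.

```python
def parse_vertical(text):
    """Parse 'Key: value' lines of vertical-format mysql output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def bytes_behind(master_file, master_pos, slave):
    """Binlog bytes the slave has not executed yet, or None if files differ."""
    if slave["Relay_Master_Log_File"] != master_file:
        return None  # positions in different binlog files are not comparable
    return master_pos - int(slave["Exec_Master_Log_Pos"])


# Trimmed from the SERVER2 "show slave status" output in this thread.
slave_output = """
        Master_Log_File: mysql-bin.000005
    Read_Master_Log_Pos: 241314594
  Relay_Master_Log_File: mysql-bin.000005
    Exec_Master_Log_Pos: 241314594
"""
# From the SERVER1 "show master status" output in this thread.
master_file, master_pos = "mysql-bin.000005", 259933286

slave = parse_vertical(slave_output)
print(bytes_behind(master_file, master_pos, slave))  # 18618692
```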

                          Have you ever tried to restore a sqldump backup from server 2?

                          Anyway, I'd say:
                          1. find out what parameters are wrong
                          2. stop zabbix
                          3. stop replication to both servers
                          4. fix settings
                          5. dump sqldb on server 1
                          6. import sqldb on server 2
                          7. resync and start the replication


                          This might take a while though.

                          • themons
                            Senior Member
                            • Feb 2005
                            • 110

                            #28
                            If I run both at the same time, it looks better:

                            Master:
                            Code:
                            mysql> show master status;
                            +------------------+-----------+--------------+------------------+
                            | File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB |
                            +------------------+-----------+--------------+------------------+
                            | mysql-bin.000005 | 534112070 | zabbix       | mysql,test       |
                            +------------------+-----------+--------------+------------------+
                            1 row in set (0.00 sec)
                            Slave:
                            Code:
                            mysql> show slave status\G
                            *************************** 1. row ***************************
                                         Slave_IO_State: Waiting for master to send event
                                            Master_Host: 172.20.18.111
                                            Master_User: replic_slave
                                            Master_Port: 3306
                                          Connect_Retry: 60
                                        Master_Log_File: mysql-bin.000005
                                    Read_Master_Log_Pos: 534112070
                                         Relay_Log_File: mysqld-relay-bin.000014
                                          Relay_Log_Pos: 534112207
                                  Relay_Master_Log_File: mysql-bin.000005
                                       Slave_IO_Running: Yes
                                      Slave_SQL_Running: Yes
                                        Replicate_Do_DB: zabbix
                                    Replicate_Ignore_DB:
                                     Replicate_Do_Table:
                                 Replicate_Ignore_Table:
                                Replicate_Wild_Do_Table: zabbix.%
                            Replicate_Wild_Ignore_Table:
                                             Last_Errno: 0
                                             Last_Error:
                                           Skip_Counter: 0
                                    Exec_Master_Log_Pos: 534112070
                                        Relay_Log_Space: 534112207
                                        Until_Condition: None
                                         Until_Log_File:
                                          Until_Log_Pos: 0
                                     Master_SSL_Allowed: No
                                     Master_SSL_CA_File:
                                     Master_SSL_CA_Path:
                                        Master_SSL_Cert:
                                      Master_SSL_Cipher:
                                         Master_SSL_Key:
                                  Seconds_Behind_Master: 0
                            1 row in set (0.00 sec)
                            I searched the source of my version of Zabbix (1.8.3).

                            Each time a reference is made to the zabbix.ids table, you have something like this:

                            Code:
                            frontends/php/include/db.inc.php
                            
                            $sql = 'UPDATE ids SET nextid=nextid+1 WHERE nodeid='.$nodeid.' AND table_name='.zbx_dbstr($table).' AND field_name='.zbx_dbstr($field);
                            or

                            Code:
                            src/libs/zbxdbhigh/db.c
                            
                            DBexecute("update ids set nextid=nextid+1 where nodeid=%d and table_name='%s'"
                                                                            " and field_name='%s'",
                                                                            nodeid,
                                                                            table->table,
                                                                            table->recid);
                            So I assume the SQL statement is nextid+1. I will try to update the value and have a look.
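One way the planned update could look, sketched against an in-memory SQLite stand-in rather than a live Zabbix database. The thread does not confirm that this resolves the problem, so treat it as an untested assumption: `resync_events_nextid` is a hypothetical helper, the figures are SERVER2's values posted earlier, and on a real system one would presumably stop zabbix_server first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (eventid INTEGER PRIMARY KEY);
    CREATE TABLE ids (nodeid INTEGER, table_name TEXT, field_name TEXT, nextid INTEGER);
    INSERT INTO events VALUES (6515237);                        -- max(eventid)
    INSERT INTO ids VALUES (0, 'events', 'eventid', 6483817);   -- stale counter
""")

ID_ROW = "table_name = 'events' AND field_name = 'eventid'"


def resync_events_nextid(conn):
    """Raise ids.nextid to max(eventid) if it has fallen behind (never lowers it)."""
    (max_id,) = conn.execute("SELECT COALESCE(MAX(eventid), 0) FROM events").fetchone()
    conn.execute(
        f"UPDATE ids SET nextid = ? WHERE {ID_ROW} AND nextid < ?",
        (max_id, max_id),
    )
    (nextid,) = conn.execute(f"SELECT nextid FROM ids WHERE {ID_ROW}").fetchone()
    return nextid


fixed = resync_events_nextid(conn)

# Same counter bump as the statements found in the Zabbix 1.8.3 source.
conn.execute(f"UPDATE ids SET nextid = nextid + 1 WHERE {ID_ROW}")
(new_id,) = conn.execute(f"SELECT nextid FROM ids WHERE {ID_ROW}").fetchone()
conn.execute("INSERT INTO events VALUES (?)", (new_id,))  # no duplicate-key error now

print(fixed, new_id)  # 6515237 6515238
```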

                            I cannot do it right now, but I will keep you informed.

                            qix: big thanks to you for the time you have taken to look at this.

                            • qix
                              Senior Member
                              Zabbix Certified Specialist, Zabbix Certified Professional
                              • Oct 2006
                              • 423

                              #29
                              You're welcome and good luck!