**ulistaerk** · 11-05-2011, 12:10

I've set up the distributed monitoring example:
- Master (1)
- Child (2)

Now assume Child (2) has crashed and no longer works. Master (1) still has ALL configuration and history data available. I've created a new Child (2) node and did the basic DM configuration:

Code:

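# Recreate an empty zabbix database on the new child
mysql -e 'drop database zabbix; create database zabbix;'
# Load the stock 1.8.5 schema and default data
cat ~/zabbix-1.8.5/create/schema/mysql.sql ~/zabbix-1.8.5/create/data/data.sql | mysql zabbix
# Convert the fresh database to node ID 2
zabbix_server --config /etc/zabbix/zabbix_server.conf --new-node 2
# Start the server
zabbix_server --config /etc/zabbix/zabbix_server.conf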

After reconnecting Child (2) I can see the data transfers:

Code:

 25756:20110511:114158.701 NODE 1: Received configuration changes from slave node 2 for node 2 datalen 4449009
 25756:20110511:114239.511 NODE 1: Sending configuration changes to slave node 2 for node 2 datalen 8

But the configuration of Child (2) was not restored correctly (as the small datalen in the log shows), and no error message appears, which is the nasty part. In fact, the synchronisation is completely broken and behaves unpredictably. If you start changing data, you risk damaging its consistency (in my case: duplicate or empty hostnames and lost host groups).
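
Since no error is logged, one rough way to notice this is to scan the server log for configuration transfers with an unusually small datalen. The following is only a sketch: the function name `check_datalen` and the byte threshold are hypothetical, and it assumes every relevant line has the same shape as the two log lines above, with the byte count as the last field.

```shell
# Hypothetical helper: reads a server log on stdin and prints every
# configuration-sync line whose datalen is below the given threshold.
# Assumes the line format shown above ("... datalen <bytes>" at the end).
check_datalen() {
    min_bytes="$1"
    awk -v min="$min_bytes" '
        /configuration changes/ && /datalen/ {
            len = $NF                      # last field is the byte count
            if (len + 0 < min + 0)
                print "SUSPICIOUS (" len " bytes): " $0
        }
    '
}
```

It could be called as `check_datalen 1024 < your_server_log`, using whatever file LogFile points to in zabbix_server.conf. A small datalen is not an error in itself (an idle node has little to send), so a hit only means the transfer is worth checking by hand.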

This raises doubts about whether the cluster can reorganize itself consistently after a serious failure.

I think in this case it should be possible to completely recover a child from the data on the master. There should be a template for setting up a plain node and a tool to completely resync a damaged node database (optionally including history/trends/events/...) before starting zabbix_server.

DM: Resync Node
