Okay, I admit it: I am lost. I cannot get distributed monitoring working. I have the same symptoms as others, but the fixes that worked for them do not work for me.
On the master node I show
NODES
Id Name Type Time zone IP:Port
1 /Scottsdale Local GMT+00:00 yyy.yyy.183.28:10051
2 /Scottsdale/Phoenix Remote GMT+00:00 yy.yyy.172.50:10051
On the child node I show only
NODES
Id Name Type Time zone IP:Port
2 /Scottsdale/Phoenix Local GMT+00:00 yyy.yyy.172.50:10051
From MySQL on the Master node "select * from nodes;" looks like
+--------+------------+----------+----------------+-------+---------------+--------------+--------------+----------------+--------------------+---------------------+----------+----------+
| nodeid | name | timezone | ip | port | slave_history | slave_trends | event_lastid | history_lastid | history_str_lastid | history_uint_lastid | nodetype | masterid |
+--------+------------+----------+----------------+-------+---------------+--------------+--------------+----------------+--------------------+---------------------+----------+----------+
| 1 | Scottsdale | 0 | yyy.yyy.183.28 | 10051 | 30 | 365 | 0 | 0 | 0 | 0 | 1 | 0 |
| 2 | Phoenix | 0 | yyy.yyy.172.50 | 10051 | 90 | 365 | 0 | 0 | 0 | 0 | 0 | 1 |
+--------+------------+----------+----------------+-------+---------------+--------------+--------------+----------------+--------------------+---------------------+----------+----------+
2 rows in set (0.00 sec)
On the Child node it looks like:
+--------+------------+----------+----------------+-------+---------------+--------------+--------------+----------------+--------------------+---------------------+----------+----------+
| nodeid | name | timezone | ip | port | slave_history | slave_trends | event_lastid | history_lastid | history_str_lastid | history_uint_lastid | nodetype | masterid |
+--------+------------+----------+----------------+-------+---------------+--------------+--------------+----------------+--------------------+---------------------+----------+----------+
| 1 | Scottsdale | 0 | yyy.yyy.183.28 | 10051 | 90 | 365 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | Phoenix | 0 | yyy.yyy.172.50 | 10051 | 30 | 365 | 0 | 0 | 0 | 0 | 1 | 1 |
+--------+------------+----------+----------------+-------+---------------+--------------+--------------+----------------+--------------------+---------------------+----------+----------+
2 rows in set (0.00 sec)
My userid query on the master looks like:
+-----------------+
| userid |
+-----------------+
| 100000000000001 |
| 100000000000002 |
| 100000000000003 |
| 100000000000004 |
+-----------------+
4 rows in set (0.00 sec)
While on the child it looks like:
+-----------------+
| userid |
+-----------------+
| 200000000000001 |
| 200000000000002 |
| 200000000000003 |
+-----------------+
3 rows in set (0.00 sec)
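To make the ID prefixes above easier to read, here is a minimal Python sketch that splits such an ID into a node ID and a local ID. It assumes the IDs are built as nodeid * 10^14 + local id, which is what the 15-digit values shown here suggest; the factor and the function name `split_id` are illustrative assumptions, not taken from the Zabbix source.

```python
# Assumption: a node-prefixed ID is nodeid * 10**14 + local id,
# inferred from the 15-digit userid values shown above.
NODE_ID_FACTOR = 10 ** 14


def split_id(global_id):
    """Return (nodeid, local_id) for a node-prefixed ID (hypothetical helper)."""
    return divmod(global_id, NODE_ID_FACTOR)


if __name__ == "__main__":
    # userid values copied from the master and child query results
    for uid in (100000000000001, 100000000000004, 200000000000003):
        node, local = split_id(uid)
        print("userid %d: node %d, local id %d" % (uid, node, local))
```

Under that assumption, the master's rows all decode to node 1 and the child's to node 2, so each database appears to have been converted with its own node ID.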
On the Child node the log seems to show that it is sending data to the master node.
20737:20070912:215616 NODE 2: Sending new history_str of node 2 to node 1 datalen 14157
20737:20070912:215626 NODE 2: Sending new events of node 2 to node 1 datalen 40300
20737:20070912:215627 NODE 2: Sending new history of node 2 to node 1 datalen 376589
20737:20070912:215627 NODE 2: Sending new history_uint of node 2 to node 1 datalen 353832
20737:20070912:215627 NODE 2: Sending new history_str of node 2 to node 1 datalen 14157
However, on the master node there is no indication that info is either being sent or received.
26553:20070912:221119 server #16 started [Node watcher. Node ID:1]
26555:20070912:221119 server #17 started [HTTP Poller]
26557:20070912:221119 server #18 started [HTTP Poller]
26559:20070912:221119 server #19 started [HTTP Poller]
26561:20070912:221119 server #20 started [HTTP Poller]
26543:20070912:221119 server #14 started [Timer]
26552:20070912:221119 server #15 started [Poller for unreachable hosts. SNMP:ON]
26563:20070912:221119 server #21 started [HTTP Poller]
26517:20070912:221119 server #0 started [Watchdog]
26567:20070912:221119 server #22 started [Discoverer. SNMP:ON]
26541:20070912:221122 Deleted 2051 records from history and trends
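Since the child logs sends but the master logs nothing, one thing that may be worth ruling out is plain TCP reachability from the child to the master on port 10051. The sketch below is only a generic connection test in Python; the function name `can_connect` and the timeout are assumptions for illustration, and a successful connection says nothing about whether the master actually accepts the node data.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP connection
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and name resolution failures
        return False
```

Run on the child node, `can_connect("<master address>", 10051)` would show whether a firewall or a wrong address is blocking the connection before looking further at the node configuration.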
I am using RHEL4 (32-bit) and MySQL. I tried downloading the latest 1.4.3 from http://www.zabbix.com/developers.php, although running zabbix_server -V gives me
ZABBIX Server (daemon) v1.4.3 (25 August 2007)
Compilation time: Sep 12 2007 21:29:44
Can anyone tell me what I might be doing wrong?
Thanx