Distributed Monitoring

  • knarfling
    Member
    • Sep 2006
    • 47

    #1

    Distributed Monitoring

    Okay, I admit it. I am lost. I cannot seem to get Distributed monitoring working. I seem to have the same symptoms as others, but the fixes do not work for me.

    On the master node I show

    NODES
    Id Name Type Time zone IP:Port
    1 /Scottsdale Local GMT+00:00 yyy.yyy.183.28:10051
    2 /Scottsdale/Phoenix Remote GMT+00:00 yy.yyy.172.50:10051

    On the child node I show only

    NODES
    Id Name Type Time zone IP:Port
    2 /Scottsdale/Phoenix Local GMT+00:00 yyy.yyy.172.50:10051

    From MySQL on the Master node "select * from nodes;" looks like
    +--------+------------+----------+----------------+-------+---------------+--------------+--------------+----------------+--------------------+---------------------+----------+----------+
    | nodeid | name       | timezone | ip             | port  | slave_history | slave_trends | event_lastid | history_lastid | history_str_lastid | history_uint_lastid | nodetype | masterid |
    +--------+------------+----------+----------------+-------+---------------+--------------+--------------+----------------+--------------------+---------------------+----------+----------+
    |      1 | Scottsdale |        0 | yyy.yyy.183.28 | 10051 |            30 |          365 |            0 |              0 |                  0 |                   0 |        1 |        0 |
    |      2 | Phoenix    |        0 | yyy.yyy.172.50 | 10051 |            90 |          365 |            0 |              0 |                  0 |                   0 |        0 |        1 |
    +--------+------------+----------+----------------+-------+---------------+--------------+--------------+----------------+--------------------+---------------------+----------+----------+
    2 rows in set (0.00 sec)

    On the Child node it looks like:

    +--------+------------+----------+----------------+-------+---------------+--------------+--------------+----------------+--------------------+---------------------+----------+----------+
    | nodeid | name       | timezone | ip             | port  | slave_history | slave_trends | event_lastid | history_lastid | history_str_lastid | history_uint_lastid | nodetype | masterid |
    +--------+------------+----------+----------------+-------+---------------+--------------+--------------+----------------+--------------------+---------------------+----------+----------+
    |      1 | Scottsdale |        0 | yyy.yyy.183.28 | 10051 |            90 |          365 |            0 |              0 |                  0 |                   0 |        0 |        0 |
    |      2 | Phoenix    |        0 | yyy.yyy.172.50 | 10051 |            30 |          365 |            0 |              0 |                  0 |                   0 |        1 |        1 |
    +--------+------------+----------+----------------+-------+---------------+--------------+--------------+----------------+--------------------+---------------------+----------+----------+
    2 rows in set (0.00 sec)

    My userid query on the master looks like:
    +-----------------+
    | userid          |
    +-----------------+
    | 100000000000001 |
    | 100000000000002 |
    | 100000000000003 |
    | 100000000000004 |
    +-----------------+
    4 rows in set (0.00 sec)

    While on the child it looks like:
    +-----------------+
    | userid          |
    +-----------------+
    | 200000000000001 |
    | 200000000000002 |
    | 200000000000003 |
    +-----------------+
    3 rows in set (0.00 sec)
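
The userid values above suggest that each node prefixes its IDs with its own node ID: the master (node 1) starts at 100000000000001 and the child (node 2) at 200000000000001. Here is a minimal sketch of that reading, assuming the node ID is simply the ID divided by 10^14. That layout is inferred from the output above and is not confirmed anywhere in this thread; `NODE_FACTOR`, `node_of` and `local_part` are hypothetical names.

```python
# Assumption: IDs are laid out as node_id * 10**14 + local_id,
# inferred only from the userid values shown in this thread.
NODE_FACTOR = 10 ** 14


def node_of(record_id: int) -> int:
    """Return the node ID that the leading digits of record_id suggest."""
    return record_id // NODE_FACTOR


def local_part(record_id: int) -> int:
    """Return the per-node part of the ID."""
    return record_id % NODE_FACTOR


master_ids = [100000000000001, 100000000000002, 100000000000003, 100000000000004]
child_ids = [200000000000001, 200000000000002, 200000000000003]

print({node_of(i) for i in master_ids})  # node IDs seen on the master
print({node_of(i) for i in child_ids})   # node IDs seen on the child
```

If that reading holds, the master only holds users created on node 1 and the child only users created on node 2, which would fit with nothing having been synchronized yet.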

    On the Child node the log seems to show that it is sending data to the master node.

    20737:20070912:215616 NODE 2: Sending new history_str of node 2 to node 1 datalen 14157
    20737:20070912:215626 NODE 2: Sending new events of node 2 to node 1 datalen 40300
    20737:20070912:215627 NODE 2: Sending new history of node 2 to node 1 datalen 376589
    20737:20070912:215627 NODE 2: Sending new history_uint of node 2 to node 1 datalen 353832
    20737:20070912:215627 NODE 2: Sending new history_str of node 2 to node 1 datalen 14157
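
To compare what the child claims to send with what the master logs, the "Sending new ..." lines can be tallied per table. This is a rough sketch only: the regular expression is read off the lines quoted above and may not cover every message the server writes, and `SEND_RE` and `summarize` are hypothetical names.

```python
import re
from collections import defaultdict

# Pattern for the "Sending new ..." lines quoted above; the pid:date:time
# prefix layout is an assumption based on those lines alone.
SEND_RE = re.compile(
    r"^\s*(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6}) "
    r"NODE (?P<node>\d+): Sending new (?P<table>\w+) "
    r"of node (?P<src>\d+) to node (?P<dst>\d+) datalen (?P<datalen>\d+)$"
)


def summarize(lines):
    """Sum datalen per (source node, destination node, table)."""
    totals = defaultdict(int)
    for line in lines:
        match = SEND_RE.match(line.rstrip())
        if match:
            key = (int(match["src"]), int(match["dst"]), match["table"])
            totals[key] += int(match["datalen"])
    return dict(totals)


sample = [
    "20737:20070912:215616 NODE 2: Sending new history_str of node 2 to node 1 datalen 14157",
    "20737:20070912:215626 NODE 2: Sending new events of node 2 to node 1 datalen 40300",
    "26555:20070912:221119 server #17 started [HTTP Poller]",
]
print(summarize(sample))
```

Lines that do not match, such as the process start-up messages, are skipped.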


    However, on the master node there is no indication that info is either being sent or received.

    26553:20070912:221119 server #16 started [Node watcher. Node ID:1]
    26555:20070912:221119 server #17 started [HTTP Poller]
    26557:20070912:221119 server #18 started [HTTP Poller]
    26559:20070912:221119 server #19 started [HTTP Poller]
    26561:20070912:221119 server #20 started [HTTP Poller]
    26543:20070912:221119 server #14 started [Timer]
    26552:20070912:221119 server #15 started [Poller for unreachable hosts. SNMP:ON]
    26563:20070912:221119 server #21 started [HTTP Poller]
    26517:20070912:221119 server #0 started [Watchdog]
    26567:20070912:221119 server #22 started [Discoverer. SNMP:ON]
    26541:20070912:221122 Deleted 2051 records from history and trends

    I am using RHEL4 (32 bit) and MySQL. I downloaded the latest 1.4.3 from http://www.zabbix.com/developers.php, although running zabbix_server -V gives me
    ZABBIX Server (daemon) v1.4.3 (25 August 2007)
    Compilation time: Sep 12 2007 21:29:44

    Can anyone tell me what I might be doing wrong?

    Thanx
  • marc
    Senior Member
    • Oct 2004
    • 146

    #2
    Just to make sure: how have you checked the data? Latest values are only updated for remote hosts if you are using zabbix-1.4.3 (at the moment only a pre-release, available on the developers page). Older versions will only show actions and events being updated.


    • knarfling
      Member
      • Sep 2006
      • 47

      #3
      Not sure I understand

      I am not sure I understand the question. Are you asking how I have checked to see if data is being transferred?

      If so, it is very obvious that no data is transferred. When I log in to the master node and switch nodes with current node only, it tells me that no triggers are defined. In Administration there are no users defined and under the Configuration tab there are no hosts defined.

      Also, the logs show that node2 is trying to transfer info to the master node, but the log on the master node does not show any data transfer at all. There is no record of data received and no record of trying to transfer data to the child node.

      The main point, one that I learned from a previous post, is that under nodes, both the master and the child node should display two nodes. Each should display the local and the remote node, or DM will not work. As my first post shows, the master node displays both nodes, but the child node displays only itself, not the master node. It does show itself as a child of a master, but it has no line for the master node.
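
The rule described here (every server should list every node) can be written as a small check. This sketch only compares node IDs copied by hand from the NODES listings in the first post. It does not talk to Zabbix, and `missing_nodes` is a hypothetical helper.

```python
def missing_nodes(listings):
    """Map each server name to the node IDs it fails to list.

    listings: dict of server name -> set of node IDs shown under NODES.
    """
    all_ids = set().union(*listings.values())
    return {
        server: sorted(all_ids - shown)
        for server, shown in listings.items()
        if all_ids - shown
    }


# Node IDs taken from the NODES listings in the first post.
listings = {
    "master": {1, 2},
    "child": {2},
}
print(missing_nodes(listings))
```

With the listings from the first post, only the child is reported, and the node it lacks is node 1, the master.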

      In the previous post, there was some confusion as to which version was being run. The person reporting the trouble had 1.4.1 in /usr/local/bin/ and 1.4.2 in /usr/local/sbin/, and on one of the nodes he was running 1.4.1 instead of 1.4.2. To make sure I do not have the same trouble, I have deleted all zabbix binaries from /usr/local/bin and made sure that 1.4.3 (the build from Sept 11) is the only one being used.
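
A quick way to rule out the mixed-version problem is to look for the same binary name in more than one directory. This is a minimal sketch, assuming the binaries all start with `zabbix_` and live in the two directories named above. `find_binaries` and `duplicates` are hypothetical helpers, not part of Zabbix.

```python
import os
from collections import defaultdict


def find_binaries(directories, prefix="zabbix_"):
    """Map each file name starting with prefix to every path it is found at."""
    found = defaultdict(list)
    for directory in directories:
        if not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            if name.startswith(prefix) and os.path.isfile(path):
                found[name].append(path)
    return dict(found)


def duplicates(found):
    """Keep only the names that exist in more than one directory."""
    return {name: paths for name, paths in found.items() if len(paths) > 1}


if __name__ == "__main__":
    # The two directories mentioned in the post; adjust for your install.
    dirs = ["/usr/local/bin", "/usr/local/sbin"]
    for name, paths in duplicates(find_binaries(dirs)).items():
        print(name, "found in:", ", ".join(paths))
```

Any name it prints is worth checking by running each listed path with -V, as was done for zabbix_server in the first post.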
