Patch - ServerSite parameter. First step to different distributed monitoring approach

  • Shmuma
    Member
    • Nov 2007
    • 49

    #1

    Hello,

    The attached patch adds a new host attribute called 'site'. It lets a zabbix server skip
    monitored hosts whose 'site' value doesn't match the value specified in zabbix_server.conf.

    Exact list of changes and their meaning:
    1. There is a new attribute of a monitored host called site. It is just a descriptive string naming the place (datacenter, room, organization, etc.) where the host is located.
    2. There is a new parameter in zabbix_server.conf called ServerSite.
    3. When the server performs operations on a host's data, it checks the host's site value. If the value of ServerSite matches the host's value, the data is processed. By data I mean events, items, triggers, ping requests, HTTP checks, etc.
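
    The matching rule from point 3 can be sketched in a few lines. This is only an illustration in Python, not the code from the patch (the patch itself is against the zabbix 1.4.4 sources). The names Host, hosts_for_server and the site values are made up for the example, and it assumes a plain string equality check.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    site: str  # hypothetical field mirroring the new 'site' host attribute


def hosts_for_server(hosts, server_site):
    """Return only the hosts whose site equals this server's ServerSite."""
    return [h for h in hosts if h.site == server_site]


# Example inventory; host names and site values are invented.
hosts = [
    Host("web1", "dc-amsterdam"),
    Host("web2", "dc-moscow"),
    Host("db1", "dc-amsterdam"),
]

# The second argument stands in for the ServerSite value
# that the server would read from zabbix_server.conf.
visible = hosts_for_server(hosts, "dc-amsterdam")
print([h.name for h in visible])
```

    A server configured with a different ServerSite value would get a different subset of the same host list, which is the whole point of the attribute.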

    Why all this is needed: while integrating zabbix into my company's infrastructure I ran into a problem. The distributed monitoring model implemented in zabbix is only one of two possible models. Personally, I call it 'down-top'. This means that the lower-level zabbix servers (the leaves of the tree) are administered independently and (possibly) by different people. The top server then aggregates all the data, and we get the whole picture.

    But this doesn't always work. When several datacenters are administered by the same people, this approach is of little use. Imagine the Google Mail admins. They have several datacenters, but the server roles and the checks performed are the same for all servers in every DC. The down-top scheme leads to additional work and a mess at the top level (with three DCs, we will see three copies of the same templates on the top zabbix server). A more usable scheme in that case is to have one central place where we configure hosts/items/triggers/events/etc., with this configuration replicated to the lower-level servers in DCs around the world. I call this approach 'top-down'.

    AFAIK, zabbix implements the 'down-top' approach but has nothing to solve 'top-down' problems. So, because I like zabbix and don't see a better system, I am making modifications to implement the top-down scheme.

    The main idea of the scheme I am trying to implement is simple:
    1. distributed monitoring is turned off on all zabbix servers,
    2. there is one top-level Oracle database which asynchronously replicates the tables with the system configuration (items, hosts, triggers, etc.) to lower-level MySQL databases, one per DC,
    3. each MySQL DB has a zabbix server which accepts data from agents and processes triggers,
    4. periodically, these lower-level servers send new data to the top-level Oracle DB.

    This patch solves a problem with such configuration replicas: we must have a way to tell each zabbix server which hosts it may process and which it may not.
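
    To make the data flow of the scheme easier to picture, here is a toy simulation in Python. It is a sketch under assumptions, not a description of real zabbix or database behaviour: the SiteServer class, the host names, the site values and the in-memory lists standing in for the top-level and per-DC databases are all hypothetical.

```python
import copy

# Central configuration kept in the top-level database (invented hosts and sites).
central_config = [
    {"host": "mail1", "site": "dc-a"},
    {"host": "mail2", "site": "dc-b"},
]


class SiteServer:
    """Stand-in for one lower-level server with its own database replica."""

    def __init__(self, server_site):
        self.server_site = server_site
        self.config = []   # replicated copy of the central configuration
        self.pending = []  # collected values not yet sent upward

    def replicate(self, config):
        # Step 2: every replica receives the full configuration.
        self.config = copy.deepcopy(config)

    def collect(self, host, value):
        # Step 3: accept data only for hosts that belong to this site.
        sites = {h["host"]: h["site"] for h in self.config}
        if sites.get(host) == self.server_site:
            self.pending.append((host, value))
            return True
        return False

    def flush(self, central_store):
        # Step 4: periodically push new data to the central database.
        central_store.extend(self.pending)
        sent = len(self.pending)
        self.pending = []
        return sent


central_store = []
dc_a = SiteServer("dc-a")
dc_a.replicate(central_config)
dc_a.collect("mail1", 0.42)  # accepted: mail1 is in dc-a
dc_a.collect("mail2", 0.13)  # skipped: mail2 belongs to dc-b
dc_a.flush(central_store)
print(central_store)
```

    Every replica holds the full host list, so without the site check in collect() each lower-level server would try to process hosts from every DC.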

    I attached this patch in two versions: as one big chunk against the 1.4.4 code, and as a Git patchset that makes the changes easier to follow (if someone is interested).

    P.S. Sorry for my English, it's terrible, I know.
    P.P.S. I don't expect this patch to become part of official zabbix (the method I used is far from ideal). I only want to draw attention to this problem and ease my conscience; maybe this will be useful for someone else.