**qix** · 30-08-2007, 13:24

*bump*

Is there anybody who can give me a few pointers?
Alexei, can you tell me what you would recommend?

**qix** · 16-10-2007, 10:15

*bump*

I really need some input from somebody that has done something like this before.

Thanks in advance,

**just2blue4u** · 17-10-2007, 11:20

please see my post http://www.zabbix.com/forum/showpost...13&postcount=9
in this thread: http://www.zabbix.com/forum/showthread.php?t=4104

My model shows 2 independent Zabbix instances. I didn't test it, but maybe this also works when combined with the master/slave node model?

**NOB** · 19-10-2007, 09:02

Hi,

we have the same scenario (more or less), i.e. two Datacenters (or more).
With more datacenters you have to distribute the servers more intelligently over the datacenters to cover for HW problems of one server.

One of our main concerns is not just a HW problem with one
of the servers but a complete loss of a data center, too.
This has happened in the past and we want to have a monitoring solution
covering both cases.

So my proposal is to use the following:

Build up one active ZABBIX server per datacenter with a passive ZABBIX server in the other datacenter.
Use a virtual IP address which will get switched (either automatically
or manually) in case of a HW problem, and MySQL replication between these
two servers. The ZABBIX server processes are switched, too, if necessary.
All these ZABBIX servers are completely installed, i.e. with the Web frontend.
Instead of DB replication one could use mirrored SAN connections
as well, but this is expensive and the availability as well as the performance might be lower than using local (mirrored/striped) disks.
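Before switching the virtual IP to the passive server, it helps to confirm that its MySQL replica is actually current. A minimal sketch of such a check follows, assuming the caller has already fetched one row of `SHOW SLAVE STATUS` into a dictionary. The function name `replica_is_healthy` and the 30-second lag threshold are illustrative assumptions, not part of ZABBIX or of the setup described above.

```python
from typing import Mapping, Optional


def replica_is_healthy(
    status: Mapping[str, Optional[str]],
    max_lag_seconds: int = 30,  # assumed threshold, tune for your environment
) -> bool:
    """Return True if a MySQL replica looks safe to promote.

    `status` is one row of SHOW SLAVE STATUS as a column -> value mapping.
    """
    # Both replication threads must be running.
    if status.get("Slave_IO_Running") != "Yes":
        return False
    if status.get("Slave_SQL_Running") != "Yes":
        return False
    # MySQL reports NULL here when the lag cannot be determined.
    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        return False
    return int(lag) <= max_lag_seconds


# Example rows as they might be read from the passive server.
healthy = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes",
           "Seconds_Behind_Master": "2"}
lagging = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes",
           "Seconds_Behind_Master": "120"}
print(replica_is_healthy(healthy), replica_is_healthy(lagging))
```

A manual or scripted switchover could refuse to move the virtual IP while this check returns `False`.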

So, that means four servers will cover both the loss of a datacenter and a HW problem in one of the servers.

For the convenience of our operations people (one view for all monitored servers/services) we are thinking about adding one global ZABBIX master
server which will gather the data from the per-datacenter ZABBIX servers.
Of course, for redundancy we need two of those as well, but this
is only to cover HW problems.
If one datacenter is lost, the operations people just use the frontend
in the other (remaining) datacenter or use the virtual IP until
the first datacenter is back up again.

As always, the tricky part is to get the switching of the application and virtual IP address right.
I've seen several cases where either both servers were trying to be active,
or the first server went down and a short time later the other
went down, too ...
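One common way to reduce the "both servers active" case is to promote the passive server only after several consecutive missed heartbeats, and only if an independent witness host also cannot reach the active server. The sketch below illustrates that decision logic only. The class `PassiveNode`, the witness host and the threshold of three misses are assumptions for illustration and are not taken from the setup described above.

```python
from dataclasses import dataclass


@dataclass
class PassiveNode:
    """Decision logic for a passive server watching its active peer."""
    misses_needed: int = 3   # assumed threshold of consecutive misses
    missed: int = 0
    active: bool = False

    def observe(self, peer_heartbeat_ok: bool, witness_sees_peer: bool) -> bool:
        """Record one heartbeat interval and return True if this node is active."""
        if self.active:
            # Stepping down again is left to the operator in this sketch.
            return True
        if peer_heartbeat_ok:
            self.missed = 0
            return False
        self.missed += 1
        # Promote only if the peer has been silent long enough AND the
        # witness agrees it is down (not just our own link that failed).
        if self.missed >= self.misses_needed and not witness_sees_peer:
            self.active = True
        return self.active


# Peer really down: promotion happens on the third consecutive miss.
node = PassiveNode()
results = [node.observe(False, False) for _ in range(3)]

# Only the link between the two servers is broken: never promote.
isolated = PassiveNode()
isolated_results = [isolated.observe(False, True) for _ in range(5)]
print(results, isolated_results)
```

The witness requirement is what separates a dead peer from a broken link between the two datacenters. Without it, a network partition alone would leave both servers active.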

This should work, I hope.

What is your opinion?

I know my proposal is not complete. Of course, you want to define triggers covering systems in both data centers - like distributed clusters.
This has to be done on one ZABBIX server getting the data from servers
in both datacenters. For this purpose even more ZABBIX servers (active
and passive ones) are required. But those do not necessarily need a
complete installation with Apache, PHP and all that stuff for the frontend.

In addition, if you want to use distributed monitoring inside the datacenters
to cover several customers with their own networks, not directly reachable from
the central ZABBIX servers, it gets even more complicated.

But proposing a solution for that is our job, isn't it!
ZABBIX 1.6 (one GUI for all servers, latest data included)
will help solve this, I hope.

Regards

Norbert.

**Alexei** · 27-10-2007, 22:24

I am not sure that anything has to be done by ZABBIX software itself. All this can be achieved by database replication, virtual IPs and/or a cluster solution for switchover and high availability.

**NOB** · 01-11-2007, 08:54

Originally posted by Alexei

> I am not sure that anything has to be done by ZABBIX software itself. All this can be achieved by database replication, virtual IPs and (or) using cluster solution for switch over and high availability.

Yes, you are right.

Except for the single point mentioned:

Having all the latest data in the central, global ZABBIX server, to allow just one frontend
for all servers!

And, what would be a big plus:

Remove the requirement to have an Apache / PHP frontend on every server just to configure the master/slave relationship. This can be done by using
a script which does the same as the frontend would do.

Regards,

Norbert.

**Alexei** · 01-11-2007, 10:22

Originally posted by NOB

> Yes, you are right.
>
> Except for the single point mentioned:
>
> Having all the latest data in the central, global ZABBIX server, to allow just one frontend
> for all servers!
>
> And, what would be a big plus:
>
> Remove the requirement to have an Apache / PHP frontend on every server just to configure the master/slave relationship. This can be done by using
> a script which does the same as the frontend would do.
>
> Regards,
>
> Norbert.

The central ZABBIX server has all data available. It is ONE frontend for all servers.

Currently the GUI is required only for the initial configuration of nodes. It is quite straightforward to automate the installation of nodes without the GUI as well. It just requires populating the 'nodes' table, nothing else.

ZABBIX 1.6 will support GUI-less installation (autoregistration?) of nodes. There are already many significant improvements made in the latest code related to DM.
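To illustrate the idea of populating the 'nodes' table from a script instead of the GUI, here is a rough sketch. It uses an in-memory SQLite database and a simplified, hypothetical `nodes` table. The column names (`nodeid`, `name`, `ip`, `port`, `masterid`), the helper `register_node` and the example addresses are assumptions for illustration and should not be taken as the real ZABBIX schema, so check the schema of your ZABBIX version before adapting it.

```python
import sqlite3
from typing import Optional

# HYPOTHETICAL stand-in for the 'nodes' table. The real ZABBIX table
# may have different and additional columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE nodes (
        nodeid   INTEGER PRIMARY KEY,
        name     TEXT    NOT NULL,
        ip       TEXT    NOT NULL,
        port     INTEGER NOT NULL,
        masterid INTEGER            -- NULL for the top-level node
    )
    """
)


def register_node(conn: sqlite3.Connection, nodeid: int, name: str,
                  ip: str, port: int, masterid: Optional[int] = None) -> None:
    """Insert one node row; the transaction commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO nodes (nodeid, name, ip, port, masterid) "
            "VALUES (?, ?, ?, ?, ?)",
            (nodeid, name, ip, port, masterid),
        )


# Example: a master node and one child node reporting to it.
register_node(conn, 1, "primary", "192.0.2.10", 10051)
register_node(conn, 2, "secondary", "198.51.100.10", 10051, masterid=1)

for row in conn.execute("SELECT nodeid, name, masterid FROM nodes ORDER BY nodeid"):
    print(row)
```

Against a real installation the same insert would go through a MySQL client connection to the ZABBIX database rather than SQLite.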

**qix** · 07-11-2007, 13:34

Thanks for the replies, all.
I'm afraid I cannot use virtual IPs because the subnets at each location are different.

What I have conceived is the following setup (see attached picture).
The Primary server is node 1, the secondary server is node 2. So this is a distributed setup.
I will use a third server to which my databases are replicated (master-slave).

This will allow me to make backups of our (large) databases (+40GB / 2.5GB compressed) without database locks on the zabbix servers.

Secondly, this also allows me to schedule detailed reports generated from the zabbix database without performance loss on the monitoring servers.

Thirdly, when the s*** hits the fan, I can always use the reporting server as a spare zabbix server in case of failures, the database is already there, so it should be easy to get it up and running.

So failover isn't automatic, but that doesn't really matter at this point.
If the need arises, maybe we will go to 4 servers so there is a spare zabbix server on each site; then automatic failover could be achieved. (I'm thinking VMware here.)

I'll try to keep you posted on how things are doing when I'm finished.

Attached Files

Distributed monitoring and failover