**js1** · 04-08-2009, 07:43

Originally posted by walterheck

3) the zabbix binary. This one is a bit more tricky, and I was wondering how to best achieve redundancy? As far as I now see it, best is probably to install it on two different servers and then using something like keepalived to have automatic failover in case one of the servers dies.
I searched the forum and the wiki, but many solutions use either old (= for older zabbix releases) or unnecessarily complicated/slow technology (e.g. DRBD)

Has anybody done something like this?

You can always use heartbeat to manage the zabbix process. DRBD would only need to be used to sync the configs. A friend of mine uses DRBD on a file server that he manages. A zabbix config directory won't have that much i/o.

You're also going to need to share an IP address between the nodes that run the zabbix process.

**krimson** · 05-08-2009, 13:56

We use the RedHat clustersuite here. You should be able to do a similar thing with Fedora.

Of course, you will need shared storage. Also keep in mind that the zabbix server will terminate if the MySQL server becomes unreachable.

**nelsonab** · 06-08-2009, 10:03

Originally posted by walterheck

3) the zabbix binary. This one is a bit more tricky, and I was wondering how to best achieve redundancy? As far as I now see it, best is probably to install it on two different servers and then using something like keepalived to have automatic failover in case one of the servers dies.
I searched the forum and the wiki, but many solutions use either old (= for older zabbix releases) or unnecessarily complicated/slow technology (e.g. DRBD)

Has anybody done something like this?

The solution I posted to the wiki was done with an older version but will still work with the current version. Yes, DRBD can be slow initially, but it does quite well once it's in sync; if it goes out of sync, though, that can be a problem. The only reason I went with it was simplicity: I didn't want to rework the DB schema to have unique IDs for every row, where inserts on one host get odd IDs and inserts on the other get even ones. Yes, a normal setup has one DB as master, but I was looking at master-master replication to allow for complete failover.

As for the frontend, there's no real way around using a cluster management program such as Linux-HA, Veritas Cluster or something similar. If you convert to a fully (100%) active items setup, you might be able to get away with two active zabbix servers with a load balancer in front of them.

Good luck!

**walterheck** · 06-08-2009, 10:32

Hey guys,

thanks for the suggestions!

I was thinking a bit more about this, and thought that we could actually use puppet to keep the config files identical on both servers. If that is enough, it would be a good way to avoid extra technology, as implementing puppet was on our wishlist anyway.

Then, as long as a virtual IP is used for the server, it shouldn't matter which one is handling the reports from the agents, right?

Walter

**nelsonab** · 06-08-2009, 19:31

I retract my earlier comment about it working with a fully active setup, and reinstate what I've been saying all along. You can only have one Zabbix server running at a time.

If you run two servers and have a 100% pure active setup, the agents will connect to both servers listed in their config file. They will then push data to both servers, which in turn push data into the DB. Which DB? The same one, or different ones? If you have a master-master setup with the DB, you'll need to modify the Zabbix schema to add MySQL-generated IDs for tables that do not have them, such as history, history_str and so forth. One server would then be set up to write even-ID'd rows and the other odd. This is required so that MySQL knows which rows to propagate to the other server. Also, the behavior of the Zabbix agent may be strange if it's getting two lists of active checks; I don't know if that is even supported or behaves as one would expect. It's possible the agent would get a list of active checks from both servers and then send that doubled-up list back to both servers, generating 4 times the data in the DB and MUCH duplication.
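To illustrate the even/odd ID idea, here is a minimal sketch in Python. It only models the numbering scheme; the `id_stream` helper is hypothetical and not part of Zabbix or MySQL. MySQL offers the same mechanism through its `auto_increment_increment` and `auto_increment_offset` settings, but whether that alone is enough for the Zabbix schema is not something this sketch establishes.

```python
from itertools import islice


def id_stream(offset, increment=2):
    """Yield the IDs one DB node would hand out: offset, offset + increment, ...

    Hypothetical helper that mirrors the idea of giving each master its own
    offset while both step by the number of masters.
    """
    current = offset
    while True:
        yield current
        current += increment


# Node A takes the odd IDs, node B the even ones.
node_a = list(islice(id_stream(offset=1), 5))
node_b = list(islice(id_stream(offset=2), 5))

# The two masters never generate the same ID, so replicated rows cannot collide.
assert set(node_a).isdisjoint(node_b)
print(node_a, node_b)
```

The point of the scheme is that each master can insert rows independently and replication can merge them without primary key conflicts.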

Also when the server does passive checks, both servers are going to be sending the passive checks to agents. This will double the network bandwidth.

As you can see, the main challenge here is the Zabbix server. You need to have only one running at a time and have it running with a virtual IP. Linux HA did a very good job with this. For the MySQL back end you'll want to do something similar, where you have a virtual IP active on your "master" node. This way you can maintain the same server config file on both systems. Using puppet to update this is not very feasible, as the refresh times would have to be ridiculous. I think puppet is more capable at this than CFengine; however, you'd need to write a script that can poll what your environment is like (which DB is master, etc.) and generate the config accordingly.

**walterheck** · 06-08-2009, 23:06

Originally posted by nelsonab

I retract my earlier comment about it working with a fully active setup, and reinstate what I've been saying all along. You can only have one Zabbix server running at a time.

That was my understanding as well, and I agree with your further explanation. 2 active servers is going to get messy or it's just going to be a lot of work to get it running.

Originally posted by nelsonab

Also the MySQL back end you'll want to do something similar where you have a virtual IP active on your "master" node.

We are for now satisfied with just having replication for the mysql backend.

Originally posted by nelsonab

This way you can maintain the same server config file on both systems. Using puppet to update this is not very feasible, as the refresh times would have to be ridiculous. I think puppet is more capable at this than CFengine; however, you'd need to write a script that can poll what your environment is like (which DB is master, etc.) and generate the config accordingly.

I wouldn't want to use puppet for updating active master or actually anything related to the HA part. I thought of using it for the config file of zabbix server and maybe even mysql as well.

**NOB** · 07-08-2009, 09:13

Hi

we are using MySQL Master-Master replication, a virtual IP
for the ZABBIX server which is switched either manually for, e.g.,
patch updates, or via UCARP software automatically.
UCARP switches the virtual IP address, announces it on the network
and stops/starts the zabbix_server application listening only on the
virtual IP.

Be aware that the agents are not pushing data to two servers if you configure
them with a line like

Code:

Server=10.0.0.1,10.0.0.2

They send active check data to the first one only!
The other servers are allowed to request data for passive checks, though.

That's why we use a virtual IP, say, 10.100.47.11 for active checks
and the two physical IPs of the servers, say, 10.100.47.33 and 10.100.47.34 for the passive checks.
So the agent configuration contains a line like

Code:

Server=10.100.47.11,10.100.47.33,10.100.47.34

and all works as expected. All active checks are sent to the virtual IP,
wherever it is, the list of active checks will be retrieved from the same
virtual IP and both servers are allowed to do passive checks.
It's as easy as that!
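The behaviour described above can be sketched in a few lines of Python. This is an illustration of how the `Server=` line is interpreted according to this post (first entry receives active check data, every entry may issue passive checks), not the agent's actual parsing code; `split_server_line` is a hypothetical helper.

```python
def split_server_line(line):
    """Split an agent 'Server=' line into its active target and passive allow-list.

    Assumption taken from the post above: the first address is the only one
    that receives active check data, and all listed addresses may request
    passive checks.
    """
    key, _, value = line.partition("=")
    if key.strip() != "Server":
        raise ValueError("not a Server= line: %r" % line)
    hosts = [h.strip() for h in value.split(",") if h.strip()]
    if not hosts:
        raise ValueError("Server= line lists no addresses")
    return {"active_target": hosts[0], "passive_allowed": hosts}


roles = split_server_line("Server=10.100.47.11,10.100.47.33,10.100.47.34")
print(roles["active_target"])    # the virtual IP
print(roles["passive_allowed"])  # virtual IP plus both physical IPs
```

Listing the virtual IP first is what makes the failover transparent to the agent: the active target never changes, only the machine that currently holds it.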

Starting with the 1.6.x agents, the agents will cache data if the virtual IP
is not available during a switch.
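As a rough picture of what such caching means, here is a minimal Python sketch of a sender that holds values while the target is unreachable and delivers them in order once it is back. It is a generic illustration only: the `BufferedSender` class, the `fake_send` function and the buffer size are assumptions, and the real agent's buffering is not shown in this thread.

```python
from collections import deque


class BufferedSender:
    """Hold values while the server is unreachable and resend them in order."""

    def __init__(self, send, max_items=100):
        self._send = send
        # With maxlen set, the oldest value is dropped once the buffer is full.
        self._buffer = deque(maxlen=max_items)

    def submit(self, value):
        self._buffer.append(value)
        return self.flush()

    def flush(self):
        """Try to deliver everything buffered; stop at the first failure."""
        while self._buffer:
            try:
                self._send(self._buffer[0])
            except ConnectionError:
                return False
            self._buffer.popleft()
        return True


# Simulated server behind the virtual IP (hypothetical stand-in).
state = {"up": False}
received = []


def fake_send(value):
    if not state["up"]:
        raise ConnectionError("virtual IP unreachable")
    received.append(value)


sender = BufferedSender(fake_send)
sender.submit("cpu=0.4")   # buffered, virtual IP is switching
sender.submit("cpu=0.5")   # buffered
state["up"] = True         # failover finished
sender.submit("cpu=0.6")   # delivers the backlog, then the new value
print(received)
```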

Instead of using MySQL Master-Master replication you could put
the DB on external storage. Disadvantages are: it is another single point
of failure, the filesystem can get corrupted, and you must ensure that only
one system has the filesystem mounted at any time.

HTH and YMMV

Norbert.

**Takanori Suzuki** · 25-12-2010, 16:24

Hi, I'm interested in redundant monitoring.
Recently, I made a patch for supporting multiple servers in active check mode, though it has not been accepted yet.
I think this feature may help to make active/active redundant monitoring.

https://support.zabbix.com/browse/ZBXNEXT-584

**qix** · 28-12-2010, 12:44

Could it be a solution to do a "semi loadbalanced" zabbix server solution?

What I mean is the following:

You could use multiple zabbix server instances running Linux HA on multiple servers, each with their own virtual IP.

The load could then be shared over the physical servers by predefining one zabbix server instance as SNMP, Pinger and IPMI and the other as an Active/passive agent instance.

If 1 server fails, the other could failover the missing instance via Linux HA.

I haven't tried it, but from what I've seen I think it might work.

**alixen** · 03-01-2011, 17:27

Hi,

Originally posted by qix

You could use multiple zabbix server instances running Linux HA on multiple servers, each with their own virtual IP.

Zabbix already supports some kind of load balancing with distributed monitoring.
If Zabbix nodes are configured on HA clusters, we get high availability and load balancing.

Regards,
Alixen

**qix** · 03-01-2011, 17:34

Zabbix DM is a bit 'touchy' in my opinion.
I don't really think it's stable.
Plus, you need to configure all your templates and stuff twice, since it doesn't replicate them...which is a drag (export, import, etc.).

If your experience is different, I'd love to hear about it.

**misch42** · 04-01-2011, 12:49

Hi,

Linux-HA is dead. Please consider using pacemaker. See: www.clusterlabs.org

Michael Schwartzkopff.

Making the zabbix server redundant
