
**kloczek** · 02-04-2016, 16:02

To keep server downtime as short as possible you need two database backends for the zabbix server: a master and a slave.
Such a setup is useful not only for upgrades; it also raises the HA of your DB backend.
In a typical upgrade scenario you start by estimating how long the database schema upgrade will take. You can use the slave to measure this:
- stop the slave
- back up the database files or take a snapshot if you can (LVM on Linux, or a zfs snapshot if you are using Solaris or zfs on Linux/*BSD)
- disconnect the slave from the master (on MySQL all you need to do is remove the master.info file in the mysql datadir)
- install a new zabbix server somewhere (it can even be temporary, on the DB slave host) and set it up to use your slave DB as its DB backend
- block all communication to the proxies if you have passive proxies. You can do this by adding FW rules that block outgoing traffic to the proxies or, if you refer to your proxies by hostname, by pointing the proxy names at 127.0.0.1 in /etc/hosts
- start the 3.0 server and watch the logs to see how long the database schema upgrade takes
- once you know how long the database schema upgrade takes, check the settings of all proxies to make sure that ProxyOfflineBuffer is longer than your database schema upgrade time
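
The last check in the list above can be scripted. Below is a minimal Python sketch, not an official tool: it assumes ProxyOfflineBuffer is given in hours with a default of 1 when unset, and the function names and the safety factor of 2 are my own illustrative choices.

```python
import math
import re

DEFAULT_OFFLINE_BUFFER_HOURS = 1  # assumed default when the parameter is not set


def proxy_offline_buffer_hours(conf_text):
    """Return ProxyOfflineBuffer (hours) from the text of a zabbix_proxy.conf."""
    value = DEFAULT_OFFLINE_BUFFER_HOURS
    for line in conf_text.splitlines():
        # Commented-out lines start with '#' and therefore do not match.
        match = re.match(r"\s*ProxyOfflineBuffer\s*=\s*(\d+)\s*$", line)
        if match:
            value = int(match.group(1))  # assumption: the last occurrence wins
    return value


def buffer_is_sufficient(conf_text, upgrade_minutes, safety_factor=2.0):
    """True if the buffer covers the measured upgrade time with some margin."""
    needed_hours = math.ceil(upgrade_minutes * safety_factor / 60)
    return proxy_offline_buffer_hours(conf_text) >= needed_hours
```

For example, with a measured schema upgrade of 600 minutes and the hypothetical margin of 2, a proxy needs at least ProxyOfflineBuffer=20, so a proxy left at the default would have to be reconfigured before the upgrade.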

The above is only preparation for the upgrade.
The actual upgrade consists of:
- stop zabbix server -> stop DB backend -> make DB backup -> start DB backend
- upgrade zabbix server. During the database schema upgrade the proxies will not be able to receive or ask the server for new monitoring configuration, so they will keep using the last batch of cfg data from the server
- if zabbix srv<>prx communication is affected by protocol changes, the proxies will still be able to collect all monitoring data, which allows you to upgrade the proxies one by one
- the probability that the proxy_history table definition changes is de facto null, so the proxy database schema upgrade will be a matter of seconds, and after it the proxy will be able to send the not yet sent data to the server.

On the upgrade from 1.8 to 2.0 I used a slightly modified version of the above scenario.
The problem was that all history tables were changed by adding the ns (nanoseconds) column. Changing this on hundreds of GBs of history* table data was very time consuming and stretched the whole upgrade to tens of hours. Going without monitoring and alarming for that long was not acceptable to us.
What I did was a bit tricky, but it worked for me and allowed me to reduce the upgrade time from tens of hours to a few minutes.
The 1.8 -> 2.0 upgrade was done by manually applying the upgrade.sql script, which runs all the alter tables sequentially. What I did was divide this script into two parts. One part contained the alter tables that add the ns column to the history tables, and this part was executed on the slave first. Because the ns column was added as the last one, it was still possible to sync data from the master, whose tables lack this column (the new table definition used in 2.0+ gives this column a default value of 0 if no value is passed).
So after this I was able to block, on the slave, the syncing of data to all tables except history* and trends*.
At this point I announced in our company that everyone should stop making any changes in zabbix and use it only to observe alarms and data.
After more than 22h of altering the history* tables on the slave I was able to start syncing only the history* and trends* data from the master.
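
Splitting the upgrade script as described above can be sketched in Python. This is only an illustration under assumptions, not the procedure I actually ran: it treats every ALTER TABLE on a table whose name starts with history or trends as the slow part, and the sample statements are made up and not copied from the real upgrade.sql.

```python
import re

# Statements that alter a history* or trends* table go to the slow part.
SLOW_TABLE = re.compile(r"^\s*ALTER\s+TABLE\s+`?(history|trends)", re.IGNORECASE)


def split_upgrade_script(sql_text):
    """Split SQL text into (slow, fast) lists of statements.

    Naive split on ';': fine for plain ALTER statements, not for scripts
    with ';' inside string literals or with comments before a statement.
    """
    slow, fast = [], []
    for statement in sql_text.split(";"):
        statement = statement.strip()
        if not statement:
            continue
        (slow if SLOW_TABLE.match(statement) else fast).append(statement + ";")
    return slow, fast


# Illustrative statements only (hypothetical, not the real script content).
sample = """
ALTER TABLE history ADD ns integer DEFAULT '0' NOT NULL;
ALTER TABLE history_uint ADD ns integer DEFAULT '0' NOT NULL;
ALTER TABLE hosts ADD name varchar(64) DEFAULT '' NOT NULL;
"""
slow_part, fast_part = split_upgrade_script(sample)
```

The slow part is what gets applied on the slave well ahead of the real upgrade window; the fast part stays for the upgrade itself.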

The above was only a kind of first stage of the whole change/upgrade.
At this stage it was possible to use the 2.0 web frontend with the slave DB to confirm that all monitoring data was presented correctly, apart from some parts of the web frontend producing errors because the rest of the DB schema was still as in 1.8.

The main part consisted of:
- stop zabbix server
- upgrade zabbix server software
- change zabbix server settings to use the slave DB backend
- start zabbix server, whose DB upgrade took only something like 10min
- once the zabbix DB schema was upgraded, the last change was to upgrade the web frontend to 2.0 and confirm that everything was OK.
The last stage was of course upgrading all the proxies.

What was the main good point of this scenario? Because the original master was not touched and all changes were made on the slave, it was possible to roll back the whole change: downgrade the web frontend and zabbix server, change the setup back to the original master DB address, and you are back in business.

I did the above about two years ago and since then I do every major upgrade this way: leave the master DB alone -> promote the slave to new master -> do the zabbix upgrade.
The main advantage of such a procedure is the possibility of rolling the whole thing back, and with only some small adaptations it can be done on any OS.

So again: for a zabbix installation of any size a slave DB is very important.
Not only from the point of view of higher HA of the DB resources. It makes upgrading the whole stack way less stressful, and if anything goes wrong (for whatever reason) it will be possible to roll back such an upgrade.
Using a slave also allows you to shorten the zabbix server upgrade to a matter of single minutes (no matter how complicated the DB changes being introduced are).
If all your software is well packaged you will be able to upgrade and roll back the software change in the shortest possible time. This is the last and maybe less important bit of the upgrade, but in reality it is essential as well.
If anyone is using a non-partitioned zabbix DB, adding a slave should be the first stage of partitioning. Why? Because it will be possible to do the whole partitioning on the slave -> sync data from the master -> fail over to the slave as the new master in a matter of seconds.
A slave also allows you to do some DB maintenance like optimizing tables (all except history* and trends*). It can be used as well for tests or other experiments.

So my final advice: you maybe don't need exactly a second host to do the upgrade. You need a slave database, and this slave should be a regular part of your zabbix stack.

**Zadralo23** · 03-04-2016, 07:30

Hello.
Thank you for your answer.
I have slave servers. The slave server is another server with a different IP, with PostgreSQL in standby mode and Zabbix Server stopped.
Main problem:
I was planning to update Zabbix Server, test it, and update the proxies much later. But this way is bad, because Zabbix 3.0 changes the DB structure on first start. I must update Zabbix Server and Zabbix Proxy at the same time.

Thank you for the hint about ProxyOfflineBuffer, I forgot about it.
I will keep thinking about how to change the OS version and the Zabbix version while minimizing stop time.

Regards,
Sergey.

**Rinus Tinus** · 03-04-2016, 22:25

snmp discovery syntax is updated in version 3.0

Zadralo23,

I upgraded last week. It was a new installation, but we exported all templates and hosts.

We first updated the server, and the proxies later on. They simply didn't work due to a change; check it over here (backward compatibility matrix):

12 Version compatibility

https://www.zabbix.com/documentation/3.0/manual/appendix/compatibility

In our case we installed a new server, exported hosts/templates and imported them. The only thing that went wrong (by design) was that all snmp discovery scripts stopped working. There was a small change to the syntax, nothing much, but they didn't work for us by default:

3 Low-level discovery

https://www.zabbix.com/documentation/3.0/manual/discovery/low_level_discovery

ctrl+f snmp oid (syntax changed to discovery[oid] instead of only [oid]).
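
To illustrate the change: as far as I read the 3.0 low-level discovery page linked above, the SNMP OID field of a discovery rule takes the form discovery[{#MACRO1}, oid1, ...]. A small Python sketch for rewriting old-style OIDs follows. The helper name is made up, and using {#SNMPVALUE} as the macro is an assumption, so check it against your own templates.

```python
def to_discovery_oid(old_oid, macro="{#SNMPVALUE}"):
    """Wrap a pre-3.0 discovery OID in the discovery[...] form (assumed syntax)."""
    old_oid = old_oid.strip()
    if old_oid.startswith("discovery["):
        return old_oid  # already in the new form, leave untouched
    return "discovery[{},{}]".format(macro, old_oid)


new_oid = to_discovery_oid("IF-MIB::ifDescr")
```

Here new_oid comes out as discovery[{#SNMPVALUE},IF-MIB::ifDescr].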

Just keep it in mind. If you hit the same snmp problem and need any help, let me know; perhaps I can help out.

**kloczek** · 04-04-2016, 11:57

Originally posted by Rinus Tinus

Zadralo23,

I upgraded last week. It was a new installation, but we exported all templates and hosts.

We first updated the server, and the proxies later on. They simply didn't work due to a change; check it over here (backward compatibility matrix):

12 Version compatibility

https://www.zabbix.com/documentation/3.0/manual/appendix/compatibility

As I wrote, in such cases a proxy uses the last batch of configuration data until it is upgraded. From this point of view there is no loss of monitoring data.
Exporting and importing templates and hosts is not necessary, as everything is in the database, and doing this is pointless in big enough monitored envs. Export and import may take hours, and especially in big monitored envs people expect the upgrade to be as short as possible.

Zabbix 3.0 update Zabbix Proxy and other Questions
