Hello,
Based on another thread, I was thinking about an extension to Zabbix that would allow us to
"Monitor a remote site reliably"
Idea
Basically we want to monitor a couple of sites connected via WAN links, VPN links or just the Internet (using an SSH tunnel, Stunnel etc.).
Usually we do not need access to all collected data 24/7, _but_ we need to be informed if something goes wrong at one of these sites.
Let's assume that each site runs its own Zabbix server with its own database and its individual items and triggers. This keeps the monitoring traffic within the site.
Now we need a piece of *middleware* that does the following:
-A kind of Watchdog that monitors that the Zabbix server on site is functional
-A kind of Watchdog that monitors that the WAN/VPN/Internet link is functional
-A mechanism that forwards an event in case a trigger fires on that site.
In some cases it might be beneficial if a standard protocol like HTTP can be used to transfer this data, to avoid any firewall issues on site.
Also it would be great if Zabbix wouldn't require patches.
Terminology
Central Server = Server that gets data (Watchdog and Events) from one or more Satellite Servers.
Satellite Server = Server on the site in question that passes Events to the Central Server if a trigger fires.
Watchdog = Mechanism to ensure that the Satellite Server has a connection to the Central Server.
Event = Data that is sent from the Satellite Server to the Central Server because a trigger fires.
Draft
As outlined before, the Zabbix Satellite Server would be configured as usual.
The only exceptions would be to:
-add a user item that validates that the connection to the Central Server is online (Watchdog).
This custom item can be a Perl script, a shell script or a simple wget call that requests a specific web page and passes ServerID+Password+DateTime+Status etc.
This web page would be a PHP or Perl script that writes the received data to a database (like MySQL or PostgreSQL). Easiest would be a table within the Zabbix database.
Also storing the local DateTime of the Central Server would allow this to work even if the clocks of the Central Server and the Satellite Server run out of sync.
-add a custom media that passes data to the Central Server in case a trigger fires on the Satellite Server (Event).
If a trigger fires on the Satellite Server, then in addition to the standard alerting procedure on that site, the resulting data would also be passed via a custom script (defined as an additional media) to the Central Server.
The mechanism could be very similar to the way the Watchdog is implemented.
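To make the Satellite side a bit more concrete, here is a rough Python sketch of the script that both the Watchdog item and the custom media could call. It is only an illustration under my own assumptions: the URL, the field names (server_id, kind, status, sent_at) and the function names are made up and not part of Zabbix.

```python
import urllib.parse
import urllib.request
from datetime import datetime, timezone

# Hypothetical endpoint on the Central Server; not part of Zabbix.
CENTRAL_URL = "https://central.example.com/satellite.php"


def build_payload(server_id, password, kind, status, now=None):
    """Encode one Watchdog or Event message as URL-encoded form data."""
    now = now or datetime.now(timezone.utc)
    fields = {
        "server_id": server_id,
        "password": password,
        "kind": kind,      # assumed values: "watchdog" or "event"
        "status": status,  # e.g. "OK", or "ON"/"OFF" for a trigger
        "sent_at": now.strftime("%Y-%m-%d %H:%M:%S"),
    }
    return urllib.parse.urlencode(fields).encode("ascii")


def send(payload, url=CENTRAL_URL, timeout=10):
    """POST the payload; return 1 if the Central Server answered 200, else 0."""
    try:
        with urllib.request.urlopen(url, data=payload, timeout=timeout) as resp:
            return 1 if resp.status == 200 else 0
    except OSError:
        # Covers connection failures, timeouts and HTTP error responses.
        return 0
```

The Watchdog item would print the result of send(build_payload("site-a", "secret", "watchdog", "OK")), so the Satellite Server itself also sees a 0 when the link is down. The media script would call the same functions with kind "event" and the trigger state as status.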
How does the Central Server get the data provided by the Satellite Server from the database?
The Central Server would define two custom items for each Satellite Server.
-One to monitor the Watchdog to ensure that the Satellite Server _could_ send data if needed.
If there is no Watchdog record in the table within a given time, the Central Server would know that the connection is down.
-One to get Events out of the database in case the Satellite Server has fired a trigger. If there is no data, but the Watchdog is OK, then everything is fine.
Otherwise the data would contain the Event with the corresponding state (Trigger On/Off). The state is needed to get notified when a problem has been resolved.
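The Central Server side could look roughly like the following Python sketch. It uses sqlite3 only as a stand-in for MySQL/PostgreSQL so that it runs on its own. The table satellite_msg, its columns and the function names are my assumptions for illustration. The freshness check uses the time at which the Central Server received the record, so clock drift on the Satellite Server does not matter.

```python
import sqlite3
import time

# Hypothetical table; in practice it could live in the Zabbix database.
SCHEMA = """
CREATE TABLE IF NOT EXISTS satellite_msg (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    server_id   TEXT NOT NULL,
    kind        TEXT NOT NULL,      -- 'watchdog' or 'event'
    status      TEXT NOT NULL,      -- 'OK', or 'ON'/'OFF' for a trigger
    sent_at     TEXT,               -- Satellite clock, informational only
    received_at INTEGER NOT NULL,   -- Central clock, Unix seconds
    fetched     INTEGER NOT NULL DEFAULT 0
)
"""


def record(conn, server_id, kind, status, sent_at, now=None):
    """What the receiving web page would do: store one message."""
    now = int(time.time()) if now is None else now
    conn.execute(
        "INSERT INTO satellite_msg (server_id, kind, status, sent_at, received_at)"
        " VALUES (?, ?, ?, ?, ?)",
        (server_id, kind, status, sent_at, now),
    )
    conn.commit()


def watchdog_ok(conn, server_id, max_age=300, now=None):
    """Custom item 1: 1 if a Watchdog record arrived within max_age seconds."""
    now = int(time.time()) if now is None else now
    last = conn.execute(
        "SELECT MAX(received_at) FROM satellite_msg"
        " WHERE server_id = ? AND kind = 'watchdog'",
        (server_id,),
    ).fetchone()[0]
    return 1 if last is not None and now - last <= max_age else 0


def fetch_events(conn, server_id):
    """Custom item 2: return unread Events in order and mark them as read."""
    rows = conn.execute(
        "SELECT id, status, sent_at FROM satellite_msg"
        " WHERE server_id = ? AND kind = 'event' AND fetched = 0 ORDER BY id",
        (server_id,),
    ).fetchall()
    conn.executemany(
        "UPDATE satellite_msg SET fetched = 1 WHERE id = ?",
        [(row[0],) for row in rows],
    )
    conn.commit()
    return [(status, sent_at) for _, status, sent_at in rows]
```

An empty list from fetch_events together with watchdog_ok returning 1 corresponds to the "everything is fine" case; watchdog_ok returning 0 means the link or the Satellite Server is down.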
Open questions in this draft
Q: Where to define the IDs for the Satellite Servers?
A: Without modifications to Zabbix, one option would be to add users on the Central Server and use those user/password combinations.
Q: How to pass data / watchdogs from the Satellite Server?
A: A simple way would be to use HTTP/HTTPS and GET/POST via a simple script.
Q: How to encrypt data/passwords being passed?
A: Easiest would be to use HTTPS, or to encrypt the data in the scripts that put/get the data.
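If HTTPS is not available, one possible variation (my assumption, not something Zabbix offers) would be to not send the password at all and instead sign each message with it. Note that this authenticates the sender and protects against tampering, but it does not encrypt the data. A minimal Python sketch with made-up function names:

```python
import hashlib
import hmac


def sign(message: bytes, secret: bytes) -> str:
    """Satellite side: HMAC-SHA256 signature of the message, as hex."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(message: bytes, secret: bytes, signature: str) -> bool:
    """Central side: recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign(message, secret), signature)
```

The Satellite Server would send the message plus the signature, and the Central Server would look up the shared secret by ServerID before calling verify. Including the DateTime in the signed message would make it harder to replay an old Watchdog record.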
Note
Of course, if Zabbix provides this kind of functionality in the future, all of this would become obsolete ;-)
Any comments?