I hope this post serves two purposes for those reading it, especially as I am a complete newbie to Zabbix & Linux programming.
The first is to learn how to use Zabbix to perform what I perceive to be a simple network monitoring task.
The second is so that those following can learn from the (hopefully) numerous replies in this thread.
OK - Here goes nothing!
The Scenario:
You have a WAN consisting of 1200 locations, with one workstation at each location.
The IP address at each location is static, and the workstations do nothing more than serve as a standard network share for local users. The only requirement for these workstations is that they are NOT to be powered off or taken off the network at ANY point in time. They are to remain online 24x7x365.
The problem:
At present, there is NO monitoring whatsoever to detect when these devices go down. Furthermore, there is NO logging of when a device went down (and if / when it came back up) to catch the "repeat offenders". Typical causes include tripped circuit breakers, office relocations, cleaning staff disconnecting a network cable, a failed uplink switch, etc.
The solution:
I would like to use network monitoring software that does the following:
1. Use a “PING”-like function as the only method to see if the host is alive.
2. Poll every 10 minutes, as these are not truly mission-critical devices.
3. Make at least 2 or 3 attempts to get a reply from the device before reporting a "DOWN" state.
4. If the site times out, send an email alert, and send another when the site comes back online.
5. Provide a "real-time" web view of ALL devices, whether offline or online.
6. Lastly, keep a log file which retains all of the up / down status changes for ALL of these sites, to locate the repeat offenders.
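To make requirements 1, 3, 4 and 6 concrete, here is a rough Python sketch of the logic being asked for. It is not Zabbix configuration and not how Zabbix works internally; the names `ping_once`, `Monitor`, `poll` and `repeat_offenders` are made up for illustration, the `ping` flags assume a Linux host, and it assumes every site starts out "UP".

```python
import subprocess
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Dict, List, Optional, Tuple


def ping_once(ip: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo with the system ping (Linux iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


@dataclass
class Monitor:
    """Hypothetical up/down tracker; one instance covers all sites."""
    ping: Callable[[str], bool] = ping_once
    attempts: int = 3  # requirement 3: tries before declaring DOWN
    status: Dict[str, str] = field(default_factory=dict)
    log: List[Tuple[str, str, str]] = field(default_factory=list)

    def poll(self, ip: str) -> Optional[str]:
        """Check one host; return "UP" or "DOWN" on a state change, else None."""
        # any() stops at the first reply, so a healthy host costs one ping.
        alive = any(self.ping(ip) for _ in range(self.attempts))
        new = "UP" if alive else "DOWN"
        old = self.status.get(ip, "UP")  # assumption: sites start as UP
        self.status[ip] = new
        if new == old:
            return None
        # requirement 6: record every status change with a timestamp
        self.log.append((datetime.now().isoformat(timespec="seconds"), ip, new))
        return new  # requirement 4: the caller would send the email here

    def repeat_offenders(self, min_downs: int = 2) -> List[str]:
        """List hosts that went DOWN at least min_downs times."""
        downs = Counter(ip for _, ip, state in self.log if state == "DOWN")
        return [ip for ip, n in downs.items() if n >= min_downs]
```

A scheduler (cron, for example) would call `poll` for each of the 1200 addresses every 10 minutes and send an email whenever it returns something other than `None`. The `status` dictionary is what a "real-time" web view would display.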
In short, a simple network monitor without all the bells and whistles that SNMP offers. Let's be honest: at the end of the day, we as network administrators can be as proactive as possible to avoid downtime; however, that creates the perception that nothing should EVER go down.
In my opinion, while that is true, we must also be reactive and responsive when something does go down, and address the issue immediately.
I hope this post creates quite a stir and a following to learn from, and gets some positive feedback.
Thanks,
Vince