Hello,
I have been monitoring a few of our servers with Zabbix for the past 60 days, and am growing to like the software. Basically, we have only been monitoring bandwidth, but are now looking to do much more with it.
I'm looking for another opinion and suggestions on bandwidth alerts. What I would like to do is trigger an alert if incoming or outgoing traffic on a server goes over a certain figure (say 100 Kbit/s). I can see a handful of ways to do this, but I'm trying to work out which would work best.
Some of my thoughts:
({host.domain.com:net.if.in[eth0].avg(300)})>100K
This should alert us if the average over the last 5 mins is over 100Kbit/s.
({host.domain.com:net.if.in[eth0].last(0)})>100K
This should alert us if the last recorded value is over 100Kbit/s.
Is anyone else trying to do something similar? Or Alexei, can you put some thoughts in here? Ideally, I'd like to know when bandwidth stays at or above that "magic number" for a few minutes; if it only hits it once, it could be a legitimate spike. I also don't want to put *too* much load on the server by processing a bunch of averaging triggers.
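To make the difference concrete for myself, here is a rough Python sketch (not Zabbix code) of how the two checks above would behave on a short window of samples, plus a third "every sample is over the limit" check. The sample values, the one-sample-per-minute interval and the function names are all made up for illustration. I believe Zabbix has a min() trigger function that would be the closest match for the third check, but treat that as an assumption to verify against the manual.

```python
from statistics import mean

# Stands in for the "100K" limit used in the trigger expressions above.
THRESHOLD = 100_000


def last_fires(samples, threshold=THRESHOLD):
    """Mimics a last(0)-style check: only the newest value matters."""
    return samples[-1] > threshold


def avg_fires(samples, threshold=THRESHOLD):
    """Mimics an avg(300)-style check: mean of the whole window."""
    return mean(samples) > threshold


def min_fires(samples, threshold=THRESHOLD):
    """Fires only if every sample in the window is over the limit."""
    return min(samples) > threshold


# Hypothetical 5-minute windows, one sample per minute (oldest first).
spike = [20_000, 30_000, 25_000, 20_000, 450_000]
sustained = [120_000, 150_000, 110_000, 130_000, 140_000]

for name, window in (("spike", spike), ("sustained", sustained)):
    print(name, last_fires(window), avg_fires(window), min_fires(window))
```

With these made-up numbers, the single large spike trips both the "last" and the "average" checks, while the "minimum" check only trips on the sustained window. That is the behaviour I'm after, if it can be done cheaply.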
Any advice is appreciated.