ZABBIX Forums  
  #1  
09-02-2011, 19:58
qix
Senior Member
 
Join Date: Oct 2006
Location: Netherlands
Posts: 408
Looking for a good story

Hello fellow forum members,

I plan to do a talk on Zabbix in the Netherlands in the near future. To spice things up a bit, I'm looking for a few good stories on where Zabbix saved the day in your shops.

I'm very interested in your stories!

Thanks all,
__________________
With kind regards,

Raymond

---
Online Zabbix tutorials:
http://www.zbxtutorials.org
  #2  
09-02-2011, 21:10
preaction
Junior Member
 
Join Date: Feb 2011
Posts: 1
Zabbix Detected an Intrusion

At 3:00am on a Thursday morning, Zabbix paged me that a server was down. This is not common, as Zabbix has helped me get rid of all the bugs that cause server outages, but it is not unheard of either. So I checked Zabbix to see what was the matter.

After the DC rebooted the computer, Zabbix reported that the checksum of /usr/bin/sshd had changed. Unfortunately for me, I didn't realize quite what this meant at the time: someone had replaced sshd with a version that copied passwords and tracked traffic to find the rest of our network. It being 3:00am, I didn't care and went back to sleep.
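
For readers who have not set up this kind of check, here is a minimal sketch of the idea behind a file-checksum alert: record a digest of the binary while it is known to be good, then compare later. This is not how Zabbix computes its checksum, and it is not the poster's configuration; SHA-256 and the function names file_checksum and checksum_changed are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 64 KiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_changed(path: Path, baseline: str) -> bool:
    """True when the file no longer matches the recorded baseline digest."""
    return file_checksum(path) != baseline
```

The baseline has to be stored somewhere the intruder cannot rewrite it (in the story, that was the monitoring server), otherwise a replaced binary could ship with a matching "known good" value.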

Later that day Zabbix detected that our backups had not run (a simple custom trigger). The intruder had, for some reason, deleted root's crontab. After a few hours of e-mails and phone calls there was only one conclusion: someone got in.

Zabbix knew which sshd had been replaced, and so knew which machines had been compromised. After rebuilding 13 machines from scratch over the course of the next 3 days, using more stringent sshd settings and upgrading from CentOS 3 to 5.4, we are, so far, clean.

Zabbix detected the intrusion before our moderately knowledgeable sysadmin (myself) understood it.

This is why Zabbix is necessary for your datacenter.
  #3  
22-03-2011, 17:08
qix
Senior Member
 
Join Date: Oct 2006
Location: Netherlands
Posts: 408

Thanks!

No other stories from the community??
  #4  
26-03-2011, 12:32
0siris
Member
Zabbix Certified Specialist
 
Join Date: Nov 2010
Posts: 76
Just happened Friday

Happened Friday morning: I was in one of our off-site server rooms, installing our new Zabbix server (yes, 1.8.4 is really coming alive in April!) and adding some power supplies to another server. Several minutes later I got a phone call about the server with the added PSUs:

Zabbix had delivered a text message: the server was unreachable. I checked and indeed, it was powered down; apparently I had pushed the power button when sliding the server back. So: quick Zabbix critical message, quick power on, downtime minimized. Ooooooopssss...!
  #5  
15-04-2011, 00:04
ilikejam
Junior Member
 
Join Date: Oct 2008
Posts: 9

I might be a bit late to the party but...

A few years ago (Zabbix 1.4 was just out, if memory serves) one of my datacentre's two cooling units died, and the load on the remaining cooling unit was so high that it would shut itself down at random times (what sort of cooling unit shuts down when it gets too hot???).

We needed a way to alert the operations staff (or security guards if it was night) that the room temperature was too high, so we installed CentOS on a desktop machine in the office and yum'ed in Zabbix. Then we installed the Zabbix agent on a Sun V890 in the DC, and set up a UserParameter to grep the motherboard temperature from 'prtdiag'. Anything above 25 deg. C would set off the trigger.
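
As a rough sketch of what such a check boils down to: pull one number out of the command's text output and compare it with the threshold. The original was a grep inside a UserParameter; the Python below is a stand-in, and the "MB <number>" line format is a hypothetical placeholder, since real prtdiag output differs between platforms and releases. Only the 25 deg. C threshold comes from the post.

```python
import re
from typing import Optional

# Hypothetical line format ("MB   27"); real prtdiag output will look different.
TEMP_LINE = re.compile(r"^\s*MB\s+(\d+)\s*$", re.MULTILINE)

# Threshold from the post: anything above 25 deg. C fires the trigger.
THRESHOLD_C = 25


def motherboard_temp(prtdiag_text: str) -> Optional[int]:
    """Return the motherboard temperature in deg. C, or None if not found."""
    match = TEMP_LINE.search(prtdiag_text)
    return int(match.group(1)) if match else None


def too_hot(temp_c: Optional[int], threshold_c: int = THRESHOLD_C) -> bool:
    """True only for a known reading strictly above the threshold."""
    return temp_c is not None and temp_c > threshold_c
```

Returning None for an unparseable reading, rather than 0, keeps a broken sensor or changed output format from looking like a cool room; a separate "no data" alert would be needed to catch that case.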

The trigger ran an 'external command', which ran 'gammu' on the Zabbix server PC to send an SMS message to the Unix admin team's phones through an old Nokia mobile phone hanging off the PC's USB port. The Unix admins would then phone whoever they could find on-site to get them to open up the doors and let the heat out until the cooler kicked in again.
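
A minimal sketch of what a script like that might look like, assuming gammu's "sendsms TEXT <number> -text <message>" command form and a phone already configured for gammu. The function names, the dry_run switch and any phone numbers passed in are hypothetical; this is not the original script.

```python
import subprocess
from typing import List, Sequence


def gammu_sendsms_command(number: str, message: str) -> List[str]:
    """Build the argv for one SMS (assumed gammu command form)."""
    return ["gammu", "sendsms", "TEXT", number, "-text", message]


def alert_admins(
    numbers: Sequence[str], message: str, dry_run: bool = True
) -> List[List[str]]:
    """Send the same message to every number; return the commands used."""
    commands = [gammu_sendsms_command(number, message) for number in numbers]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if gammu exits non-zero.
            subprocess.run(command, check=True)
    return commands
```

Passing the command as a list instead of a shell string means the alert text is never interpreted by a shell, which matters if the message ever includes host names or item values.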

(The heat situation got so bad in that datacentre that we were getting ambient temperatures of 35 deg. C (!), with disk, tape and processor failures all over the place. We ended up smashing the datacentre windows out and setting up a load of desk fans in the aisles to vent the place until the cooling engineers could find parts for the HVAC units.)

Shortly after that, Zabbix became the monitoring system of choice for us. I left that datacentre a few months later, but as far as I'm aware that poor old Nokia mobile is still there, hanging off a USB port on their new Zabbix server.

I hope they've taken my phone number out of the gammu script...
All times are GMT +2.