Dear Zabbix,
Oh boy, you can't say I didn't try. I really really did. My whole team really really did.
I picked up on you from a Linux mag interview back in '05. Started using you at 1.4 in production in '07. Eagerly awaited escalations in 1.6, and some needed improvements to DM in 1.8. It's not like I'm a newbie.
But things here aren't the same as the simplistic solution back in 2005. And as my infrastructure, and your aims and code, have grown, so have our problems, my dear.
Distributed Monitoring. Ahhh DM. This love/hate relationship was always going to end in flames. Promising so much, and never quite delivering. Our first dalliance as a two node parent/child setup was a fun, fruitful time (aside from some minor issues if the child node has an ID > parent ID), but we've never got past that first base. Three nodes, four nodes, all still leave two functioning fine, and the rest in a sort of limbo.
The lure of your centralised monitoring screen is fine, but really, have I ever been able to trust all those green squares? Is my Far East server quiet because everything is fine, or is it just not sending anything? (Nope, it's quiet because Zabbix is asleep on the job. No updates to the master, no emails out. After the first 5-hour outage unreported by Zabbix, I'm disappointed. After another 5-hour outage is reported by my customers, not my monitoring, that's when I start to question our affair.) So DM, temptress that you may be, you're just a nice idea that never really works. The Lindsay Lohan of monitoring?
And bugs. You are covered in bugs. Escalations that won't stop. Escalations that won't start. Escalations that jump right to the second step (ignoring all my on-call team, you just want to wake me at 2am, you saucy tart). Issues with screens, issues with graphs (DM, you make me feel dirty). I can't ack alerts from my DM nodes from their parents, and I don't see alert data from the nodes. Why do they hide from me?
I promised I wouldn't go on and on.
I don't drop the five years with you lightly. As I said, I've given this my all. I've reinstalled, rebuilt, reconfigured, changed, upgraded until I'm blue in the face. I can compile, RPM, install and configure Zabbix in my sleep. And my nightmares.
Bottom line? I can cope with the bugs on their own, they're an annoyance. I'm pretty pissed, but I can kinda live with the fact that the master monitoring screen only shows data from the child nodes, and I have to go to them direct to ack anything. It's another annoyance. But when I can't really, really, be 100% sure that what's shown there is accurate, I start to lose faith. And today, with another very significant outage (and I mean server and all its services totally dead. Even Servers Up would get this) reported to me by my users, not my monitoring, we reached the bottom of my barrel of faith, perseverance, and tolerance.
Yeah, it looks lovely, it has great graphing and trending, the UI is fantastic, but deep down, deep down at its heart, Zabbix needs to monitor my kit, and tell me when it's broken. Everything else is window dressing. And it just doesn't. Zabbix, FAIL.
I'm done for now. I'm going to rest my head on the pillow of the bitch whore Nagios. I'll pop back now and then to see how you are. Maybe one day you'll grow up and fulfil all those promises you made. But for now, sweet Zabbix, farewell.