Zabbix has so many features to love, but one thing I cannot stand is that there isn’t a great way to get a quick overview of a group of hosts. This is what keeps me from converting to Zabbix from a Xymon system. Obviously, not everybody will agree with this opinion, but I’ve seen enough comments on the matter to know I’m not alone.
Multi-state Triggers
One of the big reasons the overview page becomes a totally useless mess is that triggers only have one state. I won’t argue about how flexible the trigger system in Zabbix is, because it IS very flexible. However, the system would benefit greatly from triggers with multiple states. You could also approach this issue with ‘grouped triggers’… and no, dependencies aren’t the solution.
Say I want to configure three thresholds for the amount of drive space available on a volume. I have to create three triggers. Not a big deal, I suppose. But if I have three volumes to monitor, now I have nine triggers. The problem is that every one of these is added to the overview table. Add to this the fact that some servers in the group may have different volume names and you’ve just added even more columns. And this is just dealing with one monitored metric!
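To put numbers on that, here is a rough sketch of how the column count grows. The host names, volume letters and threshold labels are all made up for illustration, and it assumes the overview shows one column per distinct trigger name.

```python
# Hypothetical threshold labels: one trigger per threshold per volume.
THRESHOLDS = ["warning", "high", "critical"]

# Hypothetical hosts; db01 uses different volume names than the web servers.
HOSTS = {
    "web01": ["C:", "D:", "E:"],
    "web02": ["C:", "D:", "E:"],
    "db01": ["C:", "F:", "G:"],
}


def overview_columns(hosts, thresholds):
    """Return the distinct trigger names, assuming one overview column per name."""
    return sorted(
        {
            f"Free space on {volume} ({label})"
            for volumes in hosts.values()
            for volume in volumes
            for label in thresholds
        }
    )


one_host = overview_columns({"web01": HOSTS["web01"]}, THRESHOLDS)
whole_group = overview_columns(HOSTS, THRESHOLDS)
print(len(one_host), len(whole_group))
```

One host with three volumes gives nine columns. Adding a single host with two differently named volumes pushes the group to fifteen, all for one metric.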
We don’t need to cut down on the number of triggers; we just need to rethink how Zabbix deals with them. We should be able to create a single trigger (or again, call it a trigger group if you want to) and then inside of it define the triggers (thresholds) that set the various states. The triggers could also apply based on the severity - that is to say, if the “critical” threshold/trigger is met, ignore the “warning” threshold/trigger. (This is probably implied, but I wanted to make it clear.)
Example: Multi-state trigger
Trigger: % Used space on C:
State 1 - OK: If 60% or less
State 2 - Warning: If usage is > 60%
State 3 - Critical: If usage is > 90%
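The example above could be sketched like this. This is not how Zabbix works today, just an illustration of the idea. The `State` enum and the `evaluate` function are names I made up, and the only numbers used are the 60% and 90% thresholds from the example.

```python
from enum import IntEnum


class State(IntEnum):
    """Hypothetical states for a single multi-state trigger."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


# Thresholds from the example: a state applies when usage is strictly above the limit.
THRESHOLDS = [(60.0, State.WARNING), (90.0, State.CRITICAL)]


def evaluate(used_percent: float) -> State:
    """Return the most severe state whose threshold is exceeded."""
    # Check the most severe state first so that "critical" overrides "warning".
    for limit, state in sorted(THRESHOLDS, key=lambda pair: pair[1], reverse=True):
        if used_percent > limit:
            return state
    return State.OK


for usage in (45.0, 72.5, 93.0):
    print(f"{usage}% used on C: -> {evaluate(usage).name}")
```

The overview table would then need only one cell per host for “% Used space on C:”, colored by whatever state `evaluate` returns.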
When viewing this improved overview table I could either hover or click on the item in question to get more details.
Application States
First, for this to work optimally, Applications would have to become more flexible by allowing the same application to be used by two or more templates applied to a single host. This has been asked for before, as far as I can tell.
The general idea here is to further “zoom out” the overview of a group of hosts and let us quickly know the health of various applications. This might be more important for IT managers than for the nuts-and-bolts guys like us who actually have to fix the issues. We want to know the details of what is happening, but a lot of people need or want a high-level overview. If we could use the multi-state triggers I mentioned above to set the “state” of an application, we could create an uncluttered, easy-to-read high-level overview. I’m not sure of the best way to implement this, but one idea would be to have an option for each of the states in a trigger to set the state of an application as well.
Reusing my trigger example above:
Example: Multi-state trigger with application states
Trigger: % Used space on C:
State 1 - OK: If 60% or less (set application Volumes to OK)
State 2 - Warning: If usage is > 60% (set application Volumes to Warning)
State 3 - Critical: If usage is > 90% (set application Volumes to Critical)
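Here is one way the application state could be derived. As I said, I’m not sure of the best implementation, so treat this as a sketch only. It assumes the application simply takes the worst state reported by any of its triggers. The `application_state` helper and the usage figures are invented for the example.

```python
from enum import IntEnum


class State(IntEnum):
    """Hypothetical states shared by triggers and applications."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


def volume_state(used_percent: float) -> State:
    """Multi-state trigger from the example: > 90% critical, > 60% warning."""
    if used_percent > 90.0:
        return State.CRITICAL
    if used_percent > 60.0:
        return State.WARNING
    return State.OK


def application_state(trigger_states) -> State:
    """Assumption: the application takes the worst state of its triggers."""
    return max(trigger_states, default=State.OK)


# Made-up usage figures for one host.
usage = {"C:": 45.0, "D:": 72.5, "E:": 93.0}
volumes = application_state(volume_state(pct) for pct in usage.values())
print(f"Volumes: {volumes.name}")
```

With this, the overview needs a single “Volumes” column per host no matter how many volumes the host has or what they are called.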
Another great example for this idea would be services (Windows services in my case). Currently, every single service whose status we check and trigger on is displayed in the overview table. It makes so much more sense to simply have a “Services” column whose status is set based on the status of the services being monitored.