I have it installed and it seems to be working properly. I guess what I'm struggling with right now are some of the concepts needed to actually start monitoring. I'm trying to replace Cacti (graphing) and Xymon (graphing and alerting) with an all-in-one solution. Can anyone link me to some good guides for Zabbix 2.0? Specifically, guides for graphing and monitoring Windows servers and Dell/Force10 switches? Thanks
Zabbix: Can someone point me in the right direction for setting this up?
Last edited by starsmixi42; 26-10-2020, 12:14.
Zabbix 2.0 is ancient. If you're just starting out, you should be starting with either the Zabbix 4.0 Long Term Support (LTS) release or (preferably) Zabbix 5.0 LTS. The 5.0 LTS is the latest major version and still might have some small "new version" issues, but many people here are using it very successfully.
For covering the basics, the Zabbix online documentation does a good job: https://www.zabbix.com/documentation/current/manual
Starting in that documentation with the "concepts" area will help.
You'll want to read carefully about "items", "triggers", and "actions". There's lots of other stuff you'll want to know to really use the software well, but those three areas are core concepts for the software.
Items are just collected data, like the 1 minute load average, the number of Apache httpd processes running, or a value from a temperature sensor. If the type of the collected data is numeric (integer, float), Zabbix will automatically generate rudimentary graphs for that data point. You can build customized graphs containing multiple items, tailored to your needs, but at the start the built-in automatic graphing is often enough.
You can collect textual or log data items too; they just don't get graphed automatically. How you collect the data (simple checks, an agent, SNMP) can vary from host to host, and if needed you can even mix several collection methods on a single host.
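To make the item idea a bit more concrete, here is a minimal Python sketch. It is not Zabbix code or the Zabbix API: the `Item` class, the `ValueType` enum and the `is_graphed_automatically` method are names I made up for illustration, and the keys are only written in the style of agent item keys, so check the documentation for the exact keys in your version.

```python
from dataclasses import dataclass
from enum import Enum


class ValueType(Enum):
    # Simplified stand-ins for item value types (hypothetical, not Zabbix constants).
    INTEGER = "integer"
    FLOAT = "float"
    TEXT = "text"
    LOG = "log"


@dataclass(frozen=True)
class Item:
    name: str
    key: str              # what to collect (illustrative keys only)
    method: str           # how to collect it: "agent", "snmp", "simple check"
    value_type: ValueType

    def is_graphed_automatically(self) -> bool:
        # Only numeric items get the rudimentary built-in graph.
        return self.value_type in (ValueType.INTEGER, ValueType.FLOAT)


# One host can mix collection methods, as described above.
items = [
    Item("1 minute load average", "system.cpu.load[all,avg1]", "agent", ValueType.FLOAT),
    Item("Apache httpd processes", "proc.num[httpd]", "agent", ValueType.INTEGER),
    Item("Switch description", "sysDescr", "snmp", ValueType.TEXT),
]

graphable = [item.name for item in items if item.is_graphed_automatically()]
print(graphable)
```

The two numeric items end up in `graphable`; the text item is still collected and stored, it just has no automatic graph.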
Triggers are thresholds. If you create items (to collect data) but never create any triggers, you'll have historical information (metrics) for your systems but no alerts, because it's the triggers that determine when an item is in a problem state. Triggers can be as complicated as you need them to be to answer "Is this value for this item a problem?". You can (but aren't required to) also specify recovery expressions, which use different logic from the trigger itself to determine when the problem has cleared. Well-written triggers with recovery expressions give you hysteresis, which can mostly eliminate "flapping".
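If it helps, the effect of a separate recovery condition can be sketched in a few lines of Python. This is plain Python, not Zabbix trigger syntax; the function name `evaluate_trigger` and the thresholds (problem above 5.0, recovery below 3.0) are assumptions picked only for the example.

```python
def evaluate_trigger(values, problem_above=5.0, recover_below=3.0):
    """Return the trigger state ("OK" or "PROBLEM") after each collected value.

    The problem condition and the recovery condition are separate, so a value
    hovering around the problem threshold does not flip the state back and forth.
    """
    state = "OK"
    states = []
    for value in values:
        if state == "OK" and value > problem_above:
            state = "PROBLEM"
        elif state == "PROBLEM" and value < recover_below:
            state = "OK"
        states.append(state)
    return states


load_samples = [2.0, 5.5, 4.8, 5.2, 4.0, 2.5, 3.5]

# Separate recovery threshold: one problem, one recovery.
with_recovery = evaluate_trigger(load_samples)

# Same threshold for problem and recovery: the state flaps around 5.0.
single_threshold = evaluate_trigger(load_samples, problem_above=5.0, recover_below=5.0)

print(with_recovery)
print(single_threshold)
```

With the separate recovery threshold the state goes to PROBLEM once and stays there until the load drops below 3.0. With a single threshold the same samples produce two separate problems, which is the flapping you want to avoid.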
Actions "do something" in response to the events that are generated by triggers. Typically that would mean alerting someone. Each user can have multiple media types defined; these are the configuration that actions use to get alerts to the user, via methods like email, SMS, Slack integration, etc. If you configure it, your actions can also run automated remediation steps, escalate unacknowledged alerts to other support staff or management, etc.
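A rough Python sketch of how an action with escalation behaves, under my own simplifying assumptions: `User`, `run_action` and the `acknowledged_after_step` parameter are hypothetical names, the addresses are placeholders, and nothing is actually sent. It only shows which user and media type each escalation step would reach.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    # media type -> address; a user can have several media types defined
    media: dict = field(default_factory=dict)


def run_action(problem, steps, acknowledged_after_step=None):
    """Return the notifications an action would send for one problem event.

    steps is an ordered list of user lists (escalation steps). Escalation
    stops after the step at which the problem was acknowledged, if any.
    """
    sent = []
    for step_number, users in enumerate(steps, start=1):
        if acknowledged_after_step is not None and step_number > acknowledged_after_step:
            break  # acknowledged: do not escalate any further
        for user in users:
            for media_type, address in user.media.items():
                sent.append((step_number, user.name, media_type, address, problem))
    return sent


on_call = User("oncall", {"email": "oncall@example.com", "sms": "+15550100"})
manager = User("manager", {"email": "manager@example.com"})
steps = [[on_call], [manager]]

acknowledged = run_action("High load on web01", steps, acknowledged_after_step=1)
unacknowledged = run_action("High load on web01", steps)

print(len(acknowledged), len(unacknowledged))
```

When the on-call person acknowledges the problem during the first step, only their email and SMS go out. If nobody acknowledges it, the second step also notifies the manager.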
There's lots more to the software than that, but if you have a good handle on those topics and explore the other sections of the documentation, you should be able to make good use of the software.
The built-in official templates (especially the much-improved templates that are part of the 5.0 LTS) can serve as excellent examples of how to do lots of different types of items + triggers. The official templates tend to monitor a lot of stuff, so they can be overwhelming until you get a good handle on the core concepts, and for some sites they overdo data collection, but don't discount them as a good source of learning.
Last edited by tim.mooney; 22-10-2020, 01:33.