Call for discussion on ZABBIX 1.6
Like others I would like to add my gratitude to the Zabbix developers.
Now to the features:
- Functional Items - items based upon the function of two or more other items
- Better Jabber Support - Although no errors (IO errors) are shown in the event log, we only ever get Jabber notifications just after the Zabbix server has started; they soon stop appearing and never return.
- Improved IT Services - As many have asked and queried about, this is a winning feature but seems very difficult to use
- Improved Web Monitoring - In desperate need of more functionality and state control in web exchanges, i.e. the ability to carry state returned by one web call over to another.
- New Template: Java/JMX Monitoring - These should be pulled into the core by now.
- New Template: Application Servers - A set of common application server monitoring templates should be in the core.
-
'screen' section per template
Hi,
I've just started exploring zabbix and I'm very pleased with the cloning function of hosts and triggers. Also the use of templates for grouping of graphs, triggers etc. is very nice. Something I'm missing is a similar function within the screen section.
It would be nice to have a screen layout cloned with just the hostname of graphs changed. (or even better, add the screen to a template and enable host selection)
If this were possible, I for one would be very pleased.
-
UserParameter autodiscovery
The agent could reply to a "list" command by providing the full listing of the configured User Parameters and which of them are working (i.e. returning a value).
Then the Zabbix server could easily notify the user if some User Parameters installed on a host are not used in any Item.
Eventually the server could automatically re-enable Items that were marked as "Not supported", or add new Items as well.
-
I'm implementing Zabbix at our site to monitor about 40 systems.
Several of our systems are located in a remote datacenter, so I cannot do autodiscovery without making the datacenter's netadmins go wild.
Suggestion:
Add an option to the auto-discovery configuration to make the rule "passive". i.e. don't automatically scan the whole subnet, but instead only trigger discovery for a single IP whenever that host tries to contact the zabbix server (i.e. when the agent is started for the first time)
I'd like to improve the integration between zabbix and our configuration management, but I'm missing an API, CallBacks and Plugin-Interfaces.
Suggestion:
- provide an API to do what the web-interface currently does and move the web-interface to use the API.
- add Callbacks/Actions/Events (you name it) for specific events (like add/del/change of a host, item, trigger, etc.) so we can actually inform other systems that something in zabbix changed
- add a simple way to add stuff to the web-interface - a "drop-php-file-here"-Plugin system would really help to extend zabbix
I'd really like to use the host profiles, but they don't actually provide the fields I need
Suggestion:
remove all the value stuff from the host_profile table. Provide a simple key - value table instead where everybody can define his own set of keys.
Maybe also add a way to provide a datasource (i.e. item) for each of these keys, so you could actually have current inventory-data collected by zabbix (stuff like "vm.memory.size / 1048576" comes to my mind).
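A minimal sketch of what such a key - value table could look like, using an in-memory SQLite database. The table name `host_profile_kv`, its columns and the helper functions are made up for illustration and are not part of the zabbix schema; `source_item` is the optional item key that would feed the value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per (host, key), with an optional item as data source.
conn.execute(
    """CREATE TABLE host_profile_kv (
           hostid      INTEGER NOT NULL,
           key         TEXT    NOT NULL,
           value       TEXT,
           source_item TEXT,
           PRIMARY KEY (hostid, key)
       )"""
)

def set_profile(conn, hostid, key, value, source_item=None):
    """Insert a profile entry, or overwrite the existing one for the same host and key."""
    conn.execute(
        "INSERT OR REPLACE INTO host_profile_kv VALUES (?, ?, ?, ?)",
        (hostid, key, value, source_item),
    )

def get_profile(conn, hostid):
    """Return all profile entries of a host as a plain dict."""
    rows = conn.execute(
        "SELECT key, value FROM host_profile_kv WHERE hostid = ?", (hostid,)
    )
    return dict(rows.fetchall())

set_profile(conn, 1, "rack", "B12")
set_profile(conn, 1, "memory_mb", "4096", source_item="vm.memory.size")
```

Everybody could then define their own keys without a schema change, and the server would only have to refresh `value` for rows that name a source item.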
Sometimes some of our custom checks go to the "unsupported" state after something weird happens on the monitored system that confuses the script. We'd like a notice in these cases.
Suggestion:
add an event to make it possible to write a trigger that notifies if an item switches to "unsupported"
We use vfs.fs.size to monitor filesystem utilization. Because not all of our systems have the same filesystem layout, we sometimes have issues with that. I think the behaviour could be easily improved.
Suggestion:
make vfs.fs.size return "unsupported" if the directory specified is not a mount point. Currently, if I monitor /, /var, /usr and /home and all of these directories are simply on the / filesystem, I get 4 notifications instead of 1, because I monitor the same filesystem four times (as /, as /var, as /usr and as /home).
Together with the above suggestion (notify if item switches to unsupported), you'd even notice if a monitored filesystem was unmounted.
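To illustrate the mount-point suggestion, here is a rough sketch in Python for a Unix-like system. The function name is invented, and returning `None` merely stands in for the "unsupported" state; this is not how the agent implements vfs.fs.size.

```python
import os

def fs_size_or_unsupported(path):
    """Return (total_bytes, free_bytes) for a mount point, or None otherwise.

    None stands in for the "unsupported" item state (an assumption of this sketch).
    """
    if not os.path.ismount(path):
        return None
    st = os.statvfs(path)  # Unix only
    return st.f_blocks * st.f_frsize, st.f_bavail * st.f_frsize
```

With a check like this, /var on a box where it is just a directory on / would report "unsupported" instead of duplicating the values of /.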
We make heavy use of templates and encountered the following issue:
say you have the templates mysql_t and httpd_t to monitor mysql and httpd. Then you create a template zabbixserver_t that is linked to mysql_t and httpd_t and another template like mediawiki_t, that is also linked to mysql_t and httpd_t.
Now if you try to assign zabbixserver_t and mediawiki_t to a single host, you'll get a message that the templates are incompatible.
Suggestion:
I guess the linking needs a bit of work. Maybe this scheme would help:
for each template that needs to be linked:
- recursively collect templates that are already linked
- recursively collect templates to link now
- calculate difference
- only add the items/triggers/stuff that are linked directly (i.e. not through another template) to the templates that weren't added so far
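The scheme above could be sketched roughly like this. It is only an illustration of the idea, not zabbix code: `links` is an assumed dict mapping each template to the templates it links to, and the function names are made up.

```python
def expand(template, links, seen=None):
    """Recursively collect a template and everything it links to."""
    seen = set() if seen is None else seen
    if template not in seen:
        seen.add(template)
        for child in links.get(template, ()):
            expand(child, links, seen)
    return seen

def link_templates(host_templates, new_templates, links):
    """Return the templates whose own items/triggers still have to be added."""
    already = set()
    for t in host_templates:
        expand(t, links, already)       # templates that are already linked
    added = []
    for t in new_templates:
        wanted = expand(t, links)       # templates to link now
        missing = wanted - already      # the difference
        added.extend(sorted(missing))   # only these contribute their direct items
        already |= missing
    return added

links = {
    "zabbixserver_t": ["mysql_t", "httpd_t"],
    "mediawiki_t": ["mysql_t", "httpd_t"],
}
result = link_templates([], ["zabbixserver_t", "mediawiki_t"], links)
```

In this example mysql_t and httpd_t are only pulled in once, by zabbixserver_t, so mediawiki_t no longer conflicts with it.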
Ah, and btw, I see many people demanding that you write weird auto-update and check-distribution stuff for the agent.
Short answer: don't do it
Longer answer:
Updating the agent, its configuration and its scripts is the job of the system's patch and configuration management. If you want configuration management, implement one; zabbix is for monitoring, not configuration management.
-
about php embedding in zabbix...
and
i've posted a patch to add php support natively in zabbix server....
that enables zabbix to call php code directly without forking...
for the moment only added in item poller... but can be added to discovery/trigger in the same way...
-
Sizing (large environments)
Hi
we are trying ZABBIX in an environment with a large number of servers we operate for several customers.
Although we have not installed it (1.4.2) "everywhere", yet, we are pleased with
the stability of the agents (Windows and Solaris 8/9) and the server
(PostgreSQL, FreeBSD 6.2, etc.).
We need distributed monitoring, of course.
That means:
- agent configuration is done on one central server and this is propagated via ZABBIX server slaves to the agents
- triggers are forwarded from the slaves to the master(s)
We don't need all the data on the central server, but it is still
necessary to configure the slaves to send the data to a certain master.
In some cases we will have more than one ZABBIX-Server per customer,
so we need a sort-of tree of Servers.
That means, some slaves need to propagate the data from the agents
to a master while others need not to do it.
Summary:
- configuration information goes "down" via all slaves to the agents
- triggers go "up" from all slaves up to the root of the ZABBIX-Server tree
- data go "up" from a slave to a certain master, but only if configured
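The three rules in this summary can be sketched as a small tree model. This is only an illustration of the requested behaviour, not of how ZABBIX 1.4.2 works; the class `Node`, the flag `forward_data` and the node names are assumptions.

```python
class Node:
    """One ZABBIX server in the tree; forward_data says whether it passes data upwards."""
    def __init__(self, name, parent=None, forward_data=False):
        self.name = name
        self.parent = parent
        self.forward_data = forward_data

def trigger_path(node):
    """Triggers always travel up to the root of the tree."""
    path = []
    while node.parent is not None:
        node = node.parent
        path.append(node.name)
    return path

def data_path(node):
    """Data travels up only as long as each slave is configured to forward it."""
    path = []
    while node.parent is not None and node.forward_data:
        node = node.parent
        path.append(node.name)
    return path

root = Node("root")
regional = Node("regional", parent=root, forward_data=False)
site = Node("site", parent=regional, forward_data=True)
```

Here triggers raised at "site" reach "regional" and "root", while its data stops at "regional".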
I guess this will allow a better scaling of the ZABBIX distributed monitoring
than the current (1.4.2) version.
Another important point is the filtering of the Windows event log.
We don't need all eventlog entries of all monitored Windows servers on our
central server.
First of all, just the ones we are interested in (to calculate the triggers) and
second, as with the data mentioned above, just up to a certain master in
the tree of ZABBIX Servers.
If this sounds confusing, just reply or send a PM for more detailed discussion.
I hope this will be included in 1.6!
Regards,
Norbert.
-
SNMP improvements
SNMP support needs to be vastly improved.
Should be able to make use of SNMP indexes.
automatically discover mounted partitions (with the ability to update in future)
automatically discover ethernet interfaces (with the ability to update in future)
SNMP walks
Present a web page that would allow one to check off which SNMP walk data to add to a host/template
SNMP per host community string override and port override
-
Stacked Graph Improvements
- Ability to overlay non-stacked items as lines on top of a stacked graph of filled items. One use of this would be to graph free memory, buffer memory, cached memory as stacked and then draw a line across the top of the graph (unstacked) for total memory.
- Ability to stack downwards from a fixed point or an item value. That way you could start at total memory and stack downwards for free, buffer and cached memory.
- Set the Y axis to an item value rather than a fixed value. Again, set the scale of the graph to total memory as actually read rather than having to have different templates for 4G, 8G, 12G, 16G boxes, etc...
- And I'd like to be able to associate charts with templates and create one-page views of single hosts as well, with a single administrative point to manage them.
Munin does all of this and the whole point is to try to kill munin use cases.
-
i like the first idea (counting the quoted ones: setting the Y axis to an item value rather than a fixed value), but i'm not sure what you mean with the second one (associating charts with templates and creating one-page views of single hosts).
if by charts you mean graphs, you can assign graphs to templates for quite some time now.
as for the second part, if you mean something similar in functionality to "screen templates", that should appear soon
-
I believe I mean "screen templates".
Right now you can associate graphs and items and triggers with a template and have those graphs and items and triggers show up on associated hosts. I'd like the same for screens. That way I set up a template and give it a screen containing graphs for cpu, memory, disk, etc. and I wind up with individual per-host screens for all my associated hosts getting automagically created...
-
i've got another request:
i'd like to be able to request that polling intervals (aka item 'delays') be able to be capped so that they occur no more often than every 60 seconds. ideally i'd like this enforced both in the API so that users couldn't attempt to set more aggressive polling intervals, and in the server so that database entries that were more aggressive were 'rounded up' to 60 seconds.
similarly, i'd like to be able to globally cap history to no more than 7 days.
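a tiny sketch of the two caps, assuming delays are given in seconds and history in days. the constant and function names are made up; the same clamp would have to run both where users submit values and where the server reads them from the database.

```python
MIN_DELAY_SECONDS = 60  # poll no more often than once per minute
MAX_HISTORY_DAYS = 7    # keep no more than a week of history

def clamp_item_settings(delay_seconds, history_days):
    """Round an item's delay up to the floor and its history down to the cap."""
    return (
        max(delay_seconds, MIN_DELAY_SECONDS),
        min(history_days, MAX_HISTORY_DAYS),
    )
```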
-
Agent protocol/error handling improvements
Suggestions/feature requests on zabbix_agentd and server status/queue improvements:
Monitoring queue tracking information. We need the ability to detect when the queue is "behind schedule", i.e. passively monitored resources (via agentd or SNMP) have missed their polling schedule, or actively sent agentd metrics are not received. Currently, tuning the number of threads for trapper/poller processes is hard, since no data exists on WHEN any metrics are missed. As it is, I only know there's a problem, and that I have too many or too few threads, when I browse graphs and see periodic data points missing.
If monitored systems are missed, because not enough poller threads are running to keep the queue satisfied, or not enough trapper threads are running to satisfy incoming packet rates/clients, it needs to be manifested somewhere. As it is, I'm trying to debug issues across a thousand nodes by setting up 'nodata' triggers on metrics to see if they're being missed! Not feasible in production.
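As a rough sketch of the kind of "behind schedule" report meant here: given each item's polling interval and the time of its last check, list what is overdue and by how much. The tuple layout, the function name and the `grace` parameter are assumptions for illustration, not anything the server exposes today.

```python
def overdue_items(items, now, grace=0):
    """Return (name, seconds_late) for every item past its due time, latest first.

    items: iterable of (name, delay_seconds, last_check_timestamp).
    grace: lateness in seconds that is still tolerated.
    """
    late = []
    for name, delay, last_check in items:
        lateness = now - (last_check + delay)
        if lateness > grace:
            late.append((name, lateness))
    return sorted(late, key=lambda pair: pair[1], reverse=True)

items = [("cpu", 60, 1000), ("disk", 300, 1000), ("net", 30, 1000)]
report = overdue_items(items, now=1100)
```

A steadily growing list would point at too few pollers, without having to set up 'nodata' triggers per metric.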
BTW, excellent progress in 1.4.3, this just fixed two bugs I was reporting today!
Cheers,
/eli
-
I don't know if this a 1.6 request, or a 2.0 feature request...
It'd be nice to have a my.zabbix functionality so that users could log in and set up their own custom graphs and charts without polluting other users' views, but still be able to bookmark and share the information that they find most useful.