Feature suggestions should be added and described here. Descriptions should be as specific as possible, to the point of being usable as a development specification. Implemented items should be moved to the appropriate section, noting, if possible, the SVN revision that added the feature to trunk and the first stable version to have that feature. Descriptions should be contained on this page; information from forum posts should be moved here & improved upon.
Each feature request should state, if applicable, the exact locations in the GUI and the Zabbix version on which it is based.
If possible, scope should be evaluated both as in work required (for potential funding), and as in “can this go into stable branch or will it go in trunk”.
New features should never call for implementation of non-free technology (think flash). Additional dependencies should be carefully evaluated.
If the description is large enough, such a feature specification should get its own page, which in turn would be linked from this page.
It would be useful to see contact information on triggers page. This information would be taken from Inventory for that host (or any list of arbitrary inventory fields, like 'owner, responsible for, warranty by').
To be even more useful, owner/responsible for could be assigned to applications, making it possible to display the correct contact for hardware, OS and application related problems.
Location : Monitoring → Triggers
Hosts and triggers info screen elements (also includes 1.6 dashboard's “system status”) should have clickable categories. Clicking on trigger severity category would open trigger view filtered by severity. For example, clicking on “3 High” would open trigger list, filtered to show only high severity triggers.
Clicking on hosts (for example, “2 Not available”) would open – what ? Should that be an existing page or should there be some new host listing monitoring view ?
Location : Monitoring → Screens
Frontend should query the Zabbix server & show which features are supported (IPMI, SNMP, Jabber etc), or the server could store these values in the database, which the GUI in turn reads.
Overview is hard to use when there are a lot of hosts/items. To solve this problem, expandable grouping could be used by :
If the data line and the trigger line in a graph coincide (for example, if the data line is all “1” and the trigger line is also drawn at 1), such a graph is hard to read. A way to draw these lines differently should be developed.
Dashboard does not list the 'OK' trigger count while other locations do, so this should probably be unified to some extent;
Dashboard has reversed order when compared to other locations, probably should be unified.
If two triggers are in ON state and one depends on another, the depending trigger is not shown in the trigger list. If both triggers go OFF, they both are displayed in trigger list. In such a case, the depending trigger should not pop up in the trigger list flashing. This should be like that in all locations where triggers are shown. Location : Monitoring → Triggers
If there are enough items in a graph, it soon becomes very hard to understand, especially for items with similar colours.
To ease viewing of such graphs, interactive mouseover effects would help a lot.
Two modes would indicate the interesting items :
Location : Monitoring → Graphs
Graph legend would work as a switch for the associated item in the graph. Clicking on a legend box would hide/show the item from the graph.
Location : Monitoring → Graphs
The current graph rendering in Zabbix leaves something to be desired. An improvement would be SVG graphs, for examples of this see PlotKit, SVG::Graph or the bandwidth graphs in the openwrt distribution Tomato. Using SVG graphs would ease the implementation of the Dynamic Graph suggestions above.
Location : Monitoring → Graphs
This single package is considered above all others. Oracle has selected it as the core of their Big Data Statistical Products.
This will single out Zabbix above all other monitoring platforms.
R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.
One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.
R is available as Free Software under the terms of the Free Software Foundation's GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.
When a trigger is added or its expression is changed, such a trigger should not appear flashing in the monitoring section. This is especially annoying on a helpdesk display when new hosts with a lot of triggers are added or existing triggers are changed.
Also, some UNKNOWN → FALSE scenarios make the trigger show up flashing. Only TRUE → FALSE should pop such triggers up.
Location : Monitoring → Triggers
There should be a user profile setting for the FALSE/TRUE flashing trigger period. Zero effectively disables flashing for that category.
Location : Profile
Location : Monitoring → Triggers
In some cases, triggers that are enabled by single SNMP trap don't go off because the disabling trap has been lost for some reason.
It would be useful to have simple “Disable trigger” button/link in trigger list.
It might be reached by a popup menu on the status field, or have its own column.
Location : Monitoring → Triggers
It would greatly improve appearance of Zabbix web frontend, if it was able to transfer information in background and update page elements inline.
For example, even a large screen with many objects would not flicker or change layout if each object was downloaded in background and replaced after it is fully retrieved.
Similarly, there would be no need to regenerate column headings in overview and other non-dynamic parts, if these pages were able to obtain only changed data (values, trigger status and other significant and dynamic content).
Location list where this matters :
As with items, hosts should be collapsible and expandable in latest data.
Location : Monitoring → Latest data
There should be an overview mode where only true items are shown. This would mean that only columns and rows that have at least one true cell are displayed.
This should work both for triggers and items.
It should be possible to set this mode when adding overview element to screen.
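The filtering rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `visible_overview` and the `(host, name) -> bool` input shape are hypothetical and not part of Zabbix.

```python
def visible_overview(cells):
    """Return (hosts, names) that have at least one true cell.

    cells maps (host, name) -> bool, where name is a trigger or item name.
    Rows and columns without any true cell are left out.
    """
    hosts = sorted({host for (host, _), value in cells.items() if value})
    names = sorted({name for (_, name), value in cells.items() if value})
    return hosts, names
```

A frontend would then render only the returned hosts and names, which keeps the overview small when most cells are false.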
There should be a way to set map colours according to trigger severity and not to trigger value colour. Should this be new default or should it be configurable per object ?
If there are a lot of screens/maps, they are all tucked in a single dropdown menu. There should be some way to group & drill-in select them.
While current dashboard can be somewhat customised, that's not enough.
It would be awesome if dashboard was user-customizable, some sort of a screen-dashboard mutation.
This would also include ability to create multiple dashboards and link one to another (creating a dashboard item where problems per server group are listed, and clicking on one of the displayed items would bring up overview of this group etc).
Slideshows should have 'back', 'pause/resume' and 'forward' buttons.
Access to IT services should only be allowed according to user permissions.
When a graph is created, a fixed number of proportionate timestamp periods is used (24?), so they mostly do not fall on hourly, daily, weekly, monthly etc boundaries and end up being something like “02.10 11:32”. It would be nice if timestamps were adjusted according to scale. For example, an hourly graph would have a timestamp every 5 minutes, a 2-3 hour graph every 10 minutes, a 4-8 hour graph every 15 minutes and so on (these examples are not calculated to fit nicely).
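One possible way to pick scale-dependent timestamps is sketched below. The list of candidate steps, the limit of 24 labels and the function names are assumptions made for illustration. Timestamps are plain epoch seconds, so alignment here is to UTC boundaries and time zones are ignored.

```python
# Candidate label steps in seconds (hypothetical values, not a Zabbix setting).
STEPS = [60, 300, 600, 900, 1800, 3600, 3 * 3600, 6 * 3600, 12 * 3600, 86400]


def pick_step(period, max_ticks=24):
    """Return the smallest 'round' step that gives at most max_ticks labels."""
    for step in STEPS:
        if period / step <= max_ticks:
            return step
    return STEPS[-1]


def aligned_ticks(start, end, max_ticks=24):
    """Return timestamps in [start, end] that are multiples of the chosen step."""
    step = pick_step(end - start, max_ticks)
    first = -(-start // step) * step  # round start up to the next multiple
    return list(range(first, end + 1, step))
```

With these assumed values a one hour graph gets a label every 5 minutes, each falling on a round clock time instead of an arbitrary offset.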
Discussion at http://www.zabbix.com/forum/showthread.php?p=42748
Zero times are just “visual hints”. They're vertical red lines to mark day beginning in daily graphs, monday 0:00 in weekly graphs, 1st of the month at 0:00 in monthly graphs and so on.
Period compare triggers would allow a trigger to evaluate and compare data from different time slices of the same item, e.g. a trigger that compares the average traffic of an item over the last hour with the hourly average of the same item one week ago.
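The comparison could look roughly like this. It is a minimal sketch, assuming history is available as `(timestamp, value)` pairs; the function names and the `ratio` threshold are hypothetical and do not describe existing trigger syntax.

```python
WEEK = 7 * 86400


def average(samples, start, end):
    """Mean of values with start <= timestamp < end, or None if there are none."""
    values = [value for timestamp, value in samples if start <= timestamp < end]
    return sum(values) / len(values) if values else None


def period_compare(samples, now, window=3600, shift=WEEK, ratio=1.5):
    """True if the last window's average exceeds ratio times the average
    of the same window taken `shift` seconds earlier."""
    current = average(samples, now - window, now)
    previous = average(samples, now - window - shift, now - shift)
    if current is None or previous is None:
        return False  # not enough data to compare
    return current > ratio * previous
```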
Current implementation requires manual entering of object coordinates when creating or editing a map.
This should be improved to allow positioning objects by dragging them.
Location : Configuration → Maps → Edit (of an existing map).
When adding an item, it would be better to select Units from a menu (listing all supported units).
Columns should be wrapped in host groups view.
Location : Configuration → Hosts → Host groups
Location : Configuration → Hosts → Template linkage
(maybe also 'Templates' ?)
Basic : make 'Edit' links in map and screen configuration same style as all other links – that should make them stand out a little bit better.
Advanced : Move 'Edit' to the left hand side and maybe create two links 'Properties' (works the same as currently clicking on screen/map name) and 'Edit' ? Or upon clicking on name present a popup menu with 'Properties' and 'Edit' ?
Location : Configuration → Screens
Location : Configuration → Maps
Ability to import/export any object would help in many cases (simple backup, complex or tedious modifications etc).
Sections of specific interest should be mentioned below.
= Screens =
= Connected applications and value mappings =
If exported items have applications/value mappings linked, these should be appropriately exported and correctly imported later.
If an entity is linked into host or template from another template, some parameters can be changed for that child node independently (for items – interval, history, activation etc; for trigger – dependencies, expression, name, severity etc).
Unfortunately, most of the changes upstream null the changes downstream, it is also near impossible to find out which entities have been modified downstream. This makes template usage more complicated and inconvenient.
To improve the situation, the following changes are suggested :
If an entity is changed downstream, it should be possible to “link” individual fields and the whole entity back to the upstream entity (for example, configuring a trigger should allow “linking” back individual fields like name/expression, and opening the item/trigger list in configuration should allow “linking” back individual entities).
If imported trigger has the same name, but different expression, a duplicate trigger is created. It would be useful to have Add/Update/Skip for triggers when importing.
There should be a way to add multiple items to graph. Items could be displayed in normal list, marking them with checkboxes and clicking 'add' button would add them all.
What about colours, should there be some sort of preferred colour order (like OpenOffice.org chart has), where colours are applied from this colour list to the added items (skipping already present colours in graph) ?
In configuration, there are several separate pages that contain one or two options. These should be unified in single page to reduce jumping around in the GUI.
The following Configuration → General sections should be unified:
Location : Configuration → General
Currently each graph item has to be opened for configuration individually. Graph item configuration should be moved to the main graph configuration window. For this, table titles should be added, and one additional parameter added to the main screen - Y axis side.
While popup could be used for new item adding, that also could be handled by adding a new line to the item table & allowing editing of parameters right away.
Currently, refreshing a page after doing some configuration changes attempts to redo these changes - for example, add a new host, then refresh the host list. Frontend attempts to add the host again. Maybe similar solution can be used as in most forums/comment sections, where refreshing the page after adding a new comment does not add another copy of it.
A central template storage that can be accessed by Zabbix directly. Templates could be downloaded and installed from the Zabbix interface directly. Templates could be searched and listed from within the Zabbix interface. Templates could have comments and status (for example, “verified”, “confirmed” and “unknown”). If a template includes other templates, the linked-in templates are automatically installed.
Currently, most periods in Zabbix frontend can only be set in seconds (item refresh period, auto-logout in user profiles etc). This requires user to calculate period each time manually. It would be nice to have a widget that allows to easily set seconds, minutes, hours, days, weeks, months… It could as well be simple dropdown next to the input field that is used by all fields where period has to be inserted (including item history and trend keeping etc).
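Behind such a widget, the conversion to seconds could be as simple as the sketch below. The suffix letters and the function name `to_seconds` are assumptions for illustration. Months are left out on purpose because their length varies.

```python
import re

# Seconds per unit; a bare number is treated as seconds.
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def to_seconds(text):
    """Convert input such as '90', '5m' or '2h' to a number of seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([smhdw]?)\s*", text)
    if not match:
        raise ValueError("unsupported period: %r" % text)
    number, unit = match.groups()
    return int(number) * UNITS[unit or "s"]
```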
Multiple sections of GUI would benefit from ability to set arbitrary time period, not only reporting (thus this can be considered a bit miscategorized). Specific locations :
There should be more internal linking to allow easier navigation
Location : Configuration → Hosts → Template linkage
Location : Configuration → Hosts → Host groups
Trigger and event pages should provide popups with links to graphs (what graphs ? Simple graphs of all items involved, all custom graphs involved ?).
Reasoning – when looking at triggers and events page, it is quite often desirable to view more contextual information for the particular event. Currently, that means manually looking for the host in corresponding pages.
Location : Report → Most busy triggers (both hosts and triggers)
Location : Monitoring → Triggers
Location : Monitoring → Events
Location : Administration → Users (groups)
Location : Administration → User groups (members)
Current slideshow implementation allows to cycle through screens only. This can make it cumbersome in some cases to create a screen containing a single element only to include in a slideshow.
Slideshows should allow adding elements as slides the same way screens do.
Location : Configuration → Screens → Slideshows
It should be possible to add Monitoring → Triggers or Monitoring → Dashboard → Last N issues as a screen element, with all the accompanying options.
Location : Configuration → Screens → Screens
There should be an ability to add multiple labels to map objects. Each label could be a custom string, object IP address or DNS name, in any order. Thus it would be enough to change host IP in host configuration and it would change automatically in the map. Label customisation probably is best done supporting macros in them.
Location : Configuration → Maps
There should be bash-like variable expansion so that elements can be mangled. For example, if routers are named “Location_router”, on a router map one would insert name as {HOSTNAME%_router}, which would turn the name into “Location”.
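A minimal sketch of the suffix-stripping form is given below. The function name `expand`, the `values` dictionary and the sample hostname are hypothetical. Only literal suffix removal with `%` is handled, not bash's full pattern matching.

```python
import re

# Matches {NAME} and {NAME%suffix}.
MACRO = re.compile(r"\{([A-Z.]+)(?:%([^{}]*))?\}")


def expand(label, values):
    """Expand macros in label, stripping a literal suffix when one is given."""

    def replace(match):
        name, suffix = match.groups()
        if name not in values:
            return match.group(0)  # leave unknown macros untouched
        value = values[name]
        if suffix and value.endswith(suffix):
            return value[: -len(suffix)]
        return value

    return MACRO.sub(replace, label)
```

For a host named “Riga_router”, `expand("{HOSTNAME%_router}", {"HOSTNAME": "Riga_router"})` gives “Riga”.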
When creating/editing a graph, there should be an ability to add an item from the template, thus obtaining items from all hosts that the template is linked to in the graph. There should be an option to limit items added this way to only single group.
If a host is removed from the group or added to it, graphs created this way should be automatically updated.
Location : Configuration → Graphs
It should be possible to set a slideshow as a screen element.
This would allow, for example, creating an overview screen that had generic part at the top and different detailed statistics changing at the bottom.
This would probably be most useful with inline background object updating.
Location : Configuration → Screens → Slideshows
There should be an ability to autoscale objects and object collections like graphs, screens, maps etc.
This would probably have to be implemented both client and serverside, where server generates initial images according to client resolution at the moment of the request, and client would rescale those images later, if needed.
This would probably require properties for items like “do not scale horizontally/vertically” - or maybe for a greater flexibility minimal/maximal scaling limits for both vertical/horizontal changes.
As a subset, when creating items, their size would be suggested based on client resolution.
Rationale – it is both very cumbersome to adjust item sizes by pixels until they are more or less right, and layout still breaks or overflows screen on different resolutions or if simple things like window border or decorations change.
Actually, already implemented for graphs, probably should be similarly implemented for maps & screens. Screen items should resize proportionally (with min/max scaling limits ?).
Currently, audit only notes that an item has been changed. It should be more verbose and show which part of the item got changed, maybe even 'from' and 'to' values.
Audit log should show IP address for user logins.
Entries “Host Updated Old status [1] New status [0]” should show actual hostname, and maybe a description of what 'status' means.
Frontend should be able to set the Y axis to an item value rather than a fixed value. Scale of the graph could be set to total memory as actually read rather than having to have different templates for 4G, 8G, 12G, 16G boxes, etc…
This could be both derived from other item or a special category (“memory” etc). Derived might not be so automatic, but much more flexible, so probably is the way to go.
It should be possible to disable legend for individual graphs.
It should be possible to customise legend (how exactly ?).
Stop tracking client status in a way that is not per window. Opening several hosts/templates in different windows/tabs very often causes an incorrect host to be opened in another window, and sometimes operations are performed on the wrong host.
Scenario : open items for one host, open items for another host in a new tab, refresh first tab. It changes to the second host – should not happen.
Status should be tracked only for the existing execution path, maybe even everything in GET variables to ease linking.
It would be useful to create graphs from trigger states, sort of like an enhanced availability report/SLA.
For example, a set of triggers on a network interface:
Then a pie (or any other type) chart could be created based on how much time that trigger spent in any of the 4 states.
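The data for such a chart could be derived from state changes as sketched here. The input shape, the function names and the state names used in the usage note are hypothetical, since the exact set of states is not given here.

```python
def time_in_states(changes, end):
    """Sum the seconds spent in each state.

    changes is a time-ordered list of (timestamp, state) pairs;
    end is the timestamp at which the reporting period stops.
    """
    totals = {}
    boundaries = [timestamp for timestamp, _ in changes[1:]] + [end]
    for (start, state), stop in zip(changes, boundaries):
        totals[state] = totals.get(state, 0) + (stop - start)
    return totals


def shares(totals):
    """Convert per-state seconds into fractions of the whole period."""
    whole = sum(totals.values())
    if not whole:
        return {}
    return {state: seconds / whole for state, seconds in totals.items()}
```

For changes `[(0, "up"), (60, "down"), (90, "up")]` and `end=120`, the interface spent 90 seconds in “up” and 30 seconds in “down”, so the pie slices would be 75% and 25%.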
Permission model should combine available permissions, instead of using first encountered match. For example, if user belongs to two groups and one has r/o access to a resource, while other has r/w, user should get r/w access in all cases to this resource.
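Combining permissions this way amounts to taking the highest level granted by any of the user's groups. A minimal sketch follows, with hypothetical level constants and data shapes. How an explicit deny should interact with this is not specified above and is not modelled.

```python
NONE, READ_ONLY, READ_WRITE = 0, 1, 2


def effective_permission(user_groups, rights, resource):
    """Return the highest permission any of the user's groups grants.

    rights maps group name -> {resource: level}.
    """
    return max(
        (rights.get(group, {}).get(resource, NONE) for group in user_groups),
        default=NONE,
    )
```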
Graph naming in Zabbix currently is non-flexible and often graph title is not quite correct. To improve this, there should be a way to set custom graph titles that would also support macros. One macro possibility :
In a template, a graph would be named “[22] Traffic ({:IF-MIB::ifAlias.22.last()})”, which would use the last value from the corresponding item.
When such a macro is set on a template, no hostname is given, in which case the hostname of the graph being displayed is used. Setting a hostname would retrieve the value from that host.
Proof of concept patch at http://www.zabbix.com/forum/showthread.php?p=38796
References:
http://www.zabbix.com/forum/showthread.php?t=5720
http://www.zabbix.com/forum/showthread.php?t=5951
Agent should natively support more of the platform's own subsystems/utilities used for monitoring and data gathering.
It should be possible to specify a different set of server addresses to be used as active servers. Problem - if servers send different lists of items to check, which one is authoritative ?
References:
http://www.zabbix.com/forum/showthread.php?t=5655
http://www.zabbix.com/forum/showthread.php?t=5666
Logging of zabbix_server process should be improved to provide more meaningful information in log messages. For example, “2903:20070907:185440 Timeout while answering request” isn't very descriptive.
Each message should be numbered & documented.
Each message should be on single line instead of being split in multiple lines:
Exact message improvements should be listed here:
Zabbix database schema should be updated to clearly separate data and configuration tables. This would allow for separate backups and restores without overwriting some configuration or restoring weird data states.
All tables should be documented (spec required).
It should be possible to automatically send a report by mail periodically or save it to disk.
It should be possible to assign several IP addresses/DNS names to a host, thus allowing checking on both of them and not triggering all of the items if one interface goes down.
It should be possible to create various trigger/action language versions, and assign a language to each user. This would allow to send generic versions to automated service desk interfaces and localised versions for some users.
It should be possible to assign a web scenario to a template
There are items that should be queried only one at a time. Possibility to add such support to zabbix should be evaluated.
Solution would allow defining item groups and assigning items to them. Then, no more than one item from the group will be queried at once (next item is queried only after previous query returns).
Maybe groups should have a configurable maximum number of simultaneous queries.
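One way to enforce such a limit is a semaphore per item group, sketched below. The class name `ItemGroup` and the callable-based interface are hypothetical. Whether this would live in the server or the agent is left open.

```python
import threading


class ItemGroup:
    """Limits how many checks from one group run at the same time."""

    def __init__(self, max_parallel=1):
        self._slots = threading.BoundedSemaphore(max_parallel)

    def query(self, check):
        # Blocks until a slot is free, so with max_parallel=1 the next
        # check starts only after the previous one has returned.
        with self._slots:
            return check()
```

Pollers would call `group.query(...)` instead of running the check directly; items outside any group would be unaffected.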
Use cases:
Implemented serverside or on agent ?
Ability to carry out an action only once within a configured interval and/or only during specific time frames. (action throttling)
That is, if a trigger is changing state every minute for 3 hours, we want to record that, but we would throttle notifications for these actions to one per hour or so.
Maybe doable in the escalations framework.
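The interval part of throttling can be sketched as follows. The class and method names are hypothetical. Events would still be recorded elsewhere, and the restriction to specific time frames is not covered.

```python
class Throttle:
    """Allows at most one notification per action within `interval` seconds."""

    def __init__(self, interval):
        self.interval = interval
        self.last_sent = {}  # action id -> timestamp of last notification

    def should_notify(self, action_id, now):
        last = self.last_sent.get(action_id)
        if last is not None and now - last < self.interval:
            return False  # still inside the throttling interval
        self.last_sent[action_id] = now
        return True
```

For a trigger that changes state every minute for 3 hours, a throttle with a 3600 second interval lets through three notifications.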
An API is desirable to make automation, integration and additional tool development possible. API specification/wishlist.
Along with the API, a commandline tool should be developed to expose API features and demonstrate capabilities.
Some sort of tree-like structure for all possible resources:
/View/Graphs/Graph1 /Hosts/Host1/Applications/App1/Items/...
For example, “system health/fans” etc
It should be possible to nest host and user groups.
There should be a hierarchical display for viewing such groups in Administration → Users → User groups.
All group dropdown selectors should show groups hierarchically whenever possible (a lot of subgroups could expand them a lot).
Should groups be able to contain both groups and hosts ? (probably a group should be allowed to contain only one type of sub-elements – either hosts/users, or other groups. It should be possible to change what a group contains – maybe by allowing to remove all existing elements from the group, then adding new elements of the other category)
Circular nesting should be checked and prohibited
What about upgrades: maybe it is worth providing a simple upgrade path for people who currently use some notation scheme (for example, allowing definition of a group separator, so that the group “UPS :: Something” would create two groups, “UPS” and “Something”, where “Something” would be a subgroup of “UPS”).
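Both the separator-based upgrade path and the circular nesting check can be sketched briefly. The function names and the parent-map representation are assumptions for illustration, not an existing schema.

```python
def build_tree(names, separator="::"):
    """Map each group path (a tuple) to its parent path, or None at top level."""
    parents = {}
    for name in names:
        path = tuple(p.strip() for p in name.split(separator) if p.strip())
        for depth in range(1, len(path) + 1):
            parents[path[:depth]] = path[: depth - 1] or None
    return parents


def would_create_cycle(parents, child, new_parent):
    """True if making new_parent the parent of child would form a loop."""
    node = new_parent
    while node is not None:
        if node == child:
            return True
        node = parents.get(node)
    return False
```

Here `build_tree(["UPS :: Something"])` yields a top-level “UPS” group with “Something” nested under it. Keying by full path keeps two subgroups with the same name under different parents apart.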
It should be possible to create a trigger that notifies if an item switches to “unsupported” state.
There are items that are used to create aggregate data in Zabbix, but are queried on different intervals. As these items change, stored data is not realistic - for example, memory queries happen at different time, thus total memory fluctuates above & below 100%, never really being 100%.
A solution could be item groups that should always be queried in a single request to agent, and agent in turn would try to grab the values as close to each other as possible. Ideally, if agent has to parse single file in /proc for multiple values, it takes a snapshot of this file & parses that.
Similar to previous, but not quite the same.
There should be a way to return multiple values in a user parameter (with custom delimiter) and get those into multiple items. This is desirable both for values that make sense when queried at the same time, and also because that would be cheaper in most cases.
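On the receiving side, splitting one result into several item values might look like this. The function name, the default delimiter and the item keys in the usage note are hypothetical.

```python
def split_values(raw, keys, delimiter="|"):
    """Split one user parameter result into a value per item key."""
    parts = raw.strip().split(delimiter)
    if len(parts) != len(keys):
        raise ValueError(
            "expected %d values, got %d" % (len(keys), len(parts))
        )
    return dict(zip(keys, (part.strip() for part in parts)))
```

For example, the output `12|34|0.5` with keys `mem.used`, `mem.free` and `load` would fill three items from a single query.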
Agent and server communication should be encrypted.
Could be implemented with preshared keys (Bacula style), certificates or whatever.
Should also include all connections to/from proxies.
It should be possible to automatically discover items to monitor. Results should be easily propagatable to template. There should be a way to initiate detection from the UI.
Item categories for automatic discovery:
To make this realistic, maybe user can configure possible values (network interface names, mountpoints etc) that are queried later.
Ideally, discovery based items are attached to template, discovery is performed on each host and then items for that host only are modified according to discovery results.
Based on agent being able to [semi]auto-discover items to monitor (like mounted fs, network interfaces etc).
When agent is [re]discovering these, there should be an option to create/update graph, containing discovered items.
It should be possible to use calculations in item keys. Such keys could be either native (thus allowing to directly specify subkeys in the expression), or only possible from existing items.
First approach would also require an intelligent mechanism to check that used keys are available in other items and not retrieve data again.
There should be a possibility to change agent configuration centrally, from the Zabbix server. Parts that should be centrally manageable:
Decent security model is critical.
Currently proxy connects to the server. This is troublesome in environments where proxies reside on restricted networks, so connecting other way should be supported (similar to normal/active agents).
There should be a way to hook into Zabbix discovery process & return data from arbitrary modules to the discovery process.
While this could be implemented as an external process which could later modify configuration in one way or another, having it better integrated in Zabbix would be preferable.
There should be a way to 'stack' discovery modules. For example, an arpwatch module could discover hosts, another module would determine the operating system, and an operating system specific module would determine specific items/templates to link to etc.
Communication method and required/optional parameters from discovery should be specified here.
Zabbix should be able to automatically gather information for inventory.
It should be possible to link any item to any inventory field.
A process that could check the database for errors and consistency would be helpful. It could check various things, starting with conformance to schema, missing indexes, ending with verifying that all linked entity ids are existing etc.
Ability to gather & submit stats of installations (purely opt-in) could be useful. Information like number of hosts, items, triggers, templates, database used, template nesting usage and usage of other features could be gathered & supplied.
There should be an ability to filter triggers by severity (including ability to filter higher/lower than). This includes trigger status as a screen element.
Note, only 'higher than' is currently implemented, but considered sufficient
Location : Monitoring → Triggers
In the following pages only fields where at least one item is assigned should be coloured :
Location : Monitoring → Triggers → STATUS OF TRIGGERS (link at the top of the page)
Location : Monitoring/Configuration → Screens
Location : Monitoring → Queue
Even if a trigger depends on another trigger that is in a “TRUE” state, it is still counted in “STATUS OF TRIGGERS” page. Dependent triggers should not be counted in this overview.
Location : Monitoring → Triggers → STATUS OF TRIGGERS
If an object has two triggers assigned and one depends on another, map shows “2 problems”.
Dependent triggers should not be counted in maps.
Location : Monitoring → Maps
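The counting rule for dependent triggers can be sketched as follows. The function name and data shapes are hypothetical, and only direct dependencies are considered here.

```python
def count_problems(states, depends_on):
    """Count triggers in TRUE state, skipping those whose dependency is also TRUE.

    states maps trigger -> bool (True means problem);
    depends_on maps trigger -> iterable of triggers it depends on.
    """
    active = {trigger for trigger, value in states.items() if value}
    return sum(
        1
        for trigger in active
        if not any(parent in active for parent in depends_on.get(trigger, ()))
    )
```

With two active triggers where one depends on the other, the map label would read “1 problem” rather than “2 problems”.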
If item type is integer, graph should use integers only for y axis.
It should be possible to filter audit log by:
It should be possible to set multiple filters simultaneously (for example, changes by a particular user to a particular hostgroup in a specified time period).
Server status currently checks for zabbix_server process on the local machine. This does not work in situations where frontend is on another machine than zabbix_server, or is chrooted on the same machine. Some other method should be used to check whether zabbix_server is running.
When inserting screen element “Triggers info”, there should be an ability to filter these by group.
Location : Configuration → Screens → Screens
There should be an ability to add non-monitored items to maps to improve readability. This currently can be imitated by adding them to the background image, but that is not flexible enough.
Location : Configuration → Maps
Current cloning feature preserves template linkage, but creates no items and triggers.
An improved cloning feature would allow to choose whether the cloned host should also have directly attached items and triggers.
There should be an easy way to assign a template to all hosts in a particular group.
Current trigger configuration places trigger expression as the second column. In case of longer trigger expressions, this pushes other columns past the screen border. This view should be improved by moving the expression column to be the rightmost one. Note: as implemented, there is still an error column after the expression column.
Currently it is not possible to edit media assigned to a user, it has to be deleted and recreated. It should be possible to directly edit media.
Location : Administration → Users
When exporting a host that is linked against a template, not the linkage, but items, triggers etc themselves are exported.
Export and import should allow preserving template linkage.
Should template linkage be exported against an ID or against a name ? What to do if upon import the linked template is missing ?
Location : Configuration → Export/Import
Screen editing should allow row and column insertion in arbitrary locations as well as arbitrary row and column deletion.
Location : Configuration → Screens
If user is not logged in, show last login after “No” in “Is online?” column (or “Never logged in”).
Location : Configuration → Users
A new macro/variable should be added – {TRIGGER.NPRIORITY}, which would expand to the numerical priority of the trigger.
zabbix_proxy now can be used for that