Suggestion: Changing of the Item status by the trigger switching.

  • Engraf
    Member
    • Sep 2014
    • 41

    #1

    Suggestion: change the status (enabled/disabled) of an item (or discovery rule/item prototype) when a trigger changes state.

    For example: the management MIBs contain several SNMP OIDs that represent the overall status of some server subsystem, and many others that represent detailed values of that same subsystem. Let's say the latter are monitored in Zabbix by some discovery rule. It would be convenient to enable/disable that discovery rule depending on a certain value of the overall status, probably via a trigger action.

    P.S. Hope my English is readable...
  • kloczek
    Senior Member
    • Jun 2006
    • 1771

    #2
    Can you give us an example with exact OIDs to show how it could work?
    http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
    https://kloczek.wordpress.com/
    zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
    My zabbix templates https://github.com/kloczek/zabbix-templates

    • Engraf
      Member
      • Sep 2014
      • 41

      #3
      For example: CPQIDA-MIB::cpqDaPhyDrvHasMonInfo (OID: 1.3.6.1.4.1.232.3.2.5.1.1.36).
      The description says:
      Physical Drive Has Monitor Information. All of the physical disk table fields except for the physical disk status (phyDrvStatus) and the bay location (phyDrvBayLocation) are invalid unless this field has a value of true(2).
      So it would be good to disable the items corresponding to those table fields if this OID's value is not true(2).

      Another example: CPQHLTH-MIB::cpqHeCritLogSupported (OID: 1.3.6.1.4.1.232.6.2.2.1).
      If it is "notSupported(2)", it would be better to disable the discovery rule that discovers the elements of the table CPQHLTH-MIB::cpqHeCriticalErrorTable (OID: 1.3.6.1.4.1.232.6.2.2.4).
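
      The decision itself is tiny; a minimal sketch of the two checks above could look like this. The function and macro names are made up for illustration and are not part of Zabbix; only the two enumerated values quoted above are used.

```c
#include <stdbool.h>

/* Enumerated values as quoted in the two MIB examples above. */
#define CPQ_DA_HAS_MON_INFO_TRUE      2 /* cpqDaPhyDrvHasMonInfo: true(2) */
#define CPQ_HE_CRIT_LOG_NOT_SUPPORTED 2 /* cpqHeCritLogSupported: notSupported(2) */

/* Hypothetical helper: keep the detailed physical drive items enabled
 * only while the drive reports that its monitor information is valid. */
bool drive_items_enabled(int has_mon_info)
{
    return has_mon_info == CPQ_DA_HAS_MON_INFO_TRUE;
}

/* Hypothetical helper: keep the cpqHeCriticalErrorTable discovery rule
 * enabled unless the critical error log is reported as not supported. */
bool crit_log_discovery_enabled(int crit_log_supported)
{
    return crit_log_supported != CPQ_HE_CRIT_LOG_NOT_SUPPORTED;
}
```

      The open question is where Zabbix would evaluate such a check, for example in a trigger action as suggested in my first post.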
      Last edited by Engraf; 30-01-2019, 13:48.

      • kloczek
        Senior Member
        • Jun 2006
        • 1771

        #4
        The same situation arises when using my IF-MIB template on particular devices.
        https://github.com/kloczek/zabbix-te...ter/MIB/IF-MIB
        On some devices it makes sense to use IF-MIB::if{In,Out}Octets and on others IF-MIB::ifHC{In,Out}Octets.
        Two approaches are possible with this template:

        1) Use two separate LLDs, with different filter definitions, to populate those two groups of items, triggers, graphs etc.
        This does not work because Zabbix currently does not allow multiple LLDs with exactly the same LLD item definition.

        With this it would be possible to generate or skip a whole set of items/triggers/graphs depending on which LLD is enabled/disabled.
        My idea for getting the necessary flexibility out of a generic IF-MIB based template was to use the same LLD item twice, like:
        Code:
        discovery[{#IFDESCR},IF-MIB::ifDescr,{#IFINDEX},IF-MIB::ifIndex,{#IFADMINSTATUS},IF-MIB::ifAdminStatus]
        and, depending on {#IFADMINSTATUS}, have the first LLD populate only IF-MIB::ifAdminStatus items when the interface is down, and the second LLD populate all necessary metrics for actively used interfaces when the interface is up.
        In other words, two LLDs with exactly the same LLD input item and different filters could theoretically solve your case as well.
        This matters because some modular switches may have hundreds of entries per port even if the modules are not installed. That creates a pathological situation where most of the LLD items hold useless data (zeroed values) or are in an unsupported state, and only a few collect useful data. Another issue is the graphs and triggers that get generated and (in the case of triggers) evaluated, plus the memory that has to be allocated for all those items on inactive interfaces. On the presentation layer, the device's list of graphs becomes very long and it is quite difficult to find the graphs for active interfaces.
        Using two LLDs would allow populating LLD objects like items, graphs and triggers only for the interfaces where they are needed.

        [1] There is yet another obstacle. Even if multiple LLDs with the same LLD item become possible, the server/proxy cannot combine OID groups from multiple LLDs when forming SNMP bulk queries. At the moment Zabbix can form a single bulk query for multiple OIDs, but only among OIDs generated from a single LLD.
        This is important as well, because I found that many devices with SNMP agents have real problems (SNMP timeouts) not above some rate of queried OIDs/s but after reaching a certain rate of SNMP queries/s. It is clearly visible in the data from my SNMPv2-MIB template, which monitors the number of SNMP messages/s (SNMPv2-MIB::snmp{In,Out}Pkts).

        IMO it is some issue in the net-snmp snmpd code, which produces SNMP timeouts after reaching a certain rate of SNMP queries/s. I suppose it must be some lock contention that causes those timeouts. Zabbix developers have already spent significant resources mitigating those issues and providing workarounds; however, IMO this needs to be solved on the SNMP agent side, because in some non-trivial cases SNMP timeouts can be observed with fewer than 1-2 SNMP (bulk or not) queries per 5-10s.

        2) Use a single LLD to populate everything possible and then manipulate each item and trigger state by disabling/enabling it depending on values of some already collected items (like IF-MIB::ifAdminStatus in the case of my IF-MIB template).

        This approach is worse because:
        - it puts more pressure on the configuration cache memory, because even if an item is not enabled its metadata is still stored in ConfigCache.
        - for graphs there is no such thing as enable/disable, so all graphs (for active and inactive items alike) show up in the list of graphs. The same goes for items, triggers, web scenarios and host prototypes.

        In the SNMP case this approach has one advantage: as long as all items are generated by a single LLD, they can be combined into a smaller number of SNMP bulk queries.
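
        To make the filter idea from approach 1) concrete, here is a minimal sketch of how the rows returned by the discovery[] item could be split between two rules by {#IFADMINSTATUS}. This is not Zabbix code: struct lld_row and lld_filter() are hypothetical names, and only the ifAdminStatus values (up(1), down(2), testing(3)) come from IF-MIB.

```c
#include <stddef.h>

/* IF-MIB::ifAdminStatus values as defined in IF-MIB. */
enum { IF_ADMIN_UP = 1, IF_ADMIN_DOWN = 2, IF_ADMIN_TESTING = 3 };

/* Hypothetical representation of one discovered row. */
struct lld_row {
    int         ifindex;       /* {#IFINDEX} */
    const char *ifdescr;       /* {#IFDESCR} */
    int         ifadminstatus; /* {#IFADMINSTATUS} */
};

/*
 * Copy into out[] the rows that pass one rule's filter:
 *   want_up != 0: rule creating the full metric set (admin status up)
 *   want_up == 0: rule creating only the ifAdminStatus item (everything else)
 * out[] must have room for n rows. Returns the number of rows copied.
 */
size_t lld_filter(const struct lld_row *rows, size_t n, int want_up,
                  struct lld_row *out)
{
    size_t i, kept = 0;

    for (i = 0; i < n; i++) {
        int is_up = (rows[i].ifadminstatus == IF_ADMIN_UP);

        if (is_up == (want_up != 0))
            out[kept++] = rows[i];
    }
    return kept;
}
```

        Both rules read the same discovery data, so every interface ends up in exactly one of them and no item is created twice.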

        Just to be clear: the above is not a direct comment on your proposal.
        It is only a summary of what I found while trying to solve a similar issue without touching Zabbix code to change its behaviour.


        If you have a vision of web frontend UI details that would allow enabling/disabling items/graphs/web scenarios/host prototypes depending on values of items already populated by LLD, that would be great, and it may provide a generic solution far beyond the SNMP area.
        All because it would then be possible to keep using the current SNMP bulk query batching, which still works only within a single LLD.

        Even if you can provide a draft/sketch of the necessary details (how to solve the above in the UI and internally in LLD processing, with the extra flexibility needed here), I think all of the above can now be solved relatively easily by relaxing the current rule on multiple LLDs with the same LLD item, as long as they use different filters and do not populate exactly the same items.

        Even if the current limitation is solved the way I propose, I think a few things still need to be polished within Zabbix:

        1) Proxy and server share the same code for querying SNMP items. This code receives JSON with all configuration data as input. IMO the whole batch of that data should be preprocessed to find OIDs, defined by LLDs and even by normal static items, that can be combined when forming SNMP bulk queries. Only this will make Zabbix's whole interaction over SNMP more robust.

        2) Work with the net-snmp developers to find the cause of all currently observed SNMP timeouts.
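
        A rough way to see what the preprocessing in 1) would buy is to count requests. The sketch below is only a counting model, not Zabbix code: struct snmp_item, bulk_requests() and the max_oids limit are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical description of one SNMP item to poll. */
struct snmp_item {
    int         hostid; /* host the OID is polled on */
    int         ruleid; /* LLD rule that created the item, 0 for a static item */
    const char *oid;
};

/* Two items may share a request if they are on the same host and,
 * when per_rule is set, were also created by the same LLD rule. */
static int same_group(const struct snmp_item *a, const struct snmp_item *b,
                      int per_rule)
{
    return a->hostid == b->hostid && (!per_rule || a->ruleid == b->ruleid);
}

/*
 * Count the requests needed when at most max_oids OIDs fit into one request.
 * per_rule != 0 models batching only within one LLD rule,
 * per_rule == 0 models batching across all rules and static items of a host.
 */
size_t bulk_requests(const struct snmp_item *items, size_t n, size_t max_oids,
                     int per_rule)
{
    size_t i, j, total = 0;

    if (max_oids == 0)
        return 0;

    for (i = 0; i < n; i++) {
        size_t count = 0;
        int seen = 0;

        /* Skip items whose group was already counted. */
        for (j = 0; j < i; j++) {
            if (same_group(&items[i], &items[j], per_rule)) {
                seen = 1;
                break;
            }
        }
        if (seen)
            continue;

        /* First item of a new group: count its members. */
        for (j = i; j < n; j++) {
            if (same_group(&items[i], &items[j], per_rule))
                count++;
        }
        total += (count + max_oids - 1) / max_oids; /* round up */
    }
    return total;
}
```

        With, say, two LLD rules of three OIDs each plus two static items on one host and max_oids = 8, per-rule batching needs three requests while per-host batching needs one.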

        Also, IMO 1) should not be limited to SNMP only.
        IMO the include/module.h::ZBX_METRIC type should generally be extended by applying a patch like the one below:

        Code:
        Index: include/module.h
        ===================================================================
        --- include/module.h    (revision 89206)
        +++ include/module.h    (working copy)
        @@ -49,7 +49,8 @@
         {
                char            *key;
                unsigned        flags;
        -       int             (*function)();
        +       int             (*function)();  /* item function obtaining single value */
        +       int             (*mfunction)(); /* item function obtaining multiple values */
                char            *test_param;    /* item test parameters; user parameter items keep command here */
         }
         ZBX_METRIC;
        Currently each ZBX_METRIC.function holds a pointer to a function that samples a single value for some item definition.
        A simple case where multiple values can be obtained in a single query is CPU utilisation: user, system, idle and other CPU metrics could all be obtained from the internal agent collector at once.
        The same goes for memory metrics. Currently, if a template has multiple vm.memory.size[] keys, sampling those items opens and closes /proc/meminfo multiple times, and because the metrics are sampled sequentially the data is slightly desynced. What if the whole agent configuration could be preprocessed to find that, because the vm.memory.size key has a non-null ZBX_METRIC.mfunction(), /proc/meminfo can be opened once to obtain all memory related item data?
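
        As an illustration of the "open once, read everything" idea, here is a minimal sketch that pulls several values out of /proc/meminfo-style text in one pass. It is not taken from the patch: meminfo_collect() is a hypothetical name, and it works on a buffer that the caller has read from /proc/meminfo in a single read.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scan "Key:   value kB" lines once and store the number found for each
 * requested key in values[] (same order as keys[]). Keys that do not
 * appear leave their values[] slot untouched.
 * Returns the number of lines that matched a requested key.
 */
size_t meminfo_collect(const char *text, const char *const *keys,
                       unsigned long long *values, size_t nkeys)
{
    size_t found = 0;
    const char *line = text;

    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t len = (eol != NULL) ? (size_t)(eol - line) : strlen(line);
        size_t i;

        for (i = 0; i < nkeys; i++) {
            size_t klen = strlen(keys[i]);
            const char *p;
            char *end;
            unsigned long long v;

            /* The line must start with exactly "Key:". */
            if (len <= klen || strncmp(line, keys[i], klen) != 0 ||
                line[klen] != ':')
                continue;

            p = line + klen + 1;
            v = strtoull(p, &end, 10);
            /* Accept the number only if it was parsed inside this line. */
            if (end != p && end <= line + len) {
                values[i] = v;
                found++;
            }
            break;
        }

        line += len;
        if (*line == '\n')
            line++;
    }
    return found;
}
```

        Because all values come from the same snapshot of the file, they are consistent with each other, which is exactly what sequential sampling cannot guarantee.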

        What I'm trying to say is that the same code used to speed up sampling of multiple values from the same source can be used on server and proxy as well: in simple checks and any agentless item types like SNMP, IPMI, telnet, ssh and web checks. Over JMX it is possible to query multiple MBeans as well.

        I started working on such an extension more than three years ago. When I was in Riga at the conference where the new Zabbix with master and dependent items was presented, I said in the Q&A part that I had a better idea for sampling multiple values from a single source.
        Now I think this is less about a better or worse idea, and that it would be good to have both techniques.

        Sometimes, if the low-level data sampling code can provide ZBX_METRIC.mfunction(), Zabbix would transparently obtain data faster and better (not desynced), without the user having to think about the fact that all those items operate on exactly the same source of metrics data.
        However, sometimes using plain master/dependent items can make sense as well.
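
        The transparent fallback could be sketched roughly as below. This is a simplified stand-in, not the real ZBX_METRIC: demo_metric, the callback signatures and demo_sample() are all assumptions for illustration (the real function pointers are declared without parameter lists, and how to pass and return the value lists is still an open question).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for ZBX_METRIC. */
typedef struct {
    const char *key;
    /* obtains a single value per call */
    int (*function)(const char *param, double *value);
    /* obtains all requested values in one call; may be NULL */
    int (*mfunction)(const char *const *params, double *values, size_t n);
} demo_metric;

/*
 * Sample n parameters of one metric key. If the metric provides mfunction()
 * it is called once for the whole list, otherwise function() is called once
 * per parameter. Returns 0 on success, -1 on failure.
 */
int demo_sample(const demo_metric *m, const char *const *params,
                double *values, size_t n)
{
    size_t i;

    if (m->mfunction != NULL)
        return m->mfunction(params, values, n);

    if (m->function == NULL)
        return -1;

    for (i = 0; i < n; i++) {
        if (m->function(params[i], &values[i]) != 0)
            return -1;
    }
    return 0;
}

/* Demo callbacks: each call stands for one open/read/close of the source. */
int demo_opens = 0;

int demo_single(const char *param, double *value)
{
    demo_opens++;
    *value = (double)strlen(param); /* placeholder value */
    return 0;
}

int demo_multi(const char *const *params, double *values, size_t n)
{
    size_t i;

    demo_opens++;
    for (i = 0; i < n; i++)
        values[i] = (double)strlen(params[i]); /* placeholder values */
    return 0;
}
```

        With three parameters, a metric that only has function() touches the source three times, while one that also provides mfunction() touches it once; the caller does not need to know which case applies.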

        Attached is the full patch, which adds ZBX_METRIC.mfunction() to the point where the Zabbix code compiles correctly. Effectively it is only the beginning of the mfunction modifications, because all the code using those additional mfunction() addresses is still missing.
        The next step should be, for example, to write VM_MULTI_MEMORY_SIZE().
        It still needs to be solved how to pass the list of metrics to be sampled to the mfunction() subroutines and how to return the list of obtained values.

        As I said, it is only a humble beginning... nevertheless, together with relaxing the rule on multiple LLDs with the same item definition, the above may provide a solution in the SNMP area.
        Attached Files
        Last edited by kloczek; 30-01-2019, 14:18.