Discussion thread for official Zabbix Template IPMI


    This thread is for discussing the official Zabbix Template for IPMI.
    The template and its details will be available in the Git repository.
    Zabbix is always looking for ways to improve our services and to make our users happier.
    We pride ourselves on doing our best each and every day, but we know that there is always something more to learn.
    We would like to hear from you about what you liked in the template and what you would improve.

    #2
    As posted in ZBXNEXT-5527, I have tested the template (085e4ea53e3) against about 15 different Supermicro motherboards and one Dell motherboard.

    EDIT: Forgot to mention, this is a daily Ubuntu 20.04 build running 5.0.3Alpha.

    I've made a table of the ipmi_sensor keys that ended up having issues on each MB.
    I've used two codes in the table: a value of 1 is a generic issue like
    Code:
    Value "3.225000" of type "string" is not suitable for value type "Numeric (unsigned)
    A value of 2 is a more pronounced issue; a table of the "2" errors for each MB is pasted below the big table.
    I've also used "variables" such as {#VOLTAGE} as much as possible, because listing 1.3, 3.3, 5, 12V and every other value separately would make the table unruly.
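For readers unfamiliar with the type-1 message: Zabbix's Numeric (unsigned) type holds 64-bit unsigned integers, so a decimal reading such as "3.225000" cannot be stored in it, while Numeric (float) can take it. The sketch below only illustrates that mismatch in Python. The function names are made up for this example, and this is not how Zabbix itself implements the check.

```python
# Illustration only: why "3.225000" does not fit an unsigned integer item.
# fits_numeric_unsigned / fits_numeric_float are hypothetical helper names.

U64_MAX = 2**64 - 1  # largest value a 64-bit unsigned integer can hold


def fits_numeric_unsigned(raw: str) -> bool:
    """Return True if raw is a plain non-negative integer within 64 bits."""
    text = raw.strip()
    # Only ASCII digits are accepted: no sign, no decimal point, no exponent.
    if not (text.isascii() and text.isdigit()):
        return False
    return int(text) <= U64_MAX


def fits_numeric_float(raw: str) -> bool:
    """Return True if raw parses as a float (Python's parser, used as a stand-in)."""
    try:
        float(raw.strip())
    except ValueError:
        return False
    return True


reading = "3.225000"
print(reading, "unsigned ok:", fits_numeric_unsigned(reading))
print(reading, "float ok:", fits_numeric_float(reading))
```

If an item really is typed as Numeric (unsigned), every voltage reading with a decimal point would be rejected in this way, which would be consistent with the "1" entries clustering on voltage sensors.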

    Hopefully all of this formatting comes out well/correctly.
    I'm not sure how to pull the raw LLD output, but I can provide that too if it would be helpful.
    Sensor Name H8-DGI-F X8-DTT-F X8-DTT-HEF+ X9-DBL-3F X9-DRG-HF+ X9-SRL-F X10-DRG-OT+ X10-DRT-PT X10-DRU-i+ X10-DRW-iT X11-DDW-NT X11-DPG-OT X11-DPH-T H11-DST-B H11-DSU-iN PowerEdge R610
    ipmi_sensor[-{#VOLTAGE}V] 1
    ipmi_sensor[{#VOLTAGE} PLL PG] 2
    ipmi_sensor[{#VOLTAGE}] 1 1 1 1 1 1 1 1 1 1 1
    ipmi_sensor[{#VOLTAGE}V AUX PG] 2
    ipmi_sensor[{#VOLTAGE}V BMC] 1 1 1 1
    ipmi_sensor[{#VOLTAGE}V LOM PG] 2
    ipmi_sensor[{#VOLTAGE}V PCH] 1 1 1 1 1 1 1
    ipmi_sensor[{#VOLTAGE}V PG] 2
    ipmi_sensor[{#VOLTAGE}VCC] 1 1 1 1 1 1 1 1 1 1
    ipmi_sensor[+{#VOLTAGE}V] 1 1 1 1 1
    ipmi_sensor[+{#VOLTAGE}VSB] 1 1 1 1 1 1 1 1 1 1 1 1
    ipmi_sensor[AVCC] 1
    ipmi_sensor[Cable SAS {#A/B}] 2
    ipmi_sensor[Chassis Intru] 2 2 2 2 2 2 2 2 2 2
    ipmi_sensor[CMOS Battery] 2
    ipmi_sensor[CPU VTT] 1
    ipmi_sensor[CPU{#CPUNUM} DIMM] 1 1 1
    ipmi_sensor[CPU{#CPUNUM} Temp] 2 2 2
    ipmi_sensor[CPU{#CPUNUM} Vcore] 1 1 1 1 1
    ipmi_sensor[CPU{#CPUNUM} VDIMM] 1
    ipmi_sensor[DKM Status] 2
    ipmi_sensor[Drive{#DRIVENUM}] 2
    ipmi_sensor[Fan Redundancy] 2
    ipmi_sensor[HDD Status ] 2
    ipmi_sensor[HEATSINK PRES] 2
    ipmi_sensor[iDRAC6 Ent PRES] 2
    ipmi_sensor[Intrusion] 2 2
    ipmi_sensor[MEM PG] 2
    ipmi_sensor[OS Watchdog] 2
    ipmi_sensor[P{#CPUNUM}_SOCDUAL] 1 1
    ipmi_sensor[P{#CPUNUM}_SOCRUN] 1 1
    ipmi_sensor[P{#CPUNUM}_VDDCR] 1 1
    ipmi_sensor[P{#CPUNUM}_VMEM{#DIMMSx4}] 1 1
    ipmi_sensor[Power Optimized] 2
    ipmi_sensor[Presence ] 2
    ipmi_sensor[Presence] 2
    ipmi_sensor[PS Redundancy] 2
    ipmi_sensor[PS Status] 2 2 2
    ipmi_sensor[PS{#PSNUM} Status] 2 2 2 2 2 2 2 2 2 2 2 2
    ipmi_sensor[PVNN PCH] 1 1 1
    ipmi_sensor[Riser Config] 2
    ipmi_sensor[RISER{#RISERNUM} PRES] 2
    ipmi_sensor[ROMB Battery] 2
    ipmi_sensor[Status] 2
    ipmi_sensor[STOR ADAPT PRES] 2
    ipmi_sensor[USB CABLE PRES] 2
    ipmi_sensor[VBAT] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2
    ipmi_sensor[VCORE PG] 2
    ipmi_sensor[Vcore] 1
    ipmi_sensor[Vcpu{#CPUNUM}] 1 1 1 1 1 1 1
    ipmi_sensor[VDD_33_DUAL] 1 1
    ipmi_sensor[VDD_5_DUAL] 1 1
    ipmi_sensor[VDIMM {#DIMM1}{#DIMM2}] 1 1 1 1 1
    ipmi_sensor[VDIMM] 1
    ipmi_sensor[VDimmP{#CPUNUM}{#DIMMSx3}] 1 1 1
    ipmi_sensor[VDimmP{#CPUNUM}{#MULTIDIM}]
    ipmi_sensor[VSB] 1
    ipmi_sensor[VTT ] 1 1 1
    ipmi_sensor[VTT PG] 2
    MB ERROR MSG
    H8-DGI-F Preprocessing failed for: [{"id":"PS Status","name":"(10.1).PS Status","sensor":{"type":8,"text":"power_supply"}, "reading":...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='Intrusion')].value.first()": no data matches the specified path
    X8-DTT-F Preprocessing failed for: [{"id":"+5VSB","name":"(7.1).+5VSB","sensor":{"type":2,"text":"voltage"},"reading":{"type":1,"tex ...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='CPU1 Temp')].value.first()": no data matches the specified path
    X8-DTT-HEF+ Preprocessing failed for: [{"id":"+5VSB","name":"(7.1).+5VSB","sensor":{"type":2,"text":"voltage"},"reading":{"type":1,"tex ...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='CPU1 Temp')].value.first()": no data matches the specified path
    X9-DBL-3F Preprocessing failed for: [{"id":"PS1 Status","name":"(10.1).PS1 Status","sensor":{"type":8,"text":"power_supply"}, "reading...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='Chassis Intru')].value.first()": no data matches the specified path
    X9-DRG-HF+ Preprocessing failed for: [{"id":"PS2 Status","name":"(10.2).PS2 Status","sensor":{"type":8,"text":"power_supply"}, "reading...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='Chassis Intru')].value.first()": no data matches the specified path
    X9-SRL-F Preprocessing failed for: [{"id":"PS2 Status","name":"(10.2).PS2 Status","sensor":{"type":8,"text":"power_supply"}, "reading...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='Chassis Intru')].value.first()": no data matches the specified path
    X10-DRG-OT+ Preprocessing failed for: [{"id":"GPU8 Temp","name":"(11.8).GPU8 Temp","sensor":{"type":1,"text":"temperature"},"reading":{...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='Chassis Intru')].value.first()": no data matches the specified path
    X10-DRT-PT Preprocessing failed for: [{"id":"HDD Status ","name":"(4.1).HDD Status ","sensor":{"type":13,"text":"drive_slot"},...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='PS1 Status')].value.first()": no data matches the specified path
    X10-DRU-i+ Preprocessing failed for: [{"id":"NVMe_SSD Temp ","name":"(1.4).NVMe_SSD Temp ","sensor":{"type":1,"text":"temperature"},...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='Chassis Intru')].value.first()": no data matches the specified path
    X10-DRW-iT Preprocessing failed for: [{"id":"PS2 Status","name":"(10.2).PS2 Status","sensor":{"type":8,"text":"power_supply"}, "reading...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='Chassis Intru')].value.first()": no data matches the specified path
    X11-DDW-NT Preprocessing failed for: [{"id":"PS2 Status","name":"(10.87).PS2 Status","sensor":{"type":8,"text":"power_supply"}, "readin...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='Chassis Intru')].value.first()": no data matches the specified path
    X11-DPG-OT Preprocessing failed for: [{"id":"GPU10 Temp","name":"(11.10).GPU10 Temp","sensor":{"type":1,"text":"temperature"},"reading...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='Chassis Intru')].value.first()": no data matches the specified path
    X11-DPH-T Preprocessing failed for: [{"id":"PS2 Status","name":"(10.87).PS2 Status","sensor":{"type":8,"text":"power_supply"}, "readin...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='Chassis Intru')].value.first()": no data matches the specified path
    H11-DST-B Preprocessing failed for: [{"id":"AOC_NIC_Temp ","name":"(r0.32.11.0).AOC_NIC_Temp ","sensor":{"type":1,"text":"tempera...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='VBAT')].value.first()": no data matches the specified path
    H11-DSU-iN Preprocessing failed for: [{"id":"PS2 Status","name":"(10.91).PS2 Status","sensor":{"type":8,"text":"power_supply"}, "readin...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='VBAT')].value.first()": no data matches the specified path
    R610 Preprocessing failed for: [{"id":"ROMB Battery","name":"(26.3).ROMB Battery","sensor":{"type":41,"text":"battery"},"reading...

    1. Failed: cannot extract value from json by path "$.[?(@.id=='0.9V PG')].value.first()": no data matches the specified path
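
All of the errors above share one JSONPath shape, `$.[?(@.id=='<sensor>')].value.first()`: filter the array for the element whose id matches, take its value field, and return the first hit. A rough Python equivalent is sketched below to show when "no data matches the specified path" can arise: either no element has that id, or the matching element has no value field. The sample payload is invented (the real JSON in the errors is truncated), and the function name is hypothetical.

```python
import json


def first_value_by_id(payload: str, sensor_id: str):
    """Rough stand-in for $.[?(@.id=='<sensor_id>')].value.first()."""
    sensors = json.loads(payload)
    # Keep the "value" of every element whose id matches and that has a value.
    values = [
        s["value"]
        for s in sensors
        if s.get("id") == sensor_id and "value" in s
    ]
    if not values:
        # Mirrors the preprocessing failure text quoted in the table above.
        raise LookupError("no data matches the specified path")
    return values[0]


# Invented sample data: field contents are assumptions, not real ipmi output.
sample = json.dumps([
    {"id": "3.3VCC", "value": 3.225},
    {"id": "Chassis Intru"},  # assumption: an entry without a "value" field
])

print(first_value_by_id(sample, "3.3VCC"))
for missing in ("Chassis Intru", "Intrusion"):
    try:
        first_value_by_id(sample, missing)
    except LookupError as err:
        print(missing, "->", err)
```

Both failure cases produce the same message, so the error text alone does not say whether the sensor is absent on that board or present without a usable value.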


      #3
      Looking further with ipmitool, at least for the "2" cases on the Supermicro boards, I believe the issue may be related to the sensor's unit being reported as "discrete", whereas the others report Volts/RPM/etc.
      Code:
      3.3VCC           | 3.225      | Volts      | ok    | 2.613     | 2.681     | 2.885     | 3.718     | 3.922     | 3.990
      VBAT             | 0x4        | discrete   | 0x04ff| na        | na        | na        | na        | na        | na
      P1_VDDCR         | 1.120      | Volts      | ok    | 0.400     | 0.499     | 0.607     | 1.237     | 1.345     | 1.399
      P1_VMEMABCD      | 1.231      | Volts      | ok    | 0.979     | 1.003     | 1.081     | 1.387     | 1.465     | 1.489
      P2_VDDCR         | 1.143      | Volts      | ok    | 0.396     | 0.495     | 0.612     | 1.242     | 1.341     | 1.395
      P1_VMEMEFGH      | 1.228      | Volts      | ok    | 0.976     | 0.997     | 1.074     | 1.389     | 1.466     | 1.487
      VDD_5_DUAL       | 5.129      | Volts      | ok    | 4.019     | 4.139     | 4.439     | 5.729     | 6.029     | 6.149
      VDD_33_DUAL      | 3.327      | Volts      | ok    | 2.613     | 2.681     | 2.885     | 3.718     | 3.922     | 3.990
      P2_VMEMABCD      | 1.235      | Volts      | ok    | 0.976     | 0.997     | 1.074     | 1.389     | 1.466     | 1.487
      P2_VMEMEFGH      | 1.242      | Volts      | ok    | 0.976     | 0.997     | 1.074     | 1.389     | 1.466     | 1.487
      P1_SOCRUN        | 0.993      | Volts      | ok    | 0.300     | 0.496     | 0.629     | 1.070     | 1.147     | 1.343
      P2_SOCRUN        | 1.000      | Volts      | ok    | 0.300     | 0.496     | 0.629     | 1.070     | 1.147     | 1.343
      P1_SOCDUAL       | 0.900      | Volts      | ok    | 0.711     | 0.725     | 0.781     | 1.012     | 1.068     | 1.082
      P2_SOCDUAL       | 0.900      | Volts      | ok    | 0.711     | 0.725     | 0.781     | 1.012     | 1.068     | 1.082
      Chassis Intru    | 0x0        | discrete   | 0x0000| na        | na        | na        | na        | na        | na
      PS1 Status       | 0x1        | discrete   | 0x0100| na        | na        | na        | na        | na        | na
      PS2 Status       | 0x1        | discrete   | 0x0100| na        | na        | na        | na        | na        | na
      And this also appears to be the same with the R610 results, where most failed with a "2" type.
      Code:
      Riser Config     | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
      OS Watchdog      | 0x0        | discrete   | 0x0080| na        | na        | na        | na        | na        | na
      SEL              | na         | discrete   | na    | na        | na        | na        | na        | na        | na
      Intrusion        | 0x0        | discrete   | 0x0080| na        | na        | na        | na        | na        | na
      PS Redundancy    | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
      Fan Redundancy   | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
      CPU Temp Interf  | na         | discrete   | na    | na        | na        | na        | na        | na        | na
      Drive            | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
      Cable SAS A      | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
      Cable SAS B      | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
      DKM Status       | 0x0        | discrete   | 0x0080| na        | na        | na        | na        | na        | na
      Hoping that this will be helpful and point someone in the correct direction.
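
A small sketch of how ipmitool output like the listings above could be split into discrete and threshold-type sensors, assuming the pipe-separated layout shown there (name, reading, unit, status, then six threshold columns). The helper names are hypothetical, and the three sample rows are copied from the first listing.

```python
def parse_sensor_line(line: str):
    """Parse one pipe-separated row; return None if it is not a sensor row."""
    fields = [f.strip() for f in line.split("|")]
    if len(fields) < 4:
        return None
    name, reading, unit, status = fields[:4]
    return {"name": name, "reading": reading, "unit": unit, "status": status}


def split_by_kind(text: str):
    """Return (discrete_names, analog_names) based on the unit column."""
    discrete, analog = [], []
    for line in text.splitlines():
        row = parse_sensor_line(line)
        if row is None:
            continue
        if row["unit"] == "discrete":
            discrete.append(row["name"])
        else:
            analog.append(row["name"])
    return discrete, analog


sample_output = """\
3.3VCC           | 3.225      | Volts      | ok    | 2.613     | 2.681     | 2.885     | 3.718     | 3.922     | 3.990
VBAT             | 0x4        | discrete   | 0x04ff| na        | na        | na        | na        | na        | na
Chassis Intru    | 0x0        | discrete   | 0x0000| na        | na        | na        | na        | na        | na
"""

discrete, analog = split_by_kind(sample_output)
print("discrete:", discrete)
print("analog:", analog)
```

Running this over the full output from each board would give a list of discrete sensors to compare against the "2" entries in the table.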


        #4
        Originally posted by reedacus25
        As posted in ZBXNEXT-5527, I have tested the template (085e4ea53e3) against about 15 different SuperMicro motherboards, and one Dell Motherboard.

        EDIT: Forgot to mention, this is a daily Ubuntu 20.04 build running 5.0.3Alpha.

        Code:
        Value "3.225000" of type "string" is not suitable for value type "Numeric (unsigned)
        That's very strange, because template_server_chassis_ipmi.xml has no items of type Numeric (unsigned). Please download the template again and import it correctly.


        • reedacus25 commented
          Well, sadly this appears to have fixed things.
          Which is odd, since the original XML I downloaded has the same md5sum as the one I re-downloaded and uploaded.
          So I guess something must have gotten mangled during the original XML import?
          Either way, it looks like things are working now with no unsupported items across all hosts.
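
For anyone wanting to repeat the checksum comparison, here is a minimal Python sketch that computes the same MD5 digest that md5sum prints. The file names in the usage comment are placeholders. Matching digests only show that the two downloads are byte-identical, so they would not explain a difference that arises later, during import.

```python
import hashlib


def md5_of(path: str) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(path_a: str, path_b: str) -> bool:
    """True if both files have the same MD5 digest."""
    return md5_of(path_a) == md5_of(path_b)


# Usage (placeholder file names):
#   same_file("template_first_download.xml", "template_second_download.xml")
```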
