Zabbix Agent should report about RAM ECC-Errors

  • thooge
    Junior Member
    • Mar 2014
    • 10

    #1

    Hello,

    A few days ago I had a serious server breakdown due to a RAM failure.
    So I looked into where ECC errors are reported in order to monitor them.

    It seems the Zabbix agent has no key for that?

    I know I can implement this with a UserParameter, but it is important
    enough to be built directly into the agent.

    The corrected and uncorrected error counts should be visible in
    /sys/devices/system/edac/mc/mc0/ce_count
    and
    /sys/devices/system/edac/mc/mc0/ue_count

    In my opinion, a value > 0 in ue_count is serious enough
    to shut the server down immediately and have it repaired.
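
For illustration, the UserParameter workaround could look roughly like the sketch below. It assumes the EDAC driver is loaded and exposes the sysfs files named above. The function name `read_edac_count` is hypothetical. Summing over every `mcN` directory instead of reading only `mc0` is also an assumption, made so that systems with several memory controllers are covered.

```python
from pathlib import Path

# Directory under which the kernel's EDAC driver exposes one
# subdirectory per memory controller (mc0, mc1, ...).
EDAC_MC_DIR = Path("/sys/devices/system/edac/mc")


def read_edac_count(counter, base=EDAC_MC_DIR):
    """Return the sum of `counter` over all memory controllers.

    `counter` is "ce_count" (corrected errors) or "ue_count"
    (uncorrected errors). Controllers without that file are skipped.
    """
    if counter not in ("ce_count", "ue_count"):
        raise ValueError(f"unknown EDAC counter: {counter!r}")

    total = 0
    for mc_dir in sorted(Path(base).glob("mc[0-9]*")):
        counter_file = mc_dir / counter
        if counter_file.is_file():
            # Each file holds a single decimal number and a newline.
            total += int(counter_file.read_text().strip())
    return total
```

A small wrapper script that prints `read_edac_count("ue_count")` could then be registered in the agent configuration, for example as `UserParameter=edac.count[*],/usr/local/bin/edac_count.py $1`. Both the key name and the script path are made up here. A trigger on a value > 0 for `ue_count` would match the severity described above.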

    Kind regards
    Thomas
  • mgielissen
    Junior Member
    • Nov 2009
    • 8

    #2
    I created a template: https://github.com/mgielissen/zabbix...inux_edac.yaml
