Detect Memory DIMM Errors through IPMI more accurately

  • dunehunter
    Junior Member
    • Jan 2024
    • 20

    #1

    Hi! A few weeks ago our server shut down because one of the memory DIMMs encountered an uncorrectable ECC error. Our host uses the default Template Server Chassis by IPMI template, but that template only has an Information-severity trigger for a DIMM value change, so we missed the alert.

    So I'm wondering: if we want to be aware of this kind of error, is there a better template that can detect DIMM ECC errors and report them as something more severe than Information?

    Or should I not use IPMI to detect memory DIMM errors at all? If so, what interface should I use to detect them?

    Any help would be appreciated, thanks!

    [Attached image: 19aa34b571cde379a5203d33dc5a6c5.png]
    [Attached image: image.png]
  • cyber
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional
    • Dec 2006
    • 4807

    #2
    Maybe it is enough to raise the severity so you would notice it?
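
    For illustration only, here is a rough Python sketch of that idea outside Zabbix: it maps the event text of a System Event Log line to a severity name. It assumes pipe-separated lines in the style printed by `ipmitool sel list`, with the sensor in the fourth field and the event description in the fifth. The function name, the rule table and the severity choices are all made up for this example and are not part of any Zabbix template.

```python
import re

# Hypothetical rules: event text pattern -> severity name.
# "Uncorrectable" is listed first because the text "uncorrectable ecc"
# also contains "correctable ecc".
SEVERITY_RULES = [
    (re.compile(r"uncorrectable ecc", re.IGNORECASE), "High"),
    (re.compile(r"correctable ecc", re.IGNORECASE), "Warning"),
]


def classify_sel_line(line):
    """Return a severity name for a memory event line, or None otherwise.

    Assumed layout (an assumption, check your BMC's actual output):
    id | date | time | sensor | event description | direction
    """
    fields = [field.strip() for field in line.split("|")]
    if len(fields) < 5:
        return None  # not a line in the assumed layout
    sensor, event = fields[3], fields[4]
    if not sensor.lower().startswith("memory"):
        return None  # only memory sensors are of interest here
    for pattern, severity in SEVERITY_RULES:
        if pattern.search(event):
            return severity
    # Any other memory event keeps the low default severity.
    return "Information"
```

    The point is only that uncorrectable ECC events get their own, higher severity instead of sharing Information with every other DIMM state change; in Zabbix the same effect would come from adjusting the trigger severity on the template or host.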
