Hi,
We have some Dell servers installed that we were happily monitoring via OpenManage + SNMP.
Now there is a new government project and we are not allowed to install additional software on the servers, so I am looking for alternative monitoring via DRAC.
There are a few issues:
1. Dell has several sensors with the same name: "Current" for both PSUs, and "Status" for several sensors. Is it possible to monitor them somehow?
Presence | 4Ah | ok | 11.1 | Present
Presence | 48h | ok | 11.3 | Absent
Status | 60h | ok | 3.1 | Presence detected
Status | 61h | ok | 3.2 | Presence detected
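To illustrate what I mean by telling them apart: the only column that differs between the duplicated names is the entity ID (11.1, 11.3, 3.1, 3.2). Below is a minimal Python sketch, assuming the listing above is pipe-separated output in the style of `ipmitool sdr elist`. The `Sensor` type, the function names and the `name@entity` key format are my own hypothetical choices, not something Zabbix or ipmitool defines.

```python
from typing import Dict, List, NamedTuple


class Sensor(NamedTuple):
    name: str      # sensor name, e.g. "Status" (not unique on Dell)
    number: str    # sensor number, e.g. "60h"
    status: str    # e.g. "ok"
    entity: str    # entity ID.instance, e.g. "3.1"
    reading: str   # e.g. "Presence detected"


def parse_sensor_lines(text: str) -> List[Sensor]:
    """Parse 'name | number | status | entity | reading' lines."""
    sensors = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 5:
            continue  # skip blank or unexpected lines
        sensors.append(Sensor(*fields))
    return sensors


def unique_keys(sensors: List[Sensor]) -> Dict[str, Sensor]:
    """Build a key from name plus entity so duplicated names stay distinct.

    The 'name@entity' format is hypothetical, chosen only for illustration.
    """
    return {f"{s.name}@{s.entity}": s for s in sensors}


# The four lines quoted in the post above.
SAMPLE = """\
Presence | 4Ah | ok | 11.1 | Present
Presence | 48h | ok | 11.3 | Absent
Status | 60h | ok | 3.1 | Presence detected
Status | 61h | ok | 3.2 | Presence detected
"""

if __name__ == "__main__":
    for key, sensor in unique_keys(parse_sensor_lines(SAMPLE)).items():
        print(key, "->", sensor.reading)
```

This only shows that the entity ID is enough to separate the sensors on the command line; whether the monitoring server can address them that way is exactly my question.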
2. It seems that I can monitor fan speeds and ambient temperature, but not even CPU temperature on some systems. And if I try to read the "Status" sensor, I get an error:
sensor or control Status@[172.20.0.1]:623 does not exist.
The server log, however, does see them:
3571:20160611:201932.872 In allocate_ipmi_sensor() sensor:'Status0@[172.20.0.1]:623'
3571:20160611:201932.872 Added sensor: host:'172.20.0.1:623' id_type:0 id_sz:8 id:'Status0' reading_type:0x6f ('sensor specific') type:0x8 ('power_supply') full_name:'0(10.2).Status0'
3571:20160611:201932.873 In allocate_ipmi_sensor() sensor:'Status0@[172.20.0.1]:623'
3571:20160611:201932.873 Added sensor: host:'172.20.0.1:623' id_type:0 id_sz:8 id:'Status0' reading_type:0x6f ('sensor specific') type:0x8 ('power_supply') full_name:'0(10.1).Status0'
3571:20160611:201932.873 In allocate_ipmi_sensor() sensor:'Status0@[172.20.0.1]:623'
3571:20160611:201932.873 Added sensor: host:'172.20.0.1:623' id_type:0 id_sz:8 id:'Status0' reading_type:0x6f ('sensor specific') type:0x7 ('processor') full_name:'0(3.2).Status0'
3571:20160611:201932.873 In allocate_ipmi_sensor() sensor:'Status0@[172.20.0.1]:623'
3571:20160611:201932.873 Added sensor: host:'172.20.0.1:623' id_type:0 id_sz:8 id:'Status0' reading_type:0x6f ('sensor specific') type:0x7 ('processor') full_name:'0(3.1).Status0'
Should I use '0(10.2).Status0' or '0(10.1).Status0' as the name?
3. And now the most important one. Is it possible to monitor the global system status via Dell IPMI, i.e. the state in which the LCD turns amber and indicates an error, just like via OM + SNMP? Then we would not need to create a whole list of sensors and triggers, just one item that indicates a global health problem. I can't see any way. We could try a script:
ipmitool -I lanplus -H .... -U ..... -P ...... chassis status | grep Fault | grep true | wc -l
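The same check could also be written as a small script instead of a grep pipeline. This is only a rough Python sketch: it assumes `ipmitool chassis status` prints "Name : value" lines, the function names are hypothetical, and the sample text with a fault set to true is made up for illustration, not taken from a real server. I also don't know whether these chassis fault flags follow the amber LCD state at all.

```python
import subprocess
from typing import List


def active_faults(chassis_status: str) -> List[str]:
    """Return the names of all '... Fault' lines whose value is true.

    Equivalent in spirit to: grep Fault | grep true
    (the value comparison here is case-insensitive, unlike plain grep).
    """
    faults = []
    for line in chassis_status.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Name : value" line
        if "Fault" in name and value.strip().lower() == "true":
            faults.append(name.strip())
    return faults


def read_chassis_status(host: str, user: str, password: str) -> str:
    """Run ipmitool with the same options as the one-liner above."""
    cmd = ["ipmitool", "-I", "lanplus", "-H", host, "-U", user,
           "-P", password, "chassis", "status"]
    result = subprocess.run(cmd, capture_output=True, text=True,
                            check=True, timeout=30)
    return result.stdout


# Illustrative text only: shaped like chassis status output, with one
# fault deliberately set to true so the parser has something to find.
SAMPLE = """\
System Power         : on
Power Overload       : false
Main Power Fault     : false
Power Control Fault  : false
Drive Fault          : true
Cooling/Fan Fault    : false
"""

if __name__ == "__main__":
    print(len(active_faults(SAMPLE)), active_faults(SAMPLE))
```

The count returned by `len(active_faults(...))` plays the role of `wc -l`, so a single item with a trigger on "greater than 0" would be enough, if the flags turn out to be reliable.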
Tomorrow I will try to see whether DRAC SNMP gives any more reliable information. Day one with IPMI has been a complete disappointment; why do people even spend so much time developing it?