Zabbix is randomly getting disconnected from the BMC. It usually reconnects within 5 minutes, but during that time the blades appear to be offline, and a critical alert would fire on that event. The network path is very simple: it goes through one Juniper switch, which is not reporting any connectivity issues.

I've noticed some discussion suggesting that the OpenIPMI developers are not very forthcoming about changes to their code, and that this has caused problems with the Zabbix code in the past, so I'm curious whether anyone else is experiencing this issue.

To narrow it down, I disabled all other items in the IPMI template. Only the power status check is currently enabled, and only 1 host is being queried for IPMI-related items. I am querying a Supermicro X10SDD-F board running IPMI revision 2.0.
Connectivity is fine:
5000 packets transmitted, 5000 received, 0% packet loss, time 5117509ms
rtt min/avg/max/mdev = 0.228/0.600/40.788/2.650 ms
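The ping above only exercises ICMP, while the IPMI checks run over RMCP on UDP port 623 (the port shown in the log). A minimal sketch of a probe on that port, assuming the BMC answers ASF Presence Ping requests, could look like the following. The function names `rmcp_ping` and `is_pong` are made up for illustration and are not part of Zabbix or OpenIPMI.

```python
import socket

# ASF Presence Ping carried in RMCP:
#   RMCP header: version 0x06, reserved 0x00, sequence 0xFF (no ACK), class 0x06 (ASF)
#   ASF message: IANA enterprise number 4542 (0x000011BE), type 0x80 (ping),
#                message tag 0x00, reserved 0x00, data length 0x00
ASF_PING = bytes([0x06, 0x00, 0xFF, 0x06,
                  0x00, 0x00, 0x11, 0xBE,
                  0x80, 0x00, 0x00, 0x00])


def is_pong(data):
    """Return True if data looks like an ASF Presence Pong (message type 0x40)."""
    return (
        len(data) >= 12
        and data[0] == 0x06                      # RMCP version
        and data[3] == 0x06                      # ASF message class
        and data[4:8] == b"\x00\x00\x11\xbe"     # ASF IANA enterprise number
        and data[8] == 0x40                      # Presence Pong
    )


def rmcp_ping(host, port=623, timeout=2.0):
    """Send one presence ping to host:port; True if a pong arrives in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(ASF_PING, (host, port))
            data, _addr = sock.recvfrom(512)
        except OSError:
            # Covers timeouts as well as unreachable-host errors.
            return False
    return is_pong(data)
```

Calling `rmcp_ping("x.x.x.x")` in a loop with timestamps during one of the outages would show whether the BMC stops answering on UDP 623 at all, or whether only the OpenIPMI session is being dropped.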
root@zab-web-1:~# dpkg -l | grep openipmi
ii libopenipmi0 2.0.22-1.1ubuntu2.1 amd64 Intelligent Platform Management Interface - runtime
ii openipmi 2.0.22-1.1ubuntu2.1 amd64 Intelligent Platform Management Interface (for servers)
ii zabbix-agent 1:4.0.4-1+bionic amd64 Zabbix network monitoring solution - agent
ii zabbix-frontend-php 1:4.0.4-1+bionic all Zabbix network monitoring solution - PHP front-end
ii zabbix-release 1:4.0-2+bionic all Zabbix official repository configuration
ii zabbix-server-mysql 1:4.0.4-1+bionic amd64 Zabbix network monitoring solution - server (MySQL)
zabbix_server.log
18335:20190221:144611.606 WARN: 0 0 ipmi_lan.c(lost_connection): Connection 0 to the BMC is down
18335:20190221:144611.606 SEVR: 0 0 ipmi_lan.c(lost_connection): All connections to the BMC are down
18335:20190221:144611.606 EINF: 0(23.1).power chassis.c(chassis_power_get_cb): Received IPMI error: ff
(I've also seen "error: c3" in place of "ff" on this line.)
18335:20190221:144611.609 control 'power@[x.x.x.x]:623' deleted
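To see how often the drops happen and which error codes accompany them, the log can be scanned with a short script. This is only a sketch: it assumes the hex value after "Received IPMI error:" is an IPMI completion code (in the IPMI specification, 0xC3 is "timeout while processing command" and 0xFF is "unspecified error"), and the names `parse_line` and `summarize` are hypothetical.

```python
import re
from datetime import datetime

# A few generic IPMI completion codes; anything else is reported as unknown.
COMPLETION_CODES = {
    0xC0: "Node busy",
    0xC1: "Invalid command",
    0xC3: "Timeout while processing command",
    0xFF: "Unspecified error",
}

# Zabbix log prefix: PID:YYYYMMDD:HHMMSS.mmm followed by the message.
LINE_RE = re.compile(r"^(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6}\.\d{3}) (?P<msg>.*)$")
ERR_RE = re.compile(r"Received IPMI error: (?P<code>[0-9a-fA-F]{1,2})\b")


def parse_line(line):
    """Split one log line into (timestamp, message); None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m["date"] + m["time"], "%Y%m%d%H%M%S.%f")
    return ts, m["msg"]


def summarize(lines):
    """Return a list of (timestamp, kind, detail) for BMC-down and IPMI error lines."""
    events = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, msg = parsed
        if "All connections to the BMC are down" in msg:
            events.append((ts, "bmc_down", None))
        err = ERR_RE.search(msg)
        if err:
            code = int(err["code"], 16)
            detail = COMPLETION_CODES.get(code, "unknown (0x%02x)" % code)
            events.append((ts, "ipmi_error", detail))
    return events
```

Feeding it the file named by `LogFile` (for example `summarize(open("/var/log/zabbix/zabbix_server.log"))`) gives a timeline of the disconnects that can be compared against the switch logs.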
zabbix_server.conf
LogFile=/var/log/zabbix/zabbix_server.log
LogFileSize=0
DebugLevel=4
PidFile=/var/run/zabbix/zabbix_server.pid
SocketDir=/var/run/zabbix
DBName=zabbix
DBUser=zabbix
DBPassword=mysqlpassword
StartIPMIPollers=3
StartPollersUnreachable=1
SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
Timeout=4
AlertScriptsPath=/usr/lib/zabbix/alertscripts
ExternalScripts=/usr/lib/zabbix/externalscripts
FpingLocation=/usr/bin/fping
Fping6Location=/usr/bin/fping6
LogSlowQueries=3000