Hi
Zabbix server 6.0.29
Rocky Linux 8
Using SNMP v3
I have an issue with Zabbix SNMP checks.
I am using the "Generic by SNMP" template for Dell hardware on Nutanix.
It contains an item "Generic SNMP: SNMP agent availability", which uses the Zabbix internal check "zabbix[host,snmp,available]".
Whenever I patch the software on that host, there are network errors on the host (probably because the host is unavailable for a brief period of time).
But that is not the issue here. The main issue is the last row of this log.
I have confirmed in zabbix_server.conf that UnavailableDelay=60 (the default), so checks should start working again 60 seconds after such a message.
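To double-check which availability-related values are actually in effect, a minimal sketch along these lines could help. It assumes the defaults documented for Zabbix server 6.0 (Timeout=3, UnreachablePeriod=45, UnreachableDelay=15, UnavailableDelay=60); the function name `effective_settings` is hypothetical, and the defaults should be verified against the documentation for the installed version.

```python
import re

# Assumed defaults for Zabbix server 6.0; verify against your version's docs.
DEFAULTS = {
    "Timeout": 3,
    "UnreachablePeriod": 45,
    "UnreachableDelay": 15,
    "UnavailableDelay": 60,
}


def effective_settings(conf_text):
    """Return the availability-related settings, falling back to DEFAULTS.

    Commented-out lines (starting with '#') are ignored because the
    pattern only accepts a parameter name at the start of the line.
    """
    settings = dict(DEFAULTS)
    for line in conf_text.splitlines():
        match = re.match(r"^\s*(\w+)\s*=\s*(\d+)\s*$", line)
        if match and match.group(1) in settings:
            settings[match.group(1)] = int(match.group(2))
    return settings
```

It can be fed the contents of the config file, for example `effective_settings(open("/etc/zabbix/zabbix_server.conf").read())`, where the path is an assumption about a typical package install.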
Code:
98218:20240603:184913.462 temporarily disabling SNMP agent checks on host "host1.domain.com": interface unavailable
And it never becomes available again (at least not until I restart the zabbix-server systemd service).
No further rows for that host are written to zabbix_server.log after such a row. Even when I manually initiate a check on that host, nothing appears there.
Why is that?
The errors I get for that host are SNMP "Not available" and "Timeout while connecting to "ip:161"".
But I can confirm there is no timeout. I have run snmpwalk from the Zabbix server command line and can retrieve the items just fine. I can also check SNMP items via the Zabbix GUI without any issue.
Code:
[root@ee02-zabbix ~]# cat /var/log/zabbix/zabbix_server.log | grep host1.domain.com
98172:20240602:044927.500 SNMP agent item "citAvgLatencyUsecs[NutanixManagementShare.]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98215:20240602:044942.047 resuming SNMP agent checks on host "host1.domain.com": connection restored
98214:20240602:170227.450 SNMP agent item "citAvgLatencyUsecs[HYCU-cd12ff48-ecce-448f-9a57-f853483b9f7f.]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98217:20240602:170242.075 resuming SNMP agent checks on host "host1.domain.com": connection restored
98190:20240602:172627.174 SNMP agent item "citIOPerSecond[HYCU-9a37afc8-92f2-4f21-82e3-f74193258c89.]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98217:20240602:172642.392 resuming SNMP agent checks on host "host1.domain.com": connection restored
98202:20240602:173727.537 SNMP agent item "dstAverageLatency[2]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98219:20240602:173742.160 resuming SNMP agent checks on host "host1.domain.com": connection restored
98172:20240603:120557.571 SNMP agent item "system.net.uptime[sysUpTime.0]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98218:20240603:120612.312 resuming SNMP agent checks on host "host1.domain.com": connection restored
98213:20240603:151327.299 SNMP agent item "citAvgLatencyUsecs[NTNX_d1-res-nas_ctr.]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98217:20240603:151342.066 resuming SNMP agent checks on host "host1.domain.com": connection restored
98178:20240603:153727.710 SNMP agent item "citAvgLatencyUsecs[HYCU-525e34df-7503-4f03-b18f-cc1e22b38f96.]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98217:20240603:153742.371 resuming SNMP agent checks on host "host1.domain.com": connection restored
98210:20240603:163227.550 SNMP agent item "hypervisorAverageLatency[2]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98218:20240603:163242.207 resuming SNMP agent checks on host "host1.domain.com": connection restored
98209:20240603:164927.490 SNMP agent item "dstNumFreeBytes[28]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98217:20240603:164942.085 resuming SNMP agent checks on host "host1.domain.com": connection restored
98195:20240603:170027.134 SNMP agent item "dstIOBandwidth[22]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98218:20240603:170042.067 resuming SNMP agent checks on host "host1.domain.com": connection restored
98211:20240603:170131.946 SNMP agent item "citIOPerSecond[default-container-55284636057635.]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98217:20240603:170150.138 SNMP agent item "citIOPerSecond[default-container-55284636057635.]" on host "host1.domain.com" failed: another network error, wait for 15 seconds
98216:20240603:170207.029 resuming SNMP agent checks on host "host1.domain.com": connection restored
98162:20240603:170207.442 item "host1.domain.com:citIOPerSecond[default-container-55284636057635.]" became not supported: Value of type "string" is not suitable for value type "Numeric (unsigned)". Value "NULL"
98178:20240603:170327.844 SNMP agent item "dstNumberIops[22]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98216:20240603:170346.125 SNMP agent item "dstNumberIops[11]" on host "host1.domain.com" failed: another network error, wait for 15 seconds
98216:20240603:170405.150 SNMP agent item "hypervisorIOBandwidth[2]" on host "host1.domain.com" failed: another network error, wait for 15 seconds
98216:20240603:170424.179 temporarily disabling SNMP agent checks on host "host1.domain.com": interface unavailable
98217:20240603:171048.618 enabling SNMP agent checks on host "host1.domain.com": interface became available
98163:20240603:171123.604 item "host1.domain.com:citIOPerSecond[default-container-55284636057635.]" became supported
98176:20240603:171331.918 SNMP agent item "dstAverageLatency[33]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98217:20240603:171350.829 SNMP agent item "dstAverageLatency[33]" on host "host1.domain.com" failed: another network error, wait for 15 seconds
98216:20240603:171409.851 SNMP agent item "dstAverageLatency[33]" on host "host1.domain.com" failed: another network error, wait for 15 seconds
98217:20240603:171428.865 temporarily disabling SNMP agent checks on host "host1.domain.com": interface unavailable
98217:20240603:172432.621 enabling SNMP agent checks on host "host1.domain.com": interface became available
98177:20240603:172531.747 SNMP agent item "citAvgLatencyUsecs[NTNX_d1-res-nas_ctr.]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98217:20240603:172546.920 resuming SNMP agent checks on host "host1.domain.com": connection restored
98201:20240603:173043.761 SNMP agent item "hypervisorCpuUsagePercent[3]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98216:20240603:173058.273 resuming SNMP agent checks on host "host1.domain.com": connection restored
98205:20240603:173131.976 SNMP agent item "dstAverageLatency[5]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98218:20240603:173146.360 resuming SNMP agent checks on host "host1.domain.com": connection restored
98209:20240603:173411.722 SNMP agent item "system.net.uptime[sysUpTime.0]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98218:20240603:173426.828 resuming SNMP agent checks on host "host1.domain.com": connection restored
98172:20240603:173751.377 SNMP agent item "dstAverageLatency[12]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98215:20240603:173810.255 SNMP agent item "dstAverageLatency[12]" on host "host1.domain.com" failed: another network error, wait for 15 seconds
98217:20240603:173829.282 SNMP agent item "dstNumberIops[23]" on host "host1.domain.com" failed: another network error, wait for 15 seconds
98217:20240603:173848.297 temporarily disabling SNMP agent checks on host "host1.domain.com": interface unavailable
98215:20240603:174836.862 enabling SNMP agent checks on host "host1.domain.com": interface became available
98199:20240603:175127.659 SNMP agent item "dstAverageLatency[8]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98216:20240603:175142.302 resuming SNMP agent checks on host "host1.domain.com": connection restored
98196:20240603:175511.244 SNMP agent item "citIOPerSecond[NutanixManagementShare.]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98218:20240603:175526.548 resuming SNMP agent checks on host "host1.domain.com": connection restored
98204:20240603:180051.722 SNMP agent item "hypervisorTxBytes[3]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98216:20240603:180106.932 resuming SNMP agent checks on host "host1.domain.com": connection restored
98213:20240603:180131.971 SNMP agent item "hypervisorIOBandwidth[2]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98216:20240603:180146.032 resuming SNMP agent checks on host "host1.domain.com": connection restored
98194:20240603:180552.106 SNMP agent item "system.net.uptime[sysUpTime.0]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98215:20240603:180607.376 resuming SNMP agent checks on host "host1.domain.com": connection restored
98201:20240603:180701.139 SNMP agent item "system.hw.uptime[hrSystemUptime.0]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98219:20240603:180716.530 resuming SNMP agent checks on host "host1.domain.com": connection restored
98209:20240603:180731.968 SNMP agent item "citAvgLatencyUsecs[SelfServiceContainer.]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98215:20240603:180746.559 resuming SNMP agent checks on host "host1.domain.com": connection restored
98210:20240603:181012.170 SNMP agent item "system.net.uptime[sysUpTime.0]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98216:20240603:181027.849 resuming SNMP agent checks on host "host1.domain.com": connection restored
98178:20240603:181511.205 SNMP agent item "hypervisorTxDropCount[2]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98217:20240603:181526.347 resuming SNMP agent checks on host "host1.domain.com": connection restored
98190:20240603:182511.555 SNMP agent item "dstState[15]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98218:20240603:182526.430 resuming SNMP agent checks on host "host1.domain.com": connection restored
98203:20240603:183101.606 SNMP agent item "system.hw.uptime[hrSystemUptime.0]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98215:20240603:183116.842 resuming SNMP agent checks on host "host1.domain.com": connection restored
98203:20240603:183131.645 SNMP agent item "dstAverageLatency[1]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98215:20240603:183146.869 resuming SNMP agent checks on host "host1.domain.com": connection restored
98202:20240603:183511.358 SNMP agent item "dstNumFreeBytes[12]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98216:20240603:183526.396 resuming SNMP agent checks on host "host1.domain.com": connection restored
98177:20240603:183631.600 SNMP agent item "system.hw.uptime[hrSystemUptime.0]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98218:20240603:183646.510 resuming SNMP agent checks on host "host1.domain.com": connection restored
98204:20240603:183731.768 SNMP agent item "dstAverageLatency[8]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98219:20240603:183746.757 resuming SNMP agent checks on host "host1.domain.com": connection restored
98203:20240603:184351.597 SNMP agent item "dstAverageLatency[13]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98217:20240603:184406.109 resuming SNMP agent checks on host "host1.domain.com": connection restored
98172:20240603:184511.589 SNMP agent item "dstNumberIops[16]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98217:20240603:184526.188 resuming SNMP agent checks on host "host1.domain.com": connection restored
98195:20240603:184827.582 SNMP agent item "citAvgLatencyUsecs[HYCU-cd12ff48-ecce-448f-9a57-f853483b9f7f.]" on host "host1.domain.com" failed: first network error, wait for 15 seconds
98215:20240603:184846.424 SNMP agent item "citAvgLatencyUsecs[HYCU-cd12ff48-ecce-448f-9a57-f853483b9f7f.]" on host "host1.domain.com" failed: another network error, wait for 15 seconds
98218:20240603:184850.429 SNMP agent item "dstIOBandwidth[17]" on host "host1.domain.com" failed: another network error, wait for 15 seconds
98215:20240603:184909.453 SNMP agent item "dstAverageLatency[7]" on host "host1.domain.com" failed: another network error, wait for 15 seconds
98218:20240603:184913.462 temporarily disabling SNMP agent checks on host "host1.domain.com": interface unavailable
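To spot this condition without reading the whole log by eye, a rough sketch like the following could track the last "temporarily disabling" / "enabling" message per host and list the hosts whose most recent state is still unavailable. It assumes the message wording shown in the log above and one entry per line in zabbix_server.log; the helper names `last_availability_state` and `stuck_hosts` are hypothetical.

```python
import re

# Matches only the disable/enable state changes, for example:
# 98218:20240603:184913.462 temporarily disabling SNMP agent checks on host "h": interface unavailable
# "resuming SNMP agent checks" lines are deliberately not matched.
STATE_RE = re.compile(
    r'^\s*\d+:(\d{8}:\d{6}\.\d{3}) '
    r'(temporarily disabling|enabling) SNMP agent checks on host "([^"]+)"'
)


def last_availability_state(log_lines):
    """Map each host to (timestamp, state) of its latest disable/enable message."""
    state = {}
    for line in log_lines:
        match = STATE_RE.match(line)
        if match:
            timestamp, action, host = match.groups()
            status = "unavailable" if action == "temporarily disabling" else "available"
            state[host] = (timestamp, status)
    return state


def stuck_hosts(log_lines):
    """Return hosts whose most recent state change left SNMP checks disabled."""
    states = last_availability_state(log_lines)
    return sorted(host for host, (_, status) in states.items() if status == "unavailable")
```

Passing it the open log file, for example `stuck_hosts(open("/var/log/zabbix/zabbix_server.log"))`, would return the hosts that were disabled and never re-enabled. Comparing the returned timestamp with the current time shows how long a host has been stuck.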