Hi,
I'm running Zabbix 5.0.3. For some devices polled with SNMPv3 authPriv (SHA/AES), I routinely receive "failed: first network error, wait for 15 seconds", then "resuming SNMP agent checks on host: connection restored", several times a minute, and occasionally the devices go to unavailable. However, a constant CLI SNMP get poll running from the same server to the same network device every 3 seconds always succeeds and never times out for these specific OIDs:
"1.3.6.1.6.3.10.2.1.1.0"
"1.3.6.1.6.3.10.2.1.2.0"
"1.3.6.1.6.3.10.2.1.3.0"
System health looked OK and the poller average was 24%. I tried bumping StartPollers from 8 to 20, but this had no impact.
Reloading the cache and restarting the Zabbix server had no effect. The server is a VM with 8 GB RAM and doesn't seem to be running hot at all. New values per second: ~160. Number of hosts: 169. Number of items: around 18000.
When the SNMP polling type was changed to v2, the device stopped throwing timeouts. The device itself never showed any errors for either v3 or v2, and isn't under load.
I checked all bug fixes from 5.0.3 to 5.0.41 and didn't see anything relevant. There were some memory optimizations for pollers that looked interesting, but the server isn't memory constrained.
Any ideas? Thanks.
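For anyone who wants to reproduce the comparison, the kind of CLI poll described above can be sketched roughly as follows. This is only an illustration, assuming net-snmp's snmpget is installed on the Zabbix server; the function names, the 3 second timeout and the zero retries are choices made for the sketch, and the host, user and passphrases passed in are placeholders rather than real values.

```python
import subprocess
import time

# OIDs from the post: snmpEngineID, snmpEngineBoots, snmpEngineTime.
OIDS = [
    "1.3.6.1.6.3.10.2.1.1.0",
    "1.3.6.1.6.3.10.2.1.2.0",
    "1.3.6.1.6.3.10.2.1.3.0",
]


def build_snmpget_cmd(host, user, auth_pass, priv_pass, oids, timeout=3, retries=0):
    """Return the argv list for a net-snmp snmpget using SNMPv3 authPriv (SHA/AES)."""
    return [
        "snmpget", "-v3", "-l", "authPriv",
        "-u", user,
        "-a", "SHA", "-A", auth_pass,
        "-x", "AES", "-X", priv_pass,
        "-t", str(timeout), "-r", str(retries),
        host, *oids,
    ]


def poll_forever(cmd, interval=3.0):
    """Run the command repeatedly and print one status line per attempt."""
    while True:
        started = time.time()
        result = subprocess.run(cmd, capture_output=True, text=True)
        status = "ok" if result.returncode == 0 else "FAIL"
        elapsed = time.time() - started
        print(f"{time.strftime('%H:%M:%S')} {status} {elapsed:.2f}s")
        if result.returncode != 0:
            # snmpget reports timeouts and auth errors on stderr.
            print(result.stderr.strip())
        time.sleep(interval)
```

Calling `poll_forever(build_snmpget_cmd("192.0.2.10", "monitor", "authpass123", "privpass123", OIDS))` with real values substituted would give a timestamped log that can be lined up against the "first network error" entries in the Zabbix server log.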