Zabbix Agent connection flapping on Windows

  • rsterenb
    Member
    • Apr 2015
    • 31

    #1

    Zabbix Agent connection flapping on Windows

    Hi,

    We're experiencing a lot of agent connection flapping. In agentd.log:

    [...]
    3920:20181017:075423.976 active check configuration update from [zbx_proxy.domain.tld:10051] started to fail (cannot connect to [[zbx_proxy.domain.tld]:10051]: A connection timeout occurred.)
    3920:20181017:075523.898 active check configuration update from [zbx_proxy.domain.tld:10051] is working again
    3920:20181017:081944.260 active check configuration update from [zbx_proxy.domain.tld:10051] started to fail (cannot connect to [[zbx_proxy.domain.tld]:10051]: A connection timeout occurred.)
    3920:20181017:082044.182 active check configuration update from [zbx_proxy.domain.tld:10051] is working again
    3920:20181017:082506.026 active check configuration update from [zbx_proxy.domain.tld:10051] started to fail (cannot connect to [[zbx_proxy.domain.tld]:10051]: A connection timeout occurred.)
    3920:20181017:082606.963 active check configuration update from [zbx_proxy.domain.tld:10051] is working again
    3920:20181017:090428.187 active check configuration update from [zbx_proxy.domain.tld:10051] started to fail (cannot connect to [[zbx_proxy.domain.tld]:10051]: A connection timeout occurred.)
    3920:20181017:090528.109 active check configuration update from [zbx_proxy.domain.tld:10051] is working again
    3920:20181017:090949.938 active check configuration update from [zbx_proxy.domain.tld:10051] started to fail (cannot connect to [[zbx_proxy.domain.tld]:10051]: A connection timeout occurred.)
    3920:20181017:091049.860 active check configuration update from [zbx_proxy.domain.tld:10051] is working again
    3920:20181017:093510.831 active check configuration update from [zbx_proxy.domain.tld:10051] started to fail (cannot connect to [[zbx_proxy.domain.tld]:10051]: A connection timeout occurred.)
    3920:20181017:093610.753 active check configuration update from [zbx_proxy.domain.tld:10051] is working again
    3920:20181017:094431.426 active check configuration update from [zbx_proxy.domain.tld:10051] started to fail (cannot connect to [[zbx_proxy.domain.tld]:10051]: A connection timeout occurred.)
    3920:20181017:094531.348 active check configuration update from [zbx_proxy.domain.tld:10051] is working again
    3920:20181017:094952.177 active check configuration update from [zbx_proxy.domain.tld:10051] started to fail (cannot connect to [[zbx_proxy.domain.tld]:10051]: A connection timeout occurred.)
    3920:20181017:095052.083 active check configuration update from [zbx_proxy.domain.tld:10051] is working again
    3920:20181017:100513.460 active check configuration update from [zbx_proxy.domain.tld:10051] started to fail (cannot connect to [[zbx_proxy.domain.tld]:10051]: A connection timeout occurred.)
    3920:20181017:100613.382 active check configuration update from [zbx_proxy.domain.tld:10051] is working again
    3920:20181017:101434.070 active check configuration update from [zbx_proxy.domain.tld:10051] started to fail (cannot connect to [[zbx_proxy.domain.tld]:10051]: A connection timeout occurred.)
    3920:20181017:101535.008 active check configuration update from [zbx_proxy.domain.tld:10051] is working again
    [...]

    This goes on and on, flooding the agent log. As far as I've seen, it happens only on Windows (2016) servers: agents on Linux (CentOS 6/7) servers in the same IP subnet (so no firewall in between) do not show this in their logs.
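To quantify the flapping, the log lines can be paired up and timed. The following is a minimal sketch, assuming the `PID:YYYYMMDD:HHMMSS.mmm message` line layout seen in the excerpt above; the function name `outages` is made up for illustration and is not part of Zabbix. In the excerpt, each failure is followed by a recovery roughly 60 seconds later, which suggests single failed attempts followed by a successful retry rather than prolonged outages.

```python
import re
from datetime import datetime

# Matches agent log lines such as:
# 3920:20181017:075423.976 active check configuration update from ...
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6}\.\d{3}) (.*)$")


def outages(lines):
    """Pair each 'started to fail' line with the next 'is working again' line.

    Returns a list of (start, end, seconds) tuples.
    """
    result = []
    failed_at = None
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip "[...]" markers and unrelated lines
        stamp = datetime.strptime(
            match.group(2) + match.group(3), "%Y%m%d%H%M%S.%f"
        )
        message = match.group(4)
        if "started to fail" in message:
            failed_at = stamp
        elif "is working again" in message and failed_at is not None:
            seconds = (stamp - failed_at).total_seconds()
            result.append((failed_at, stamp, seconds))
            failed_at = None
    return result
```

Feeding it the lines of agentd.log (for example `outages(open(path))`, with `path` pointing at the log file) gives one tuple per flap, which makes it easier to compare failure times against events on the proxy or hypervisor.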

    The servers are VMs on VMware. When I place a Windows VM and a Linux VM on the same hypervisor (any of them), the agent connection on the Windows server starts flapping, but the one on the Linux server does not.

    Our server/proxy is 3.2.11 (we were waiting for 4.x to happen...), the Windows agents are 3.2.7, and we run dual-stack IPv4/IPv6. Hostnames resolve to both IPv4 and IPv6 addresses. When I set up a ping, it returns a stable pong.

    We've been searching for the cause for some time now but can't find it. Can I get some pointers on what to look for?
  • bbrendon
    Senior Member
    • Sep 2005
    • 870

    #2
    I seem to recall that, once upon a time, event log monitoring could cause flapping. Maybe disable it?

    After looking at the log, it doesn't sound like the event log. Did you try manually checking socket availability to the proxy? You probably have to increase a setting on the proxy to allow more connections.
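One way to check socket availability by hand is to attempt a plain TCP connect to the proxy port on every address the hostname resolves to. The sketch below is only an illustration: the function name `probe` and the 3-second timeout are assumptions, and it tests raw TCP reachability, not the Zabbix protocol itself. Because the hosts in this thread are dual stack, it reports IPv4 and IPv6 separately, which may help show whether only one address family times out.

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Try a TCP connect to every address that `host` resolves to.

    Returns a list of (family_name, address, ok, detail) tuples.
    """
    results = []
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # Name resolution itself failed
        return [("resolve", host, False, str(exc))]
    for family, socktype, proto, _canon, sockaddr in infos:
        name = "IPv6" if family == socket.AF_INET6 else "IPv4"
        start = time.monotonic()
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            detail = "connected in %.3fs" % (time.monotonic() - start)
            results.append((name, sockaddr[0], True, detail))
        except OSError as exc:
            # Covers timeouts, refused connections and unreachable networks
            results.append((name, sockaddr[0], False, str(exc)))
    return results
```

Calling `probe("zbx_proxy.domain.tld", 10051)` periodically from an affected Windows server and from a Linux server on the same subnet, and logging the results with a timestamp, would show whether the timeouts are tied to one address family or to the Windows hosts in general.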
    Unofficial Zabbix Expert
    Blog, Corporate Site
