Problem with Windows-Agent from 1.1Beta5 - Lots of connections in wait-state
Hello,
I'm using Zabbix 1.1beta5 on FC4. Some Windows hosts have far more connections on port 10050 in state TIME_WAIT than the 1 or 5 agents set in the config: one system (Win 2000 Server) has 133 such connections and a second system (Win 2003 Server) has 65+.
Here's my zabbix_agentd.conf:
Server=192.168.52.40
#ServerPort=10051
Hostname=win2k3server.domain.de
#ListenPort=10050
#ListenIP=192.168.52.229
StartAgents=1
#RefreshActiveChecks=120
DisableActive=1
DebugLevel=3
#PidFile=/var/tmp/zabbix_agentd.pid
#LogFile=/tmp/zabbix_agentd.log
Timeout=3
# NoTimeWait=1
Why exactly is there a comment
##### Experimental options. Use with care ! #####
before
# NoTimeWait=1
What can happen if I uncomment NoTimeWait? Does anybody have experience with this setting?
Thank you
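For anyone comparing configs: lines starting with # are comments, so in the file above NoTimeWait is not set at all. A minimal sketch for listing only the active settings, assuming the simple Key=Value layout shown above (the function name active_settings is made up for this example and is not part of Zabbix):

```python
def active_settings(conf_text):
    """Return a dict of the uncommented Key=Value lines in a config file's text."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (assumption: '#' at line start marks a comment).
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that carry no '=' at all
            settings[key.strip()] = value.strip()
    return settings
```

Run against the config posted above, this would report Server, Hostname, StartAgents, DisableActive, DebugLevel and Timeout, and nothing for NoTimeWait.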
This option only affects FreeBSD systems. It helps in this case but will have no effect on Windows.
Not sure why you have so many TIME_WAIT connections though. Have you looked at the TCP limit patch? Perhaps it is stalling connections. Just a guess.
Having many connections in TIME_WAIT state is perfectly ok.
Having many connections in TIME_WAIT is perfectly ok?
This is a summary of the information I get out of netstat:
Connections:
ESTABLISHED 2
SYN_SENT 0
SYN_RECV 3
FIN_WAIT1 3
FIN_WAIT2 0
TIME_WAIT 1947
CLOSE 0
CLOSE_WAIT 0
LAST_ACK 0
LISTEN 0
CLOSING 2
UNKNOWN 0
Is 1947 connections in state TIME_WAIT also perfectly ok?
The server doesn't use more than 20Kb/s of traffic, but the latency on the line is heavy due to the router having problems handling this high number of connections.
Isn't there some way around these huge amounts of TIME_WAIT connections? Why are the connections in this state, and why can't they simply be closed?
I am monitoring ~175 Windows machines with active checks.
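A per-state summary like the one above can be produced with a few lines of script. This is only a sketch: it assumes the text comes from Linux `netstat -ant` (no -p option, so the state is the last column), and the function name count_states is made up for this example. Windows netstat spells some states differently (for example LISTENING), and those would land under UNKNOWN here.

```python
from collections import Counter

# State names as printed by Linux netstat (assumption: other systems may differ).
STATES = ("ESTABLISHED", "SYN_SENT", "SYN_RECV", "FIN_WAIT1", "FIN_WAIT2",
          "TIME_WAIT", "CLOSE", "CLOSE_WAIT", "LAST_ACK", "LISTEN", "CLOSING")

def count_states(netstat_output):
    """Count TCP connections per state in the text printed by `netstat -ant`."""
    counts = Counter({state: 0 for state in STATES})
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows start with the protocol name (tcp, tcp6); headers are skipped.
        if fields and fields[0].lower().startswith("tcp"):
            state = fields[-1]
            counts[state if state in STATES else "UNKNOWN"] += 1
    return counts
```

Feeding it the output of `netstat -ant` and printing counts["TIME_WAIT"] gives the number to watch over time.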
I have often wondered why Zabbix leaves its connections in a TIME_WAIT state as well. I found an article on the net that had this to say about it.
TIME_WAIT setting
On a busy web server, many sockets may linger in the TIME_WAIT state. This is caused by improperly coded client applications that do not properly shut down a socket. This can also be used as a type of DDoS attack.
You can alter the timeout on most hosts. For example:
Solaris
/usr/sbin/ndd -set /dev/tcp tcp_time_wait_interval 60000 (in milliseconds)
The parameter name was corrected in Solaris 7 and higher. Prior to Solaris 7, the parameter was incorrectly labeled as tcp_close_wait_interval.
HP-UX
ndd -set /dev/tcp tcp_time_wait_interval 60000 (in milliseconds)
Linux kernel 2.2
/sbin/sysctl -w net.ipv4.vs.timeout_timewait=60 (in seconds)
The full article can be found at:
http://www.cymru.com/Documents/ip-stack-tuning.html
In Alexei's defense though, I did find another post that sums up the significance of these TIME_WAIT connections.
http://www.uwsg.iu.edu/hypermail/linux/kernel/0409.1/1318.html
What I don't like about them is that when I need to make a config change to an agent (for example, adding a new user-defined parameter), the agent won't start until all the TIME_WAIT connections have timed out, complaining that something is already using that port. Maybe coding in the flexibility to do a kill -HUP [zabbix_agentd] would resolve this issue, but I would still like to know what benefit leaving the connection in a TIME_WAIT state gives over closing the connection.
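For what it's worth, the usual way a server avoids the "port already in use" error on restart is to set SO_REUSEADDR on the listening socket before bind(). Whether the 1.1 agent does this is not something this thread establishes, so the following is only a generic sketch of the technique, not Zabbix code (open_listener is a made-up name). It describes behaviour on Unix-like systems; the option has different semantics on Windows.

```python
import socket

def open_listener(port, host="0.0.0.0"):
    """Open a listening TCP socket that may rebind while old connections sit in TIME_WAIT."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind(). On Unix-like systems it lets bind() succeed even
    # though earlier connections on this local port are still in TIME_WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(5)
    return sock
```

Calling open_listener(10050) would be the equivalent of the agent opening its ListenPort. It does not remove the TIME_WAIT entries themselves, it only stops them from blocking a restart.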
##### Experimental options. Use with care ! #####
# NoTimeWait=1
What can happen if I uncomment NoTimeWait? Does anybody have experience with this setting?
Thank you
To answer this:
trial:/home/marc/zabbix-1.1/src> grep -r 'NoTimeWait' *
zabbix_agent/zabbix_agentd.c:/* {"NoTimeWait",&CONFIG_NOTIMEWAIT,0,TYPE_INT,PARM_OPT,0,1},*/
Binary file zabbix_agent_win32/Debug/ZabbixW32.exe matches
zabbix_agent_win32/config.cpp: else if ((!stricmp(buffer,"PidFile"))||(!stricmp(buffer,"NoTimeWait"))||
Binary file zabbix_agent_win32/Release/ZabbixW32.exe matches
zabbix_agent_win32/doc/ReadMe.txt: NoTimeWait
zabbix_server/server.c:/* {"NoTimeWait",&CONFIG_NOTIMEWAIT,0,TYPE_INT,PARM_OPT,0,1},*/
zabbix_snmptrapper/zabbix_snmptrapper.c: {"NoTimeWait",&CONFIG_NOTIMEWAIT,0,TYPE_INT,PARM_OPT,0,1},
It's commented out in the agent and server source, so maybe it should be removed from the conf to avoid confusion?
Just writing this in case someone is searching the forum for this variable. I just wasted 2h on it.