Many TIME_WAIT connections
Hi all,
I saw many (over 50 in total) TIME_WAIT connections from the Zabbix client to the Zabbix server, as below:
TCP zabbixclient:10050 zabbixserver:53944 TIME_WAIT
TCP zabbixclient:10050 zabbixserver:53947 TIME_WAIT
TCP zabbixclient:10050 zabbixserver:53960 TIME_WAIT
....
....
....
Is this normal? Does each connection above represent one monitored item?
Thanks,
BEE
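To put a number on it, the netstat output can be tallied by state. This is only a minimal sketch: it assumes the four-column layout shown in the post above (protocol, local address, remote address, state), and the function name `count_states` is made up for illustration.

```python
from collections import Counter


def count_states(netstat_text):
    """Tally the last column (the TCP state) of lines that start with 'TCP'."""
    states = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: protocol, local address, remote address, state
        if len(fields) == 4 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states


# Sample lines copied from the post above
sample = """\
TCP zabbixclient:10050 zabbixserver:53944 TIME_WAIT
TCP zabbixclient:10050 zabbixserver:53947 TIME_WAIT
TCP zabbixclient:10050 zabbixserver:53960 TIME_WAIT
"""

print(count_states(sample)["TIME_WAIT"])
```

Other netstat variants (the Solaris output further down, for example) use different columns, so the field check would need adjusting there.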
I'm having the same "problem".
My Zabbix server (zobel) runs on Solaris; here is what I get with "netstat -a":
zobel.zabbix_agent zobel.52877 49152 0 49152 0 TIME_WAIT
zobel.zabbix_trap SRV-VM-WSUS-01.bvch.ch.1520 65476 0 49640 0 TIME_WAIT
zobel.zabbix_agent zobel.52875 49152 0 49178 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2262 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2330 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2329 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2328 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2327 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2326 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2325 64510 0 49680 0 TIME_WAIT
zobel.zabbix_agent zobel.52829 49152 0 49173 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2324 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2323 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2322 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2321 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2320 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2319 64510 0 49680 0 TIME_WAIT
zobel.52885 ora1LD2P.bvch.ch.12051 49680 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2318 64510 0 49680 0 TIME_WAIT
zobel.zabbix_agent zobel.52884 49152 0 49181 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2317 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2316 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2315 64510 0 49680 0 TIME_WAIT
zobel.52882 ora1LD1P.bvch.ch.12051 49680 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2314 64510 0 49680 0 TIME_WAIT
zobel.zabbix_agent zobel.52881 49152 0 49181 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2313 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2312 64510 0 49680 0 TIME_WAIT
zobel.zabbix_trap SRV-HS-REM-07.bvch.ch.2311 64510 0 49680 0 TIME_WAIT
Some connections were made by Zabbix agents (Windows). I guess they come from items of type "Zabbix Agent (active)" (monitoring the Windows event log) and definitely from items of type "Zabbix Agent" with key "net.tcp.port".
It looks like the opened sockets have not been closed correctly.
Could a developer take a look at it?
greets
Patrick
Palmertree
08-11-2007, 17:51
Had the same problem, but it was an OS issue. Fixed it by modifying the sysctl.conf file as follows:
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
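Before changing anything, it may help to check what the running kernel currently uses. The following is a rough sketch for Linux only, assuming the usual mapping of sysctl names to files under /proc/sys; the helper name `read_sysctl` is hypothetical. Note that newer Linux kernels (4.12 and later) no longer have net.ipv4.tcp_tw_recycle, in which case the helper simply returns None for it.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Return a sysctl value as a string, or None if it cannot be read."""
    # net.ipv4.tcp_tw_reuse -> /proc/sys/net/ipv4/tcp_tw_reuse
    path = Path(root) / name.replace(".", "/")
    try:
        return path.read_text().strip()
    except OSError:
        # Missing file (other OS, or setting removed) or no permission
        return None


for key in (
    "net.ipv4.tcp_keepalive_time",
    "net.ipv4.tcp_tw_reuse",
    "net.ipv4.tcp_tw_recycle",
):
    print(key, "=", read_sysctl(key))
```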
Hi P.,
Thanks for the suggestion. I can't seem to find that file on Solaris (on Solaris one should use the ndd command).
Anyway, as far as I know it is an application problem. If the application opens a TCP socket, the application should close the socket as well and not wait for the OS to close the port! (See http://rfc.sunsite.dk/rfc/rfc793.html)
Until this issue is fixed, I think changing the TIME_WAIT interval is a good workaround.
Thanks again,
Patrick
Anyway, as far as I know it is an application problem.
You are wrong :) There is no problem, really... Just run netstat on a busy TCP-based server (Apache, for example) and you will see a bunch of sockets in the TIME_WAIT state.
Which maybe proves that some people are not able to write "decent" applications ;) (reusing/opening ports, binding ports, closing ports...)
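For reference, the RFC 793 state diagram explains why a properly closed socket still shows up in TIME_WAIT: the side that closes first passes through TIME_WAIT on its way to CLOSED, while the other side does not. A small sketch of the normal (non-simultaneous) close, where the event names such as `recv_fin` are just labels chosen here, not anything from the RFC text:

```python
# State transitions for a normal TCP close, following the RFC 793 diagram.
# Event names are illustrative labels only.
TRANSITIONS = {
    # Side that calls close() first (active close)
    ("ESTABLISHED", "close"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
    # Side that receives the FIN first (passive close)
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "close"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


def walk(events, state="ESTABLISHED"):
    """Return the list of states visited while applying the events in order."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


active = walk(["close", "recv_ack", "recv_fin", "timeout_2msl"])
passive = walk(["recv_fin", "close", "recv_ack"])

print(" -> ".join(active))
print(" -> ".join(passive))
```

So a TIME_WAIT entry means the application on that host did close its socket; the kernel then holds the entry for a while before dropping it.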
bbrendon
12-11-2007, 18:35
My Zabbix server has between 250 and 500 connections in TIME_WAIT.
Clients have about 0 to 15 connections in TIME_WAIT.
...though I'm not sure this is a bad thing. I have started monitoring this TIME_WAIT issue, though, because I'm trying to solve a problem where agents seem to go to sleep for about 15 minutes, causing false positives and large amounts of panic :)
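One way to monitor the TIME_WAIT count on a Linux host is to read /proc/net/tcp, where the fourth column is the socket state in hex and 06 means TIME_WAIT. This is only a sketch under those assumptions: it covers IPv4 only, does not apply to the Solaris or Windows machines mentioned earlier, and the sample text below is invented for illustration.

```python
TIME_WAIT = "06"  # hex state code used in /proc/net/tcp on Linux


def count_time_wait(proc_net_tcp_text):
    """Count the sockets in TIME_WAIT in text formatted like /proc/net/tcp."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        # Columns: sl, local_address, rem_address, st, ...
        if len(fields) > 3 and fields[3] == TIME_WAIT:
            count += 1
    return count


# Made-up sample: one listening socket (0A) and two in TIME_WAIT (06)
sample = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:2743 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1234
   1: 0100007F:2743 0100007F:D2B8 06 00000000:00000000 03:00000A5C 00000000     0        0 0
   2: 0100007F:2743 0100007F:D2BB 06 00000000:00000000 03:00000B10 00000000     0        0 0
"""

print(count_time_wait(sample))

# On a real Linux host the input would come from the file itself:
#   with open("/proc/net/tcp") as f:
#       print(count_time_wait(f.read()))
```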