Hello all,
It would be great if zabbix_server and the agent handled startup with tcp_listen more gracefully. Currently, when a zabbix_* process starts, it tries to bind, and if that fails, it dies. My suggestion is that it should wait for a while or do better checking. Sometimes the agent or server dies and leaves many connections in the TIME_WAIT state. It takes some time before these connections are dropped, and during that time I have to wait before I can run zabbix_* again. I have two suggestions; either one would be enough:
- When trying to listen, retry in a loop for up to 60 seconds, attempting to bind again once per second. That should be enough. After that time, die. I know an init script could do this, but we need an exact return code for this scenario.
- Better connection management when the agent is killed. I think there is a problem inside zabbix_agentd: when I kill it, it does not close its connections properly. If it did, there should be no TIME_WAIT connections left after a kill. Am I wrong?
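
To make the first suggestion concrete, here is a rough sketch of the retry loop in Python. It is only an illustration, not Zabbix code (Zabbix is written in C). The names `bind_with_retry`, `main`, and `EXIT_BIND_TIMEOUT`, and the exit code value 75, are made up for this example. Port 10050 is the default agent port.

```python
import errno
import socket
import time

# Hypothetical exit code meaning "gave up waiting for the port".
# The value is an assumption for this sketch, not a Zabbix convention.
EXIT_BIND_TIMEOUT = 75


def bind_with_retry(host, port, timeout=60.0, interval=1.0):
    """Try to bind and listen, retrying while the address is in use.

    Retries every `interval` seconds until `timeout` seconds have passed,
    then re-raises the last "address in use" error. Any other bind error
    is raised immediately.
    """
    deadline = time.monotonic() + timeout
    while True:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(5)
            return sock
        except OSError as exc:
            sock.close()
            # Only "address already in use" is worth waiting for.
            if exc.errno != errno.EADDRINUSE or time.monotonic() >= deadline:
                raise
            time.sleep(interval)


def main():
    """Return 0 on success, or the distinct exit code if the port stayed busy."""
    try:
        listener = bind_with_retry("0.0.0.0", 10050)
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return EXIT_BIND_TIMEOUT
        raise
    listener.close()
    return 0
```

A wrapper would call `sys.exit(main())`, so an init script could tell "port still busy after 60 seconds" apart from other startup failures by checking for that one return code.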
Thanks to all!