**Alexei** · 12-01-2009, 10:47

An external script must be written so that it times out, exits, and cleans up all associated resources without Zabbix's intervention. Sure, we could make Zabbix kill a timed-out script, but that won't solve the problem. The script may call fork(), system(), or whatever, so killing just one process is not a proper solution anyway.
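To illustrate the point about fork() and system(), here is a minimal sketch of a wrapper that enforces its own timeout and kills the whole process group rather than a single process. It assumes a POSIX system and Python 3; the name `run_with_timeout` is hypothetical, and a descendant that starts its own session would still escape the kill.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd in its own process group; kill the whole group on timeout.

    Returns (exit_code, stdout_text), or (None, "") if the command timed out.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        # The child becomes the leader of a new session and process group,
        # so its pid is also the id of the group its descendants join.
        start_new_session=True,
    )
    try:
        out, _ = proc.communicate(timeout=timeout_s)
        return proc.returncode, out.decode(errors="replace")
    except subprocess.TimeoutExpired:
        try:
            # Kill the child and everything it forked into the same group.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group vanished between the timeout and the kill
        proc.communicate()  # reap the child and drain the pipe
        return None, ""
```

Calling `run_with_timeout(["sh", "-c", "sleep 30 & sleep 30"], 1)` returns after about one second and leaves neither `sleep` behind, which a plain kill of the shell alone would not guarantee.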

**Alexei** · 12-01-2009, 10:50

By the way, the same applies to user parameters. It is the Zabbix administrator's responsibility to make sure that all scripts handle timeouts properly and clean up resources, temporary files, etc.

**lamont** · 12-01-2009, 19:50

Yes, but sometimes you can get bitten.

I had a simple script using Perl LWP that set timeouts on the LWP connection. It works 99.9% of the time, but I've found an edge case, which I think occurs only during SSL negotiation, where it hangs indefinitely.

This means that I just need very paranoid explicit timeouts around the external script (like I said, fork() to a child and then have the parent time out and kill the child if it hangs).
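The fork-and-watchdog approach described here could look roughly like the following. This is a sketch in Python rather than the Perl of the original script, assuming a POSIX system; `run_guarded` and `work` are hypothetical names, and the 0.05 second polling interval is an arbitrary choice.

```python
import os
import signal
import sys
import time


def run_guarded(work, timeout_s):
    """Fork; the child runs work(), the parent kills it after timeout_s seconds.

    Returns the child's exit code (negative signal number if it died from a
    signal), or None if the parent had to kill it.
    """
    sys.stdout.flush()  # avoid duplicating buffered output in the child
    pid = os.fork()
    if pid == 0:
        # Child: do the risky work, then exit immediately without running
        # any cleanup handlers that belong to the parent.
        try:
            work()
            sys.stdout.flush()
            os._exit(0)
        except BaseException:
            os._exit(1)

    # Parent: poll for the child until the deadline passes.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        done, status = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            if os.WIFEXITED(status):
                return os.WEXITSTATUS(status)
            return -os.WTERMSIG(status)
        time.sleep(0.05)

    os.kill(pid, signal.SIGKILL)  # the child hung: kill it
    os.waitpid(pid, 0)            # and reap it so no zombie is left
    return None
```

Because the timeout lives in the parent, it still fires when the child is stuck somewhere that ignores library-level timeouts, such as the SSL negotiation case mentioned above.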

It would be good if Zabbix would close the LISTEN socket, though, before forking off the process, otherwise you inherit a file descriptor you do not expect. That can be mitigated by simply closing that socket in the external script, but it surprised me to find it was holding the socket open.
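Closing the inherited socket from inside the script might be done as below. Since the descriptor number of the LISTEN socket is not known in advance, this sketch simply closes everything above stderr; `close_inherited_fds` is a hypothetical helper, and the cap of 65536 is an assumption to keep the loop short on systems with a very large descriptor limit.

```python
import os


def close_inherited_fds(first=3, limit=None):
    """Close descriptors in the range [first, limit), ignoring ones not open.

    With the defaults, stdin, stdout and stderr (0 to 2) stay open and
    everything else inherited from the parent process is closed.
    """
    if limit is None:
        try:
            limit = os.sysconf("SC_OPEN_MAX")
        except (ValueError, OSError):
            limit = 1024  # fallback if the limit cannot be queried
        if limit < 0 or limit > 65536:
            limit = 65536  # assumed cap; descriptors above it are not touched
    # os.closerange silently skips descriptors that are not open.
    os.closerange(first, limit)
```

Calling `close_inherited_fds()` as the first statement of the script, before any files or sockets of its own are opened, avoids closing something the script still needs.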

**alixen** · 14-01-2009, 18:37

Originally posted by lamont

Yes, but sometimes you can get bitten.

I had a simple script using Perl LWP that set timeouts on the LWP connection. It works 99.9% of the time, but I've found an edge case, which I think occurs only during SSL negotiation, where it hangs indefinitely.

I had the same problem with wget and SSL, and I found a nice shell wrapper that solved it.

I found it here: http://www.pixelbeat.org/scripts/timeout

Hope this helps
Alixen

**steev** · 23-02-2011, 19:30

This condition hangs the whole agent.

I have a curl check that was hanging, and I noticed that this stopped the agent from doing ANYTHING.

I discovered this through another Zabbix agent (active) check that sends localtime every thirty seconds. I use this as a 'heartbeat'. If I get nodata(600) from this check, a trigger pages me because the server isn't responding.

I've also encountered a condition where an init script that I run via an action to restart a process completely hung the Zabbix agent.

I can see the point that scripts should time themselves out in a reasonable amount of time, but if they don't, I really don't think that should break the whole Zabbix agent.

If externalscripts hang...