Best way of running tests that take a long time

  • Jason
    Senior Member
    • Nov 2007
    • 430

    #1

    Best way of running tests that take a long time

    I'm looking at ways of extracting statistical information from Dell Powervault Storage units. However, to get meaningful information I really need stats over a minute or more.

    Trying to run the command directly from the agent/proxy will time out due to the 30 second limit. I could schedule the commands as a cronjob on the host, but that requires extra manual config for each host rather than just calling a discovery routine.

    Is there a better way of doing this?

    I was toying with the idea of a script that launches the command and then reports back the output, but getting error information if it fails would be trickier.

    How have others dealt with long running tasks/commands?
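
    Roughly, the "launch and report back" wrapper I'm toying with would look something like the sketch below. It's only an illustration: the item keys (powervault.stats, powervault.stats.error), the server name and the stats command are made-up placeholders, and it assumes zabbix_sender is available on the host.

    ```python
    #!/usr/bin/env python3
    """Run a slow stats command and push the output (or the error) back to Zabbix.

    Meant to be started in the background, so nothing in Zabbix waits on it.
    """
    import socket
    import subprocess

    ZABBIX_SERVER = "zabbix.example.com"          # placeholder
    HOSTNAME = socket.gethostname()               # must match the Zabbix host name
    STATS_CMD = ["collect_powervault_stats.sh"]   # hypothetical long-running command

    def send(key, value):
        """Push one value to a trapper item via zabbix_sender."""
        subprocess.run(["zabbix_sender", "-z", ZABBIX_SERVER, "-s", HOSTNAME,
                        "-k", key, "-o", str(value)], check=False)

    def main():
        try:
            out = subprocess.run(STATS_CMD, capture_output=True, text=True,
                                 timeout=300, check=True)
            send("powervault.stats", out.stdout.strip())
            send("powervault.stats.error", "")    # clear any previous error
        except (subprocess.SubprocessError, OSError) as exc:
            # Failures go to a second trapper item, so error information is visible too.
            send("powervault.stats.error", str(exc))

    if __name__ == "__main__":
        main()
    ```
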
  • jan.garaj
    Senior Member
    Zabbix Certified Specialist
    • Jan 2010
    • 506

    #2
    I'm not sure if it's the best way: http://zabbix.org/wiki/Escaping_timeouts_with_atd
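
    The gist of that page is to queue the real work with atd so the check itself returns straight away. A rough sketch only (the worker path is a placeholder for whatever script actually collects the stats and reports back with zabbix_sender):

    ```python
    #!/usr/bin/env python3
    """Queue the slow collection job with atd so the agent/external check
    returns immediately instead of hitting the 30 second limit."""
    import subprocess
    import sys

    # Hypothetical worker that does the real collection and reports via zabbix_sender.
    WORKER = "/usr/local/bin/powervault_stats_wrapper.py"

    def main():
        # "at now" hands the job to atd, completely outside the check's timeout.
        proc = subprocess.run(["at", "now"], input=WORKER + "\n",
                              text=True, capture_output=True)
        print(1 if proc.returncode == 0 else 0)   # did the job get queued?
        return 0

    if __name__ == "__main__":
        sys.exit(main())
    ```
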
    Devops Monitoring Expert advice: Dockerize/automate/monitor all the things.
    My DevOps stack: Docker / Kubernetes / Mesos / ECS / Terraform / Elasticsearch / Zabbix / Grafana / Puppet / Ansible / Vagrant

    • Linwood
      Senior Member
      • Dec 2013
      • 398

      #3
      I've tried a number of experiments and all suffer from some problems, especially if you disconnect the polling process from zabbix, such as in cron.

      One way you can approach this, though I have only experimented with it slightly, is to:

      - Create an external check for an item that essentially does nothing (or returns success/failure based on the following step). This drives the polling.

      - Inside that external check, fork a new process twice (you have to fork twice: the second fork detaches the worker from the parent, which would otherwise take the worker down with it when the external check exits).

      - In the forked process, do your work, and return the values via zabbix traps which are asynchronous.

      If you do this, only one poll at a time runs disconnected from Zabbix (e.g. if you stop the server or remove a host/item), while the actual work is still freed from Zabbix's time limits. You do need to make sure that the poll frequency on the driving item is slower than the time the work takes to run, otherwise the forked processes will build up because they do not finish as fast as they start. You could add a check in the first process, pre-fork, that looks for a specific process name or some other coordination mechanism to ensure that (and indeed let it pass back failure to the otherwise pointless item, so you can trigger on it to show you have a timing issue).
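
      As a very rough sketch of what I mean (not something I run in production; the item keys, server name and stats command below are placeholders), the external check could look like this:

      ```python
      #!/usr/bin/env python3
      """External check that double-forks a worker and returns at once.

      The second fork orphans the worker (init adopts it), so neither the external
      check nor Zabbix's timeout waits on it; results come back via trapper items.
      """
      import os
      import socket
      import subprocess
      import sys

      ZABBIX_SERVER = "zabbix.example.com"          # placeholder
      HOSTNAME = socket.gethostname()
      STATS_CMD = ["collect_powervault_stats.sh"]   # hypothetical slow command

      def send(key, value):
          subprocess.run(["zabbix_sender", "-z", ZABBIX_SERVER, "-s", HOSTNAME,
                          "-k", key, "-o", str(value)], check=False)

      def worker():
          """Runs detached from the external check; reports asynchronously."""
          try:
              out = subprocess.run(STATS_CMD, capture_output=True, text=True, check=True)
              send("powervault.stats", out.stdout.strip())
          except (subprocess.SubprocessError, OSError) as exc:
              send("powervault.stats.error", str(exc))

      def main():
          pid = os.fork()
          if pid > 0:                # original process: this is the external check
              os.waitpid(pid, 0)     # reap the intermediate child straight away
              print(1)               # value for the otherwise "pointless" driving item
              return 0
          os.setsid()                # intermediate child: new session, then fork again
          if os.fork() > 0:
              os._exit(0)            # exit so the grandchild is adopted by init
          worker()                   # grandchild: do the slow work
          os._exit(0)

      if __name__ == "__main__":
          sys.exit(main())
      ```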

      • Jason
        Senior Member
        • Nov 2007
        • 430

        #4
        Currently I've got some cron jobs running and using user parameters on an agent. Whilst this is fine for monitoring a single host, if I want to monitor more than one then I figure I need to use the proxy and external checks. It's a shame that the proxies don't support UserParameter.

        I think that if I want it to work this way I'll have to use external checks that fork to do the actual run and then return results via the trapper, making sure I pass enough information to the external script to identify the return trapper items. I'll use a lock file whilst the script is running, and the external check can look at the existence and age of this file: if the file is there and its age is less than Y, another task is running; if the file exists and its age is greater than Y, the task has crashed or hung.
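
        To sketch the lock-file part (the path and the threshold Y below are just made-up values): the external check refuses to start another run while a fresh lock exists, and flags a crashed/hung run once the lock is older than Y. The forked worker would create the lock when it starts and remove it when it finishes.

        ```python
        #!/usr/bin/env python3
        """Lock-file guard for the forked collection job.

        Prints 1 when a new run was started, 0 when one is still in progress,
        and -1 when the previous run looks crashed or hung.
        """
        import os
        import time

        LOCK_FILE = "/var/tmp/powervault_stats.lock"   # placeholder path
        MAX_AGE = 300                                   # "Y" in seconds; made-up value

        def main():
            if os.path.exists(LOCK_FILE):
                age = time.time() - os.path.getmtime(LOCK_FILE)
                if age < MAX_AGE:
                    print(0)               # another run is still in progress
                    return
                print(-1)                  # lock older than Y: previous run crashed or hung
                os.remove(LOCK_FILE)       # clear the stale lock for the next cycle
                return
            with open(LOCK_FILE, "w") as fh:
                fh.write(str(os.getpid()))
            # ... double-fork and start the actual collection here, as in post #3;
            # the worker must remove LOCK_FILE when it finishes.
            print(1)

        if __name__ == "__main__":
            main()
        ```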
