I have a (hopefully quick) question about web.page.regexp and timeouts. My trigger is shown under "Code:" below.
If the check times out, we don't get data (as expected, hence the .nodata clause). However, the .nodata clause clears as soon as the server starts sending data again, even if it's responding with EOF (which is still a failure). This results in false recoveries (the trigger clears while the check is still failing). We need the trigger to stay TRUE when .nodata has fired and data then comes back but is still failing. (I've played around with nodata at different time intervals, but it doesn't fix this situation.)
Here is a timeline
T1. Nodata seen for web.page.regexp
T2. Trigger fires on .nodata clause
T3. web.page.regexp returns data; value = EOF
T4. Nodata clears, all actions cleared, escalation clears
T5. web.page.regexp still returns EOF
T6. web.page.regexp still returns EOF
T7. Trigger fires on the .count(#3,EOF) clause, alerts sent
T8. People get escalations and fix it
T9. All triggers clear, we are happy
My issue is at T4 and the time until T7: even though the trigger clears, it's not really fixed. I think changing .count(#3,EOF)}=3 to .count(#1,EOF)}=1 would do it, but this could also introduce false positives depending on the load on the server or quick restarts of the app. Is there anything else I can try?
Code:
item interval = 120 secs
{server:web.page.regexp[server,/path/monitor.display,8080,"Smoke test = success"].count(#3,EOF)}=3
|
{server:web.page.regexp[server,/path/monitor.display,8080,"Smoke test = success"].nodata(360)}=1
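To make the gap in the timeline concrete, here is a rough Python sketch (not Zabbix code) that replays a poll sequence against a simplified model of the two clauses. The function name trigger_states, the sample list, and the way nodata() and count() are modelled are assumptions for illustration only; real Zabbix evaluation timing may differ.

```python
INTERVAL = 120        # item interval in seconds, as in the post
NODATA_WINDOW = 360   # window used in nodata(360)


def trigger_states(samples, count_n=3):
    """Replay polls and return one (time, nodata, count_hit, fired) tuple per poll.

    samples: one entry per poll; None means the check timed out and
    no value was stored for that poll.
    """
    history = []  # (timestamp, value) for stored values only
    states = []
    for i, value in enumerate(samples):
        now = i * INTERVAL
        if value is not None:
            history.append((now, value))
        # Simplified nodata(360)=1: no value stored within the last 360 seconds.
        nodata = not any(now - ts < NODATA_WINDOW for ts, _ in history)
        # Simplified count(#N,EOF)=N: the last N stored values all contain "EOF".
        last = [v for _, v in history[-count_n:]]
        count_hit = len(last) == count_n and all("EOF" in v for v in last)
        states.append((now, nodata, count_hit, nodata or count_hit))
    return states


if __name__ == "__main__":
    OK = "Smoke test = success"
    # Three good polls, three timeouts, three EOF responses, then recovery.
    polls = [OK, OK, OK, None, None, None, "EOF", "EOF", "EOF", OK]
    for n in (3, 1):
        print("count_n =", n)
        for now, nodata, count_hit, fired in trigger_states(polls, count_n=n):
            print(f"  t={now:4d}s nodata={nodata} count_hit={count_hit} fired={fired}")
```

In this model, with count_n=3 the trigger reads OK for the two polls after the first EOF arrives, which corresponds to the T4 to T7 window. With count_n=1 that window disappears, but a single EOF poll between two good ones is enough to fire the trigger.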