I have a monitoring problem that I would like some help with. Part of my environment is a small number of binaries, each of which runs as daemon processes against a large number of database schemas. The challenge is knowing which process has a problem without killing the server with monitoring!
I have tried a single item/trigger pair per binary that counts the number of running processes and alerts when that number changes. It is light on processor resources, but it doesn't give much diagnostic information or help in fixing the problem.
I have also tried a *LOT* of proc.num[binary,,,parm] item/trigger pairs that will tell me which process went down, but the processor gets hit harder, the entries are tedious to set up, and you can easily get tens of actions firing at once under this scenario.
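To make the two approaches concrete, here is a rough Python sketch of what each one measures. It is only an illustration, not my real setup: the binary names (`loaderd`, `reportd`), the schema names, and the assumption that each daemon takes its schema as a command-line argument are all made up, and the functions work on a plain list of command lines rather than on anything from the agent.

```python
from collections import Counter


def binary_name(cmdline):
    """Return the executable's base name from a command line, or None if empty."""
    parts = cmdline.split()
    return parts[0].rsplit("/", 1)[-1] if parts else None


def count_by_binary(cmdlines, binaries):
    """First approach: one number per binary. Cheap, but says nothing about which schema."""
    counts = Counter(binary_name(c) for c in cmdlines)
    return {b: counts[b] for b in binaries}


def missing_instances(cmdlines, expected):
    """Second approach: report every expected (binary, schema) pair that is not running.

    Assumes (hypothetically) that the schema appears as a separate argument.
    """
    running = set()
    for cmd in cmdlines:
        parts = cmd.split()
        if not parts:
            continue
        name = binary_name(cmd)
        for arg in parts[1:]:
            running.add((name, arg))
    return sorted(set(expected) - running)


# Hypothetical snapshot of the process list: reportd for schema_b has died.
snapshot = [
    "/opt/app/loaderd schema_a",
    "/opt/app/loaderd schema_b",
    "/opt/app/reportd schema_a",
]
expected = [
    ("loaderd", "schema_a"),
    ("loaderd", "schema_b"),
    ("reportd", "schema_a"),
    ("reportd", "schema_b"),
]

print(count_by_binary(snapshot, ["loaderd", "reportd"]))
print(missing_instances(snapshot, expected))
```

The first function only tells me that `reportd` dropped from 2 to 1; the second names the schema, which is the information I actually need, but in my setup it costs one item/trigger pair per (binary, schema) combination.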
Any thoughts on how to best monitor an environment like this?
Stanzoid