I've been trying to get some monitoring working within a cluster for some time now, and something that seems like it should be easy is giving me a hell of a time. I'm using Zabbix 3.0.6. The nodes are running Red Hat.
Within the cluster, each node has to be able to resolve and ping 5 hostnames mapped in /etc/hosts. To test whether the nodes are reported as unreachable, I comment those mappings out, then restore them to clear any alarms.
For my item, I originally used agent.ping but quickly learned that it only checks the agent itself (a self ping). I switched to system.run[ping <hostname> | head -2]. However, when I comment out the mapping in /etc/hosts, the item gets grayed out in the Monitoring > Latest data tab (not sure if this is why my tests haven't gone well).
I've tried a few different triggers:
1) .str(ping: unknown host <hostname>)}=1 I wanted this one to look at the output of a failed ping and fire if that particular string appeared in it.
2) nodata(300)}=1 I wanted this one to fire after 5 minutes without data, but nodata() only seems to work well with the agent.ping item.
3) iregexp(@ping: unknown host)}=1 I didn't build this expression; it was the original trigger, which wasn't working. I couldn't figure out the difference between iregexp and regexp.
I was also wondering if I could change the head in my system.run ping item from 2 to 3. When the hostname isn't commented out, it should return 3 lines of text. When the hostname is commented out, it should return 1 line of text. Maybe there's a trigger that can count the lines of output? Or I could pipe the output through 'wc -l' and compare the resulting integer: 3 when the hostname is enabled and 1 when it is disabled.
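To make the line-count idea concrete, here is a rough sketch of what I have in mind. The function name count_head_lines is something I made up for illustration, and I'm assuming two things that aren't in my current item: ping's -c 1 option, so the command sends one packet and exits, and 2>&1, so the "unknown host" error line is counted along with normal output. I haven't confirmed that this behaves the same way when the agent runs it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Zabbix): run the given command, merge
# stderr into stdout, keep at most the first 3 lines, and print how many
# lines there were.
count_head_lines() {
    # tr strips any padding that some wc implementations add around the number
    "$@" 2>&1 | head -3 | wc -l | tr -d ' '
}

# Intended use (commented out so nothing is pinged when the file is sourced):
# count_head_lines ping -c 1 <hostname>
```

If this works the way I expect, the item could become a numeric one along the lines of system.run[ping -c 1 <hostname> 2>&1 | head -3 | wc -l], and the trigger could be something like .last()}<3 instead of a string match. That part is a guess on my side.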
This is kind of rambling and I'm not sure I got my question across, so if you need any clarification, feel free to ask.