I am using Zabbix to monitor Docker containers on Docker hosts. I have a script on the Docker host which queries Docker and returns values. It is called from the Zabbix agent using the configuration below.
The scripts were taken from https://github.com/digiapulssi/zabbi...toring-scripts
Code:
UserParameter=docker.containers.discovery,/etc/zabbix/scripts/docker.sh discovery
UserParameter=docker.containers.count,/etc/zabbix/scripts/docker.sh count
UserParameter=docker.containers.discovery.all,/etc/zabbix/scripts/docker.sh discovery_all
UserParameter=docker.containers.count.all,/etc/zabbix/scripts/docker.sh count_all

# First parameter: container id
# Second parameter: one of netin, netout, cpu, disk, memory, uptime, up or status
UserParameter=docker.containers[*],/etc/zabbix/scripts/docker.sh "$1" "$2"
UserParameter=docker.status[*],/etc/zabbix/scripts/docker.sh "$1" "$2"

#######################################################################
# Compatibility with www.monitoringartist.com docker templates
UserParameter=docker.discovery,/etc/zabbix/scripts/docker.sh discovery
UserParameter=docker.up[*],/etc/zabbix/scripts/docker.sh "$1" up
# Ignore the second argument for docker.cpu (system vs user)
UserParameter=docker.cpu[*],/etc/zabbix/scripts/docker.sh "$1" cpu
# Ignore the second argument for docker.mem (total_cache vs total_rss vs total_swap)
UserParameter=docker.mem[*],/etc/zabbix/scripts/docker.sh "$1" memory
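For illustration, this is roughly how I understand the agent expands a flexible key such as docker.containers[*] into a call to the script. It is a simplified Python sketch, not Zabbix code: the function name expand_user_parameter is made up, it assumes whitespace around unquoted parameters is trimmed, and it does not handle quoted parameters.

```python
import re


def expand_user_parameter(command_template, item_key):
    """Replace $1..$9 in command_template with the parameters of item_key."""
    match = re.fullmatch(r"[^\[\]]+\[(.*)\]", item_key)
    if match is None:
        raise ValueError(f"not a parameterised key: {item_key!r}")
    # Simplification: split on commas and trim whitespace.
    # Quoted parameters containing commas are not supported here.
    params = [p.strip() for p in match.group(1).split(",")]

    def substitute(ref):
        index = int(ref.group(1)) - 1
        # A reference to a parameter that was not supplied expands to nothing.
        return params[index] if index < len(params) else ""

    return re.sub(r"\$([1-9])", substitute, command_template)


command = expand_user_parameter(
    '/etc/zabbix/scripts/docker.sh "$1" "$2"',
    "docker.containers[eager_blackwell, up]",
)
print(command)  # /etc/zabbix/scripts/docker.sh "eager_blackwell" "up"
```

So the first parameter of the key (the container name) and the second (the metric) end up as the two arguments of docker.sh.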
I added a template with a discovery rule to discover new containers, and a trigger prototype to alert when a container goes down. I created a media type of type Script and use it in an action to send alerts to an external web server.
My problem is that when one container goes down, I get an alert for every container among the discovered hosts.
The image shows the status before eager_blackwell goes down. A status of 1 means the container is up, and 0 means it is down.
When container eager_blackwell goes down, this is the alert I expect (this JSON is given as the Default Message in Action -> Operation):
Code:
{
"eventId": "5981",
"eventTime": "12:44:48",
"itemValue": "0",
"hostName": "eager_blackwell",
"triggerSeverity": "High",
"eventDate": "2018.04.20",
"triggerId": "17315",
"itemKey": "docker.containers[eager_blackwell, up]",
"triggerName": "Docker container down",
"itemName": "Container eager_blackwell up:",
"triggerUrl": "",
"triggerStatus": "PROBLEM"
}
Please note that hostName and itemKey refer to the same host.
Along with this, I also get alerts for the other containers which are present in Docker (or which were removed previously but whose "Keep lost resources period" is not over yet), like the two samples below.
This container is no longer present on the Docker host, but Zabbix still keeps its state because its "Keep lost resources period" is not over:
Code:
{
"eventId": "5988",
"eventTime": "12:44:56",
"itemValue": "0",
"hostName": "happy_franklin",
"triggerSeverity": "High",
"eventDate": "2018.04.20",
"triggerId": "17290",
"itemKey": "docker.containers[eager_blackwell, up]",
"triggerName": "Docker container down",
"itemName": "Container eager_blackwell up:",
"triggerUrl": "",
"triggerStatus": "PROBLEM"
}
As you can see, the hosts in hostName and itemKey are different.
This one is present on the Docker host and is still online, as can be seen from the screenshot above:
Code:
{
"eventId": "5989",
"eventTime": "12:45:02",
"itemValue": "0",
"hostName": "tender_pasteur",
"triggerSeverity": "High",
"eventDate": "2018.04.20",
"triggerId": "17343",
"itemKey": "docker.containers[eager_blackwell, up]",
"triggerName": "Docker container down",
"itemName": "Container eager_blackwell up:",
"triggerUrl": "",
"triggerStatus": "PROBLEM"
}
Overall I received 8 alerts while I expected only one.
I want the alert to be generated only if hostName and itemKey refer to the same host. I am not sure how to specify that.
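To make the condition concrete, here is a minimal Python sketch of the check I have in mind, written as if it ran on the receiving web server rather than inside Zabbix. The function names and the regular expression are only my illustration (they are not part of Zabbix or of the scripts above), and it only understands docker.containers[...] keys.

```python
import json
import re

# Matches keys such as "docker.containers[eager_blackwell, up]" and captures
# the first parameter (the container name).
ITEM_KEY_RE = re.compile(r"^docker\.containers\[\s*([^,\]\s]+)\s*,")


def container_from_item_key(item_key):
    """Return the container name in a docker.containers[...] key, or None."""
    match = ITEM_KEY_RE.match(item_key)
    return match.group(1) if match else None


def should_forward(alert_json):
    """True only when hostName equals the container named in itemKey."""
    alert = json.loads(alert_json)
    container = container_from_item_key(alert.get("itemKey", ""))
    return container is not None and container == alert.get("hostName")


expected = '{"hostName": "eager_blackwell", "itemKey": "docker.containers[eager_blackwell, up]"}'
unwanted = '{"hostName": "happy_franklin", "itemKey": "docker.containers[eager_blackwell, up]"}'
print(should_forward(expected))  # True
print(should_forward(unwanted))  # False
```

This would only hide the extra alerts after they are sent. What I am really looking for is a way to express the same condition in Zabbix itself.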
My trigger prototype is shown below.
I have also attached the template as xml for review. Any help to get this working would be much appreciated.