Hope you guys can help me out.
We have about 2000 hosts. Roughly 90% of them are monitored with the Zabbix agent using active checks, and the rest are mainly printers and network equipment monitored over SNMP. We work in shifts, which means that 400 hosts or more can go offline per shift. For example, when the late shift starts, the people from the morning shift shut down their forklift terminals so the forklifts can charge until the next morning shift begins. During these "offline" periods the Zabbix queue grows quite large because item values are not being collected.
If we disable these hosts the queue goes back to normal, of course, but we cannot do this manually each time. Is there any way to prevent the queue from growing during these offline periods? I have looked at maintenance rules, but they don't work for us because we do not know in advance which devices will go offline.
I have tried writing a script using the API that fires on the trigger "Zabbix agent not available" and puts the host in the disabled state. The problem is that the host cannot then be re-enabled automatically: once it is disabled, the trigger can never recover. I thought auto-registration would take care of this if I added "enable host" to the operations, but this doesn't work either. It seems that once the agent has registered, it doesn't re-register when it comes back online.
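For reference, the disable step in my script boils down to something like the sketch below. This is a simplified illustration, not the exact script: the function name `build_host_status_request` and the host ID are made up for the example, and it only builds the JSON-RPC body for `host.update`. How the request is authenticated and sent depends on the Zabbix version, so that part is left out.

```python
import json

# Zabbix host "status" values as used by the host object in the API.
HOST_MONITORED = 0    # host is enabled
HOST_UNMONITORED = 1  # host is disabled


def build_host_status_request(host_id, enabled, request_id=1):
    """Build the JSON-RPC body that enables or disables one host.

    Hypothetical helper for illustration; it does not send anything.
    """
    return {
        "jsonrpc": "2.0",
        "method": "host.update",
        "params": {
            "hostid": str(host_id),
            "status": HOST_MONITORED if enabled else HOST_UNMONITORED,
        },
        "id": request_id,
    }


if __name__ == "__main__":
    # "10084" is a placeholder host ID, not one of our real hosts.
    body = build_host_status_request("10084", enabled=False)
    print(json.dumps(body, indent=2))
```

The same call with `enabled=True` would re-enable the host. What I am missing is something that can decide when to send it, since the trigger on a disabled host never recovers.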
Thanks!
Geoff.