Too many "Server xxx is unreachable" messages
I'm monitoring many servers across the web with a central "zabbix server" running zabbix_suckerd; the monitored servers are running zabbix_agentd.
I'm looking for a way to reduce the "host unreachable" sensitivity, because our connections are not very reliable.
My idea is to change the trigger "Server xxx is unreachable" so that it goes "ON" only when more than one failure has occurred in a row.
How can I do that?
Thanks ;)
I think you can use something like this,
for n consecutive failed values:
(({HOSTNAME:key.last(0)}=0)&({HOSTNAME:key.last(1)}=0)&...&({HOSTNAME:key.last(n)}=0))
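Writing the n terms out by hand gets error-prone, so here is a minimal sketch of a helper that builds the expression above. The function name `consecutive_trigger` and its parameters are made up for illustration, not part of Zabbix. It only generates the string; whether `last(i)` really returns the i-th previous value depends on your Zabbix version, and nothing in this thread confirms it.

```python
def consecutive_trigger(host: str, key: str, value: int, count: int) -> str:
    """Build a trigger expression that is true only when the last
    `count` values of host:key all equal `value`.

    Hypothetical helper: it just formats text, it does not talk to Zabbix.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    # One term per sample: last(0) is the newest, last(count - 1) the oldest.
    terms = [
        "({%s:%s.last(%d)}=%d)" % (host, key, i, value)
        for i in range(count)
    ]
    # Join with '&' (logical AND) and wrap the whole thing in parentheses.
    return "(" + "&".join(terms) + ")"


if __name__ == "__main__":
    print(consecutive_trigger("HOSTNAME", "key", 0, 3))
```

Paste the printed string into the trigger expression field. A larger `count` means a longer wait before the trigger goes ON.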
Thanks
I'm trying:
(({_TEMPLATE.Unix:status.last(0)}=2)&({_TEMPLATE.Unix:status.last(1)}=2)&({_TEMPLATE.Unix:status.last(2)}=2)&({_TEMPLATE.Unix:status.last(3)}=2)&({_TEMPLATE.Unix:status.last(4)}=2))
If it works the way I want, I'll post a confirmation message.
Hi !
How did you configure status? As an internal check or as an agent check? Does it work?
I can't get it to work =:-/ The trigger doesn't show up when I shut down the host.
[]'s
See the following post: http://www.zabbix.com/forum/showthread.php?t=597&highlight=unreachable I think this is what you are referring to. I never did manage to get it to work. What we do instead is run a perl script every minute from a cron job; it returns a value if it can connect to our server's database and query a value.
I created a UserParameter on the zabbix server that checks this value every 30 secs. The trigger has a delay of 240 secs, so it waits that long before raising an alert. The trigger looks like:
({hostname:userparameter.last(0)}=0)&({hostname:userparameter.delta(240)}=0)
johnl
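To make the effect of that 240 sec delay easier to see, here is a rough sketch that evaluates the same two conditions over a list of samples. It assumes `delta(240)` means "max minus min over the last 240 seconds" and that the item is polled every 30 secs; the function names and the sample data are invented for illustration.

```python
from typing import List, Tuple

Sample = Tuple[int, int]  # (timestamp in seconds, value); oldest first


def last(samples: List[Sample]) -> int:
    """Most recent value."""
    return samples[-1][1]


def delta(samples: List[Sample], now: int, window: int) -> int:
    """Max minus min of the values seen in the last `window` seconds
    (assumed meaning of delta(window))."""
    recent = [v for t, v in samples if now - window <= t <= now]
    return max(recent) - min(recent)


def trigger_fires(samples: List[Sample], now: int, window: int = 240) -> bool:
    """True when the newest value is 0 AND nothing changed during the window,
    i.e. the value has been 0 for the whole window."""
    return last(samples) == 0 and delta(samples, now, window) == 0


if __name__ == "__main__":
    # Check failed only for the last two polls: no alert yet.
    short_outage = [(t, 1) for t in range(0, 180, 30)] + [(180, 0), (210, 0)]
    print(trigger_fires(short_outage, now=210))

    # Check has failed on every poll inside the 240 sec window: alert.
    long_outage = [(0, 1)] + [(t, 0) for t in range(30, 300, 30)]
    print(trigger_fires(long_outage, now=270))
```

With a single good value still inside the window, the delta is non-zero and the trigger stays off, which is what filters out short blips.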
Hi.
I think I found the answer. Well ... there is one answer for the two problems I had :)
1st - the host goes to unreachable after 60 sec of downtime.
2nd - the host bounces between OK and DOWNTIME several times in the triggers list, even though the host is down the whole time.
I found the solution here (http://sourceforge.net/forum/forum.php?thread_id=1131433&forum_id=74299). In summary, you must patch db.c, changing:
snprintf(sql,sizeof(sql)-1,"select distinct t.triggerid from hosts h,items i,triggers t,functions f where f.triggerid=t.triggerid and f.itemid=i.itemid and h.hostid=i.hostid and h.hostid=%d and i.key_<>'%s'",hostid,SERVER_STATUS_KEY);
to:
snprintf(sql,sizeof(sql)-1,"select distinct t.triggerid from hosts h,items i,triggers t,functions f where f.triggerid=t.triggerid and f.itemid=i.itemid and h.hostid=i.hostid and h.hostid=%d and i.key_ not in ('%s','%s','%s')",hostid,SERVER_STATUS_KEY, SERVER_ICMPPING_KEY, SERVER_ICMPPINGSEC_KEY);
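If you want to see what the changed WHERE clause does before rebuilding the server, here is a small sketch that runs both queries against a toy in-memory SQLite database. Everything in it is an assumption made for illustration: the table layout is cut down to the columns the query touches, the key strings behind the three C constants are guessed, and the item and trigger ids are invented. It only shows which trigger ids each query selects; what db.c then does with them is not covered in this thread.

```python
import sqlite3

# Assumed values of the C constants; the thread only gives their names.
SERVER_STATUS_KEY = "status"
SERVER_ICMPPING_KEY = "icmpping"
SERVER_ICMPPINGSEC_KEY = "icmppingsec"

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hosts (hostid INTEGER);
CREATE TABLE items (itemid INTEGER, hostid INTEGER, key_ TEXT);
CREATE TABLE triggers (triggerid INTEGER);
CREATE TABLE functions (triggerid INTEGER, itemid INTEGER);
INSERT INTO hosts VALUES (1);
-- one item per key, plus an ordinary (made-up) agent item
INSERT INTO items VALUES (10, 1, 'status');
INSERT INTO items VALUES (11, 1, 'icmpping');
INSERT INTO items VALUES (12, 1, 'system.cpu.load');
INSERT INTO triggers VALUES (100);
INSERT INTO triggers VALUES (101);
INSERT INTO triggers VALUES (102);
-- trigger 100 uses item 10, 101 uses 11, 102 uses 12
INSERT INTO functions VALUES (100, 10);
INSERT INTO functions VALUES (101, 11);
INSERT INTO functions VALUES (102, 12);
""")

# Same joins as the query in db.c, with placeholders instead of snprintf.
BASE = (
    "select distinct t.triggerid from hosts h,items i,triggers t,functions f "
    "where f.triggerid=t.triggerid and f.itemid=i.itemid "
    "and h.hostid=i.hostid and h.hostid=? and "
)


def selected(where: str, params: tuple) -> list:
    """Return the sorted trigger ids picked up for host 1."""
    rows = conn.execute(BASE + where, (1,) + params).fetchall()
    return sorted(row[0] for row in rows)


original = selected("i.key_<>?", (SERVER_STATUS_KEY,))
patched = selected(
    "i.key_ not in (?,?,?)",
    (SERVER_STATUS_KEY, SERVER_ICMPPING_KEY, SERVER_ICMPPINGSEC_KEY),
)

if __name__ == "__main__":
    print("original:", original)
    print("patched: ", patched)
```

In this toy data the original query still picks up the trigger built on the ping item, while the patched one leaves it out along with the status trigger.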
[]'s