Hi all,
I have a problem with Zabbix Auto Registration actions.
I have a set of compute nodes I wish to register in my zabbix server.
First some definitions:
a) The nodes are nodeXX.sub.dom.tld where 00<=XX<=99
b) The nodes have IP addresses 10.10.0.YY where YY=XX+10
c) The 10.10/24 network goes through the 10.7/24 network (over 10.10.0.1/10.7.0.1 )
d) The 10.7/24 network connects to the 10.1/24 network where the zabbix server (10.1.0.1) lives.
e) Communication from the zabbix server to node49 (our test subject in this post) works, as proven by running "echo system.uptime | nc node49.sub.dom.tld 10050" on the zabbix server and getting "ZBXD1407237" back.
f) Communication from node49 to the zabbix server also works, as proven by -> echo "ZBXDi{"request":"sender data","data":[{"host":"node49.sub.dom.tld","key":"system.uptime" ,"value":"1"}]}" | nc 10.1.0.1 10051 <- which returns "OK".
g) The ACTION is defined as such:
cond 1) Host metadata like: Linux
cond 2) Host metadata like: 00000000-0000-0000-0000-000000000000
cond 3) Host name like: node*.sub.dom.tld
and
oper 1) Send message to user groups: Zabbix administrators via all media
oper 2) Link to templates: Template OS Linux Cluster
h) Template OS Linux Cluster belongs to the group Cluster Nodes and incorporates the Template OS Linux template.
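To make the naming and addressing scheme in (a) and (b) concrete, here is a minimal Python sketch. The function names node_name and node_ip are made up for illustration and are not part of Zabbix or of my setup; they only restate the two rules above.

```python
def node_name(xx: int) -> str:
    """Host name for node number xx, per rule (a): nodeXX.sub.dom.tld."""
    if not 0 <= xx <= 99:
        raise ValueError("node number must be between 0 and 99")
    return f"node{xx:02d}.sub.dom.tld"


def node_ip(xx: int) -> str:
    """IP address for node number xx, per rule (b): last octet is XX + 10."""
    if not 0 <= xx <= 99:
        raise ValueError("node number must be between 0 and 99")
    return f"10.10.0.{xx + 10}"


if __name__ == "__main__":
    # node49 is the test subject used throughout this post.
    print(node_name(49), node_ip(49))  # node49.sub.dom.tld 10.10.0.59
```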
So, with the all-important tl;dr part behind us, the message I get in the Zabbix server log is:
3814:20180521:222219.923 cannot send list of active checks to "10.7.0.1": host [node49.sub.dom.tld] not found
But here we already know that the request came from node49, correctly identifying itself as node49.sub.dom.tld and resolvable as 10.10.0.59.
So, WHY does 10.7.0.1 show up there as the machine targeted for the active checks - and why is the auto registration failing?
I am, for lack of a better word, flabbergasted, and would be immensely grateful if someone could help out.
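P.S. For anyone who wants to reproduce the test in (f) without hand-typing the binary header, below is a rough Python sketch of how I understand the packet framing. It assumes the header is "ZBXD", a 0x01 flag byte, then an 8-byte little-endian payload length, followed by the JSON body. The helper names build_sender_packet and parse_packet are my own invention, and the sketch only builds and decodes the bytes; it does not open a connection to 10.1.0.1:10051.

```python
import json
import struct

# Assumed header: protocol tag "ZBXD" followed by a 0x01 flag byte.
ZBX_HEADER = b"ZBXD\x01"


def build_sender_packet(host: str, key: str, value: str) -> bytes:
    """Frame a 'sender data' request like the one used in test (f)."""
    payload = json.dumps(
        {
            "request": "sender data",
            "data": [{"host": host, "key": key, "value": value}],
        }
    ).encode("utf-8")
    # Payload length as an 8-byte little-endian unsigned integer (assumption).
    return ZBX_HEADER + struct.pack("<Q", len(payload)) + payload


def parse_packet(packet: bytes) -> dict:
    """Decode a framed packet back into its JSON body."""
    if packet[:5] != ZBX_HEADER:
        raise ValueError("missing ZBXD header")
    (length,) = struct.unpack("<Q", packet[5:13])
    body = packet[13:13 + length]
    if len(body) != length:
        raise ValueError("truncated payload")
    return json.loads(body.decode("utf-8"))


if __name__ == "__main__":
    pkt = build_sender_packet("node49.sub.dom.tld", "system.uptime", "1")
    print(parse_packet(pkt)["data"][0]["host"])
```

The resulting bytes could then be written to a TCP socket instead of being piped through nc, but I have left that part out so the sketch stays independent of my network.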