Sorry if this has been answered already, but I still can't find any good documentation on how to set up a proxy, get the hosts talking to it, and then have the proxy send alerts to the main server. I have been testing Zabbix for a while now and I can't seem to get the proxy to work; one to one it works great. I have read the Zabbix documentation, but the proxy information was lacking, to say the least. To give an idea of the setup we will be running: we are a consulting firm that will be monitoring our clients, and that's where the proxy comes in. In my testing I have both the server and the proxy on Ubuntu 9.04; the Zabbix version is 1.6.1-3.
Looking for a zabbix proxy expert
-
I am by no means an expert, but here are two posts that should help point you in the right direction.
First, the compile instructions (Ubuntu 8.04) for using the proxy:
Second, a quick and dirty config for the proxies:
-
I'm not going to look it up; I'm just going from memory.
From what I recall, you basically set up a proxy much as you would an active agent, and configure the agents to use the proxy as you would the Zabbix server. On the Zabbix server you assign the host to a proxy.
That's pretty much it! It works wonderfully.
-
So far I have tried all that with no luck, and I feel I am just missing something simple that I am overlooking. The main issue I am having right now is that when I switch a host to be monitored by the proxy in the GUI, it never changes to offline, even if the NIC is turned off or the host is on a different subnet. I have changed the update time from two hours to 2 minutes, but even that doesn't matter: last night I moved the proxy to a different subnet than the one my test machine was on, and the GUI never updated. So my direct questions are the following:
1. How does the main server know to receive the data from the proxy, since there is no way to set any info except a name?
2. When adding a host that will be monitored by a proxy, which IP do you use: the proxy's or the actual IP of the machine?
3. Can the proxy be on the same subnet as the main server for testing?
-
1. In the proxy config you set a server hostname/IP address; this is your main server. The proxy 'name' must be configured both in the proxy config and on the main server (this is how the proxy is identified).
2. Clients should have the proxy configured as their 'Server'.
3. Yes, this should pose no problem.
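The three answers above can be written down as a small consistency check. This is only a sketch: `parse_conf` and `check_wiring` are hypothetical helper names (not part of Zabbix), the addresses are example values, and it assumes each `Server=` line holds a single address.

```python
def parse_conf(text):
    """Parse 'Key=Value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def check_wiring(proxy_conf, agent_conf, server_ip, proxy_ip, gui_proxy_name):
    """Return a list of problems; an empty list means the wiring looks consistent."""
    problems = []
    if proxy_conf.get("Server") != server_ip:
        problems.append("proxy Server= should point at the main server")
    if proxy_conf.get("Hostname") != gui_proxy_name:
        problems.append("proxy Hostname= should match the proxy name in the GUI")
    if agent_conf.get("Server") != proxy_ip:
        problems.append("agent Server= should point at the proxy")
    return problems


# Example values only; substitute your own addresses and names.
proxy_conf = parse_conf("Server=192.168.1.12\nHostname=proxy\n#ListenPort=10051\n")
agent_conf = parse_conf("Server=192.168.1.13\nHostname=famlpt\n")

problems = check_wiring(proxy_conf, agent_conf,
                        server_ip="192.168.1.12",
                        proxy_ip="192.168.1.13",
                        gui_proxy_name="proxy")
print(problems or "wiring looks consistent")
```

The check only compares strings, so it cannot tell whether the daemons were restarted after a change or whether the network path actually works.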
-
I think that is how I have it set up, but here are the config files, along with the last bit of the server log (I removed the commented-out text to make it fit).
zabbix_server.conf:
NodeID=1
#StartPollers=5
#StartPollersUnreachable=1
#StartTrappers=5
#StartPingers=1
#StartDiscoverers=1
#StartHTTPPollers=1
ListenPort=10051
ListenIP=192.168.1.12
#HousekeepingFrequency=1
SenderFrequency=30
#DisableHousekeeping=1
DebugLevel=3
Timeout=5
#TrapperTimeout=5
#UnreachablePeriod=45
#UnavailableDelay=15
#UnavailableDelay=60
PidFile=/var/run/zabbix-server/zabbix_server.pid
LogFile=/var/log/zabbix-server/zabbix_server.log
#LogFileSize=1
AlertScriptsPath=/etc/zabbix/alert.d/
#FpingLocation=/usr/sbin/fping
#PingerFrequency=60
DBHost=localhost
DBName=zabbix
DBUser=zabbix
DBPassword=*********
#DBSocket=/tmp/mysql.sock
zabbix_proxy.conf:
Server=192.168.1.12
ServerPort=10051
Hostname=proxy
#StartPollers=5
#StartIPMIPollers=0
#StartPollersUnreachable=1
#StartTrappers=5
#StartPingers=1
#StartDiscoverers=1
#StartHTTPPollers=1
#ListenPort=10051
#SourceIP=192.168.1.1
ListenIP=192.168.1.13
#HeartbeatFrequency=60
ConfigFrequency=120
#HousekeepingFrequency=1
#SenderFrequency=30
#ProxyLocalBuffer=0
#ProxyOfflineBuffer=1
#DebugLevel=3
Timeout=5
#TrapperTimeout=5
#UnreachablePeriod=45
#UnavailableDelay=15
#UnavailableDelay=60
PidFile=/var/run/zabbix-proxy/zabbix_proxy.pid
LogFile=/var/log/zabbix-proxy/zabbix_proxy.log
#LogFileSize=1
AlertScriptsPath=/home/zabbix/bin/ "This path does not exist. Is it needed to send the alerts to the server?"
#ExternalScripts=/etc/zabbix/externalscripts
#FpingLocation=/usr/sbin/fping
#Fping6Location=/usr/sbin/fping6
#TmpDir=/tmp
#PingerFrequency=60
DBHost=localhost
DBName=zabbix_proxy
DBUser=zabbix_proxy
DBPassword=**********
#DBSocket=/tmp/mysql.sock
zabbix_agentd.conf:
Server=192.168.1.13
ServerPort=10051
Hostname=famlpt
ListenPort=10050
#ListenIP=127.0.0.1
#SourceIP=
StartAgents=3
#RefreshActiveChecks=120
#DisableActive=1
#DisablePassive=1
#EnableRemoteCommands=1
DebugLevel=3
#PidFile=/var/tmp/zabbix_agentd.pid
LogFile=/tmp/zabbix_agentd.log
#LogFileSize=1
Timeout=3
####### USER-DEFINED MONITORED PARAMETERS #######
# Format: UserParameter=<key>,<shell command>
# Note that shell command must not return empty string or EOL only
#UserParameter=system.test,who|wc -l
### Set of parameter for monitoring MySQL server (v3.23.42 and later)
### Change -u<username> and add -p<password> if required
#UserParameter=mysql.ping,mysqladmin -uroot ping|grep alive|wc -l
#UserParameter=mysql.uptime,mysqladmin -uroot status|cut -f2 -d":"|cut -f1 -d"T"
#UserParameter=mysql.threads,mysqladmin -uroot status|cut -f3 -d":"|cut -f1 -d"Q"
#UserParameter=mysql.questions,mysqladmin -uroot status|cut -f4 -d":"|cut -f1 -d"S"
#UserParameter=mysql.slowqueries,mysqladmin -uroot status|cut -f5 -d":"|cut -f1 -d"O"
#UserParameter=mysql.qps,mysqladmin -uroot status|cut -f9 -d":"
#UserParameter=mysql.version,mysql -V
zabbix_server log
30208:20090508:064810 Unknown proxy "proxy"
30208:20090508:064811 Unknown proxy "proxy"
30208:20090508:064812 Unknown proxy "proxy"
30208:20090508:064813 Unknown proxy "proxy"
30204:20090508:064814 Unknown proxy "proxy"
30207:20090508:064815 Unknown proxy "proxy"
30206:20090508:064816 Unknown proxy "proxy"
30205:20090508:064817 Unknown proxy "proxy"
30207:20090508:064818 Unknown proxy "proxy"
30206:20090508:064819 Unknown proxy "proxy"
30205:20090508:064821 Unknown proxy "proxy"
30204:20090508:064822 Unknown proxy "proxy"
30208:20090508:064823 Unknown proxy "proxy"
30204:20090508:064824 Unknown proxy "proxy"
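The repeated "Unknown proxy" lines suggest the server is being contacted by a proxy whose name it does not recognize. A minimal sketch for tallying which names are being rejected, assuming the `PID:YYYYMMDD:HHMMSS message` layout seen in the excerpt above (`unknown_proxies` is a hypothetical helper, not a Zabbix tool):

```python
import re
from collections import Counter

# Assumed line layout, taken from the log excerpt: PID:YYYYMMDD:HHMMSS message
UNKNOWN_PROXY = re.compile(r'^\d+:\d{8}:\d{6} Unknown proxy "([^"]*)"')


def unknown_proxies(log_lines):
    """Count 'Unknown proxy' messages per proxy name."""
    counts = Counter()
    for line in log_lines:
        match = UNKNOWN_PROXY.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    '30208:20090508:064810 Unknown proxy "proxy"',
    '30208:20090508:064811 Unknown proxy "proxy"',
    '30204:20090508:064814 Unknown proxy "proxy"',
]
for name, count in unknown_proxies(sample).items():
    print(f"{name!r} was rejected {count} times; is a proxy with that name defined on the server?")
```

Each name reported this way is worth comparing, character for character, with the proxy name entered in the GUI.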
-
Are you running a distributed setup?
According to your configs, your Zabbix server is 192.168.1.12 and your proxy is 192.168.1.13, correct?
Through the Zabbix GUI, you added the proxy via Configuration --> Hosts, changed the dropdown to "Proxies", and then added the proxy with "create proxy"?
The important thing here is that the name matches what you have set in the Hostname= field of your zabbix_proxy.conf file; in your case, "proxy".
Are those all correct statements?
Last edited by tchjts1; 08-05-2009, 18:12.
-
Yes, you are correct: the hostname for the proxy is "proxy", and in the GUI I created a proxy with the name "proxy". I changed it from zabbix_proxy to make sure the problem wasn't due to the "_" in the name. I am using Zabbix from the Ubuntu packages, but that didn't matter; I had the same issue when I used Zabbix from the main download page. The server is at .12 and the proxy is at .13. I have also tried putting the proxy behind a router to see whether being on a different subnet would make a difference, which it didn't. I can get the server to talk to the agent on the proxy, but it does not collect any data for the host that the proxy is monitoring.
-
If you made any changes to the proxy conf, did you restart the proxy?
What is it showing for "Last seen" for the proxy? You can see that on the page where you would create a new proxy. It will show this info:
Name        Last seen (age)   #   Members
proxyname   30s               0   -
The "Last seen" value will let you know whether there is communication between the proxy and the server.
You didn't mention whether you are using a distributed setup. I ask because I see you have NodeID=1 enabled in your server configuration.
I also see you have "ListenIP" specified. You might try commenting that out. I don't use distributed monitoring, so I am not sure about that requirement, and the setting is for the trapper, but I don't specify an IP to listen on. By default it is commented out and the server listens on all interfaces.
Any conf changes require a restart of that service.
Last edited by tchjts1; 08-05-2009, 19:37.
-
As you suggested, I took out the ListenIP setting and restarted the services, and yes, I do restart the services whenever I make changes. The NodeID change was made to see whether it would make a difference; I have just changed it back to 0. "Last seen" was 2s when I just checked. For a host that is monitored by a proxy, do you enter the IP of the host or of the proxy? I waited a bit, turned off the NIC on the host, and then turned it back on. The issue is that the main server never showed the host as offline. When I checked the logs, they showed this (the first line below is from the server log; the rest is the proxy log):
12667:20090508:110440 Sending configuration data to proxy. Datalen 3976
15045:20090508:105022 Starting zabbix_proxy. ZABBIX 1.6.1.
15045:20090508:105022 **** Enabled features ****
15045:20090508:105022 SNMP monitoring: NO
15045:20090508:105022 WEB monitoring: YES
15045:20090508:105022 ODBC: NO
15045:20090508:105022 IPv6 support: NO
15045:20090508:105022 **************************
15047:20090508:105022 server #2 started [Datasender]
15046:20090508:105022 server #1 started [Configuration syncer]
15048:20090508:105022 server #3 started [Poller. SNMP: NO]
15049:20090508:105022 server #4 started [Poller. SNMP: NO]
15050:20090508:105022 server #5 started [Poller. SNMP: NO]
15051:20090508:105023 server #6 started [Poller. SNMP: NO]
15052:20090508:105023 server #7 started [Poller. SNMP: NO]
15053:20090508:105023 server #8 started [Trapper]
15054:20090508:105023 server #9 started [Trapper]
15055:20090508:105023 server #10 started [Trapper]
15057:20090508:105023 server #11 started [Trapper]
15058:20090508:105023 server #12 started [Trapper]
15060:20090508:105023 server #13 started [ICMP pinger]
15063:20090508:105023 server #14 started [Housekeeper]
15063:20090508:105023 Executing housekeeper
15065:20090508:105023 server #15 started [Poller for unreachable hosts. SNMP: NO]
15067:20090508:105023 server #16 started [HTTP Poller]
15069:20090508:105023 server #17 started [HTTP Poller]
15063:20090508:105023 Deleted 0 records from history [0.017665 seconds]
15070:20090508:105023 server #18 started [HTTP Poller]
15072:20090508:105023 server #19 started [HTTP Poller]
15074:20090508:105023 server #20 started [HTTP Poller]
15076:20090508:105023 server #21 started [Discoverer. SNMP: NO]
15045:20090508:105023 server #0 started [Heartbeat sender]
15065:20090508:105048 Enabling host [famlpt]
15049:20090508:110041 Timeout while answering request
15049:20090508:110041 Get value from agent failed. Error: Cannot connect to [192.168.1.100:10050] [Interrupted system call]
15049:20090508:110041 Host [famlpt]: first network error, wait for 15 seconds
15049:20090508:110041 Parameter [system.cpu.load[,avg1]] will be checked after 20 seconds on host [famlpt]
15050:20090508:110042 Timeout while answering request
15050:20090508:110042 Get value from agent failed. Error: Cannot connect to [192.168.1.100:10050] [Interrupted system call]
15050:20090508:110042 Host [famlpt]: first network error, wait for 15 seconds
15050:20090508:110042 Parameter [system.cpu.load[,avg5]] will be checked after 40 seconds on host [famlpt]
15051:20090508:110043 Timeout while answering request
15051:20090508:110043 Get value from agent failed. Error: Cannot connect to [192.168.1.100:10050] [Interrupted system call]
15051:20090508:110043 Host [famlpt]: first network error, wait for 15 seconds
15051:20090508:110043 Parameter [perf_counter[\System\File Read Bytes/sec]] will be checked after 120 seconds on host [famlpt]
15052:20090508:110044 Timeout while answering request
15052:20090508:110044 Get value from agent failed. Error: Cannot connect to [192.168.1.100:10050] [Interrupted system call]
15052:20090508:110044 Host [famlpt]: first network error, wait for 15 seconds
15052:20090508:110044 Parameter [perf_counter[\System\File Write Bytes/sec]] will be checked after 120 seconds on host [famlpt]
15048:20090508:110045 Timeout while answering request
15048:20090508:110045 Get value from agent failed. Error: Cannot connect to [192.168.1.100:10050] [Interrupted system call]
15048:20090508:110045 Host [famlpt]: first network error, wait for 15 seconds
15048:20090508:110045 Parameter [perf_counter[\System\threads]] will be checked after 120 seconds on host [famlpt]
15065:20090508:110108 Timeout while answering request
15065:20090508:110108 Get value from agent failed. Error: Cannot connect to [192.168.1.100:10050] [Interrupted system call]
15065:20090508:110108 Host [famlpt]: another network error, wait for 15 seconds
15065:20090508:110128 Timeout while answering request
15065:20090508:110128 Get value from agent failed. Error: Cannot connect to [192.168.1.100:10050] [Interrupted system call]
15065:20090508:110128 Host [famlpt]: another network error, wait for 15 seconds
15065:20090508:110146 Get value from agent failed. Error: Cannot connect to [192.168.1.100:10050] [No route to host]
15065:20090508:110146 Host [famlpt] will be checked after 60 seconds
15065:20090508:110249 Get value from agent failed. Error: Cannot connect to [192.168.1.100:10050] [No route to host]
15065:20090508:110349 Enabling host [famlpt]
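A log like the one above can be condensed by grouping the connection failures by target address and reason. The following is a rough sketch under the assumption that failures always appear as `Cannot connect to [address:port] [reason]`, as they do in this excerpt; `connect_errors` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed message shape, copied from the proxy log excerpt.
CONNECT_ERROR = re.compile(r"Cannot connect to \[([^\]]+)\] \[([^\]]+)\]")


def connect_errors(log_lines):
    """Count connection failures per (address, reason) pair."""
    counts = Counter()
    for line in log_lines:
        match = CONNECT_ERROR.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


sample = [
    "15049:20090508:110041 Get value from agent failed. Error: Cannot connect to [192.168.1.100:10050] [Interrupted system call]",
    "15065:20090508:110146 Get value from agent failed. Error: Cannot connect to [192.168.1.100:10050] [No route to host]",
    "15065:20090508:110249 Get value from agent failed. Error: Cannot connect to [192.168.1.100:10050] [No route to host]",
    "15065:20090508:110349 Enabling host [famlpt]",
]
for (address, reason), count in connect_errors(sample).most_common():
    print(f"{address}: {reason} x{count}")
```

In this excerpt every failure targets 192.168.1.100:10050, which suggests the proxy polls the monitored machine at its own address, and that the proxy itself did notice the outage.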
-
In the GUI on the server, do you enter the IP of the host or of the proxy when adding a host that is monitored by a proxy?
Here is the agent log from the host:
5760:20090507:111122 zabbix_agentd started. ZABBIX 1.4.4.
5716:20090507:111122 zabbix_agentd listener started
5604:20090507:111122 zabbix_agentd active check started [192.168.1.13:10051]
3648:20090507:111122 zabbix_agentd collector started
5604:20090507:112522 Getting list of active checks failed. Will retry after 60 seconds
5604:20090507:112622 Getting list of active checks failed. Will retry after 60 seconds
5604:20090507:112722 Getting list of active checks failed. Will retry after 60 seconds
5604:20090507:112822 Getting list of active checks failed. Will retry after 60 seconds
5604:20090507:112922 Getting list of active checks failed. Will retry after 60 seconds
5604:20090507:113022 Getting list of active checks failed. Will retry after 60 seconds
5604:20090507:113122 Getting list of active checks failed. Will retry after 60 seconds
5604:20090507:113222 Getting list of active checks failed. Will retry after 60 seconds
5604:20090507:113322 Getting list of active checks failed. Will retry after 60 seconds
5604:20090507:113422 Getting list of active checks failed. Will retry after 60 seconds
5604:20090507:113522 Getting list of active checks failed. Will retry after 60 seconds
5604:20090507:113622 Getting list of active checks failed. Will retry after 60 seconds
5604:20090507:113722 Getting list of active checks failed. Will retry after 60 seconds
5604:20090508:013130 Getting list of active checks failed. Will retry after 60 seconds
5604:20090508:082911 Getting list of active checks failed. Will retry after 60 seconds
5604:20090508:104825 Getting list of active checks failed. Will retry after 60 seconds
5604:20090508:110125 Getting list of active checks failed. Will retry after 60 seconds
5604:20090508:110225 Getting list of active checks failed. Will retry after 60 seconds
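The agent log shows active checks being requested from 192.168.1.13:10051 and failing every time. One simple thing to rule out is basic TCP reachability of that port from the monitored host. A minimal sketch using Python's standard `socket` module; `can_connect` is a hypothetical helper, and a successful connection only shows that something is listening, not that the Zabbix protocol exchange works.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call, using the proxy address and port from the agent log:
# print(can_connect("192.168.1.13", 10051))
```

If this returns False from the monitored host, the problem is more likely on the network or listener side than in the host definition on the server.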
