Help Needed with Active Checks

  • heinz
    Junior Member
    • Jan 2014
    • 2

    #1

    Here is the scenario: I have a Zabbix server in the cloud, along with many application servers that I plan to monitor with it.

    My hosts can always reach the Zabbix server, but the server can't always reach them, which makes active checks a good fit for my environment.

    Environment Details:
    AWS (multiple locations)

    Zabbix server: 50.50.50.50:10051 (public IP and Port)
    www1 server 110.10.10.10 (hostname = www1)
    www1 can telnet to 50.50.50.50:10051 without a problem
    Zabbix server can not reach 110.10.10.10

    Zabbix agent config on www1:
    ServerActive=50.50.50.50:10051
    Hostname=www1
    RefreshActiveChecks=60
    (Hostname is hard-set for now. Previously I left it unset and used the HostnameItem=system.hostname option, since the box is configured as www1.)
    I also have debug set to the highest level. The logs on the agent host simply say:
    8121:20140118:223810.416 active checks #1 [getting list of active checks]
    8121:20140118:223810.416 In refresh_active_checks() host:'50.50.50.50' port:10051
    8121:20140118:223810.418 sending [{
    "request":"active checks",
    "host":"www1"}]
    8121:20140118:223810.418 before read
    8121:20140118:223810.421 got [{
    "response":"success",
    "data":[]}]
    8121:20140118:223810.421 In parse_list_of_checks()
    8121:20140118:223810.421 In disable_all_metrics()
    8121:20140118:223810.421 End of refresh_active_checks():SUCCEED
    8121:20140118:223810.421 active checks #1 [processing active checks]
    8121:20140118:223810.421 In process_active_checks('54.193.85.139',10051)
    8121:20140118:223810.421 End of process_active_checks()
    8121:20140118:223810.421 In get_min_nextcheck()
    8121:20140118:223810.422 active checks #1 [idle 1 sec]
    8120:20140118:223810.721 collector [processing data]
    8120:20140118:223810.725 In update_cpustats()
    8120:20140118:223810.725 End of update_cpustats()
    8120:20140118:223810.725 collector [idle 1 sec]
    8121:20140118:223811.422 In send_buffer() host:'50.50.50.50' port:10051 values:0/100
    This goes on for pages once the service is started.
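The exchange in the agent log above can be reproduced offline to see what the server's reply means. The following is a minimal sketch, not the agent's actual implementation. It assumes the classic Zabbix framing (a `ZBXD` signature, a flags byte of 1, then an 8-byte little-endian payload length), and the function names are made up for illustration. It also assumes, as I understand the protocol, that an unknown host gets a `"failed"` response with an `"info"` field, so a `"success"` reply with an empty `"data"` list suggests the host name matched but no active items are assigned to it.

```python
import json
import struct

ZBX_SIGNATURE = b"ZBXD"  # protocol signature (assumed classic framing)
ZBX_FLAGS = 0x01         # plain protocol, no compression
HEADER_FORMAT = "<4sBQ"  # signature, flags, 8-byte little-endian length
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 13 bytes


def build_active_checks_request(hostname):
    """Frame the same JSON request that the agent log shows after 'sending'."""
    payload = json.dumps({"request": "active checks", "host": hostname}).encode("utf-8")
    return struct.pack(HEADER_FORMAT, ZBX_SIGNATURE, ZBX_FLAGS, len(payload)) + payload


def parse_response(frame):
    """Strip the header and decode the JSON body of a reply."""
    signature, _flags, length = struct.unpack(HEADER_FORMAT, frame[:HEADER_SIZE])
    if signature != ZBX_SIGNATURE:
        raise ValueError("not a Zabbix protocol frame")
    return json.loads(frame[HEADER_SIZE:HEADER_SIZE + length].decode("utf-8"))


def describe(response):
    """Turn a decoded reply into a human-readable diagnosis."""
    if response.get("response") != "success":
        return "server rejected the request: " + str(response.get("info", "no reason given"))
    items = response.get("data", [])
    if not items:
        return "host is known, but no active items are assigned to it"
    return "server assigned %d active item(s)" % len(items)


# The reply body copied from the agent log above, re-framed for the demo.
reply_body = b'{"response":"success","data":[]}'
reply = struct.pack(HEADER_FORMAT, ZBX_SIGNATURE, ZBX_FLAGS, len(reply_body)) + reply_body
print(describe(parse_response(reply)))
```

If that reading is right, the agent side is behaving correctly and the place to look is the item configuration for www1 on the server.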


    So now my host is ready to send to the Zabbix server. I go to the Zabbix server, open Configuration, and add a new host.

    I use the same string for the host name, 'www1'. As per some documents, I set the IP of the agent to 0.0.0.0 and set the host to monitored. This ends up with the Zabbix server reporting its own local monitoring agent's data as that host.

    If I set the IP to that of the server (www1), or to the address it will appear to come from, nothing happens. The logs on the Zabbix server say this:


    1366:20140118:223849.470 Zabbix agent item "net.if.in[eth0]" on host "www1" failed: first network error, wait for 15 seconds
    1368:20140118:223904.252 Zabbix agent item "net.if.in[eth0]" on host "www1" failed: another network error, wait for 15 seconds
    1368:20140118:223919.259 Zabbix agent item "vfs.fs.size[/,pfree]" on host "www1" failed: another network error, wait for 15 seconds
    1368:20140118:223934.267 temporarily disabling Zabbix agent checks on host "www1": host unavailable
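Messages of the form `Zabbix agent item "..." on host "..." failed: ... network error` are written when the server itself tries to connect to the agent, which suggests these items are configured as passive "Zabbix agent" items rather than "Zabbix agent (active)" items. A small sketch for listing which item keys the server is trying to poll this way is shown below. The regular expression is written against the log lines quoted above only, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   1366:20140118:223849.470 Zabbix agent item "net.if.in[eth0]" on host "www1" failed: ...
FAILED_ITEM = re.compile(r'Zabbix agent item "(?P<key>[^"]+)" on host "(?P<host>[^"]+)" failed')


def passive_failures(log_lines):
    """Group the item keys that failed server-side polling by host name."""
    failures = defaultdict(set)
    for line in log_lines:
        match = FAILED_ITEM.search(line)
        if match:
            failures[match.group("host")].add(match.group("key"))
    return {host: sorted(keys) for host, keys in failures.items()}


server_log = [
    '1366:20140118:223849.470 Zabbix agent item "net.if.in[eth0]" on host "www1" failed: first network error, wait for 15 seconds',
    '1368:20140118:223904.252 Zabbix agent item "net.if.in[eth0]" on host "www1" failed: another network error, wait for 15 seconds',
    '1368:20140118:223919.259 Zabbix agent item "vfs.fs.size[/,pfree]" on host "www1" failed: another network error, wait for 15 seconds',
    '1368:20140118:223934.267 temporarily disabling Zabbix agent checks on host "www1": host unavailable',
]
print(passive_failures(server_log))
```

Any key that shows up here is one the server expects to fetch by connecting to the agent, so it would not be included in the list the agent requests for active checks.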

    My assumption of the behavior is:
    You tell the Zabbix server which hosts to expect: "I have hosts named www1, www2, and www3; listen for them and match exactly on name. When you find them, apply the Linux OS template to them and monitor RAM, swap, CPU, I/O wait, etc." (All of this works fine with the local Zabbix agent on the Zabbix server.)

    You then tell the Zabbix agent on one of the www# boxes to start sending active updates to the server. It contacts the server and says, "yo, I am here, gimme the payload of what you want me to monitor." The next time it phones home (every 60 seconds, based on my config), it sends the values the server wants to monitor.

    I would like to get this working manually so that I can then use the Zabbix API to do it automatically whenever I provision or destroy a node. I already have a script that interacts with EC2 and my deployment automation system to create a VM and provision it to working order, so if this works it should be an easy addition.
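For the automation step, the host could be registered through the Zabbix JSON-RPC API with a `host.create` call. The sketch below only builds the request body and does not send it. The group ID, template ID, and auth token are placeholders that would have to come from your own installation (the token from a prior `user.login` call), and the helper name is made up for illustration. The template it links should contain items of type "Zabbix agent (active)" if the server cannot reach the host.

```python
import json


def host_create_request(hostname, ip, group_id, template_id, auth_token, request_id=1):
    """Build a JSON-RPC body for host.create (a sketch; all IDs are placeholders)."""
    return {
        "jsonrpc": "2.0",
        "method": "host.create",
        "params": {
            # Must match the agent's Hostname= setting exactly.
            "host": hostname,
            "interfaces": [
                {
                    "type": 1,        # 1 = agent interface
                    "main": 1,        # default interface of this type
                    "useip": 1,       # connect by IP rather than DNS name
                    "ip": ip,         # only used for server-initiated (passive) checks
                    "dns": "",
                    "port": "10050",  # default agent listen port
                }
            ],
            "groups": [{"groupid": group_id}],
            "templates": [{"templateid": template_id}],
        },
        "auth": auth_token,
        "id": request_id,
    }


# Placeholder values for illustration only.
request = host_create_request(
    hostname="www1",
    ip="110.10.10.10",
    group_id="2",
    template_id="10001",
    auth_token="<token from user.login>",
)
print(json.dumps(request, indent=2))
```

The resulting body would be POSTed to the frontend's `api_jsonrpc.php` endpoint by the provisioning script, with a matching `host.delete` call when the node is destroyed.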
  • EnigmA-X
    Senior Member
    Zabbix Certified Specialist
    • Oct 2010
    • 116

    #2
    First of all, you should configure your clients properly on the server. An IP of 0.0.0.0 is bogus.

    Also, the Zabbix server should be able to reach your clients. For example, when you change one of your active checks, the server updates it on the client.

    Active checks are not meant to work around unreachable hosts, but to offload work to the clients and reduce connectivity.

    Then, what version of Zabbix are you using? What does the configuration file look like on your client? From your log file, it looks like your active checks are being sent to the client itself instead of the server.

    • heinz
      Junior Member
      • Jan 2014
      • 2

      #3
      Unfortunately, due to the nature of the cloud environments we are using, not all the hosts have a public IP address, nor should they be reachable from the Zabbix host (due to where it lives), but I would still like them to feed stats out. I guess that from reading posts and some online documents I built up the wrong impression of what active checks can do for me.

      I am using version 2.2.1.

      I can list my configs later; I have since turned off the hosts and moved on to a different problem for the time being.

      Thank you for your reply, I greatly appreciate the feedback.
