Monitoring with proxy broke

  • nocturn
    Junior Member
    • Nov 2008
    • 16

    #1

    Monitoring with proxy broke

    I'm monitoring remote systems via a Zabbix proxy. This proxy runs on one remote machine that also has its own server.

    Since I upgraded both the server and that system to 1.8.3, the server lists the machine as unreachable despite the proxy being alive in the Administration > DM pages.

    If I change the setup and monitor the machine directly (without proxy), everything works OK (though I'd rather not do that).

    I did not change the firewall and the agent is running on the remote machine.
    What else can I do or check?
  • tchjts1
    Senior Member
    • May 2008
    • 1605

    #2
    So, even with the host showing as unreachable (Agent function) are you still receiving data via the proxy for servers assigned to it (Proxy function)?

    If that's the case, focus on the agent. My routine in this case is to check the agentd and server logs. If nothing is evident there, try the simplest troubleshooting methods first: restarting the agent, then disabling and re-enabling the host in the frontend.

    If that doesn't work, I delete the host from the frontend and re-enter it again. You lose your historical data for that host though, so caution there.

    If that doesn't work, I re-install the agent on the host.

    If you are not receiving any data via the proxy, I would apply the same basic troubleshooting, but also include steps for re-starting MySQL on the proxy.

    I have some hosts that simply show unreachable no matter what I do, but they still pass data, so I have learned to tolerate it.


    • nocturn
      Junior Member
      • Nov 2008
      • 16

      #3
      Strange thing

      I'm trying to remove and add it back now.
      The strange thing is that under Configuration > Hosts, it showed up with a highlighted Z from Agent monitoring (while it's grayed out initially after re-adding)...

      Also, the agent and proxy logs show normal entries as if it's being monitored...


      • tchjts1
        Senior Member
        • May 2008
        • 1605

        #4
        Originally posted by nocturn
        it showed up with a highlighted Z from Agent monitoring (while it's grayed out initially after re-adding)...
        That's normal while it tries to determine whether it is reachable or not.


        • nocturn
          Junior Member
          • Nov 2008
          • 16

          #5
          Hmm

          Strange, the other hosts behind the proxy are not using the agent but just port checks.

          These checks all show OK, but have not been updated for a long time...
          So I guess it really is the proxy.


          • tchjts1
            Senior Member
            • May 2008
            • 1605

            #6
            Personally, I would re-start the proxy as well as MySQL on the proxy box, then look at the zabbix_proxy.log.


            • nocturn
              Junior Member
              • Nov 2008
              • 16

              #7
              It shows the Z again, but no data is coming in...
              I already rebooted both the Zabbix server and the proxy server without result...


              • tchjts1
                Senior Member
                • May 2008
                • 1605

                #8
                Time to start looking at logs.

                On the Zabbix proxy: zabbix_proxy.log
                On the Zabbix server: zabbix_server.log


                • nocturn
                  Junior Member
                  • Nov 2008
                  • 16

                  #9
                  Originally posted by tchjts1
                  Time to start looking at logs.

                  On the Zabbix proxy: zabbix_proxy.log
                  On the Zabbix server: zabbix_server.log
                  I'm tailing them, but the only interesting messages are this one in the server log:
                  4066:20100830:183242.640 Sending configuration data to proxy 'zproxy'. Datalen 10412

                  and this one:
                  3029:20100830:192340.045 Sending list of active checks to [x.x.x.x] failed: host [chaos] not found

                  I'm not using any active checks.

                  Is there anything specific I can look for?
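
One rough way to cut a busy log down is to keep only the lines whose message contains a word that usually signals trouble. The sketch below assumes the "PID:YYYYMMDD:HHMMSS.mmm message" line layout seen in the two entries above. The function name scan_zabbix_log and the keyword list are made up for illustration; this is a guess at useful words, not an official Zabbix list.

```shell
# Print Zabbix log lines whose message part contains a likely problem keyword.
# Reads the files given as arguments, or stdin when none are given.
# The keyword list is an assumption, not an official Zabbix list.
scan_zabbix_log() {
    grep -Ei '^[[:space:]]*[0-9]+:[0-9]{8}:[0-9]{6}\.[0-9]{3} .*(fail|error|cannot|not found|timed out|refused)' "$@"
}
```

Run against the two entries quoted in this post, it prints only the "host [chaos] not found" line, so expect it to surface harmless noise as well as real problems.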


                  • nocturn
                    Junior Member
                    • Nov 2008
                    • 16

                    #10
                    Strange

                    In Administration > Queue, it shows the details for the proxy. The simple checks to the other hosts are delayed by 6 months...


                    • tchjts1
                      Senior Member
                      • May 2008
                      • 1605

                      #11
                      Don't worry about the "... host not found" message if you are not using active checks. I wish Zabbix would do away with that message when using passive checks.

                      Just FYI, you can get rid of that message by adding the line DisableActive=1 to your zabbix_agentd.conf file and restarting the agent(s). But that is not the issue.
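
A minimal sketch of that change as a shell function, assuming you pass in the path to your own zabbix_agentd.conf (the location varies by install). The function name disable_active_checks is hypothetical, and restarting the agent afterwards is left out because the command depends on how the agent was installed.

```shell
# Append "DisableActive=1" to an agent config file, unless an uncommented
# DisableActive line is already present (then leave the file alone).
# The config path is supplied by the caller; no default location is assumed.
disable_active_checks() {
    conf=$1
    if grep -q '^[[:space:]]*DisableActive=' "$conf"; then
        echo "DisableActive is already set in $conf; edit it by hand" >&2
        return 1
    fi
    # If the last line lacks a trailing newline, add one first so the new
    # setting does not get glued onto the previous line.
    if [ -n "$(tail -c 1 "$conf")" ]; then
        printf '\n' >> "$conf"
    fi
    printf 'DisableActive=1\n' >> "$conf"
}
```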

                      Anyway... stepping back to the start -

                      1. You upgraded Zabbix server from 1.8.2 to 1.8.3
                      2. You upgraded Zabbix proxy from 1.8.2 to 1.8.3
                      3. Doing a direct connect from host to Zabbix server works
                      4. Doing a connect from host --> Proxy --> Zabbix server no longer works for any of your monitored hosts that route via the proxy.
                      5. No errors are indicated in zabbix_proxy or zabbix_server logs.
                      6. Zabbix GUI indicates "Proxy last seen" in a reasonable amount of time?
                      7. You can ping Zabbix server from Zabbix proxy?
                      8. ps -ef|grep zabbix on the proxy shows the proxy process running multiple instances?

                      My next step would be to go into /etc/zabbix/zabbix_proxy.conf and temporarily raise DebugLevel from 3 to 4. Stop the zabbix_proxy process and mv the existing zabbix_proxy log out of the way so you get a fresh one at the restart. Restart the zabbix_proxy process and check the log immediately afterwards. Don't tail it, cat it: you want to see the start of the log, the first 20 lines or so.

                      While you are in the zabbix_proxy.conf file, just go through it to make sure everything looks correct.
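
The debug routine above can be sketched as a few shell helpers. This is only a sketch under assumptions: the helper names are made up for illustration, the config and log paths are passed in by the caller (the log lives wherever LogFile in the proxy config points), and stopping and starting zabbix_proxy is left as a placeholder because it depends on the install.

```shell
# Set DebugLevel in a Zabbix config file. Rewrites an existing uncommented
# DebugLevel line, or appends one if the file has none.
set_debug_level() {
    conf=$1
    level=$2
    tmp=$(mktemp) || return 1
    if grep -q '^[[:space:]]*DebugLevel=' "$conf"; then
        sed "s/^[[:space:]]*DebugLevel=.*/DebugLevel=$level/" "$conf" > "$tmp"
    else
        # "awk 1" copies the file and guarantees a final newline.
        awk 1 "$conf" > "$tmp"
        printf 'DebugLevel=%s\n' "$level" >> "$tmp"
    fi
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Move the current log aside so the next start writes a fresh file.
# Prints the new name of the old log; does nothing if there is no log yet.
rotate_log() {
    log=$1
    [ -f "$log" ] || return 0
    old="$log.$(date +%Y%m%d%H%M%S)"
    mv "$log" "$old"
    printf '%s\n' "$old"
}

# Show the start of the fresh log (first 20 lines unless told otherwise).
log_head() {
    head -n "${2:-20}" "$1"
}

# Intended order, with placeholders for the install-specific parts:
#   set_debug_level /etc/zabbix/zabbix_proxy.conf 4
#   ... stop zabbix_proxy ...
#   rotate_log "$PROXY_LOG"     # PROXY_LOG: whatever LogFile points to
#   ... start zabbix_proxy ...
#   log_head "$PROXY_LOG"
#   set_debug_level /etc/zabbix/zabbix_proxy.conf 3   # once you are done
```

Setting the level back to 3 afterwards matters because level 4 logs every function entry and exit and grows the log very quickly.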


                      • nocturn
                        Junior Member
                        • Nov 2008
                        • 16

                        #12
                        Originally posted by tchjts1
                        6. Zabbix GUI indicates "Proxy last seen" in a reasonable amount of time?
                        7. You can ping Zabbix server from Zabbix proxy?
                        8. ps -ef|grep zabbix on the proxy shows the proxy process running multiple instances?
                        Yes to all, I will try the debug step


                        • nocturn
                          Junior Member
                          • Nov 2008
                          • 16

                          #13
                          the log

                          Code:
                            3756:20100830:200206.686 Starting Zabbix Proxy. Zabbix 1.8.3 (revision 13928).
                            3756:20100830:200206.686 **** Enabled features ****
                            3756:20100830:200206.686 SNMP monitoring:        NO
                            3756:20100830:200206.686 IPMI monitoring:        NO
                            3756:20100830:200206.686 WEB monitoring:        YES
                            3756:20100830:200206.686 ODBC:                   NO
                            3756:20100830:200206.686 SSH2 support:           NO
                            3756:20100830:200206.686 IPv6 support:           NO
                            3756:20100830:200206.686 **************************
                            3756:20100830:200206.686 In init_database_cache()
                            3756:20100830:200206.686 In zbx_mem_required_size(): size[8421024] chunks_num[4] descr[history cache] param[HistoryCacheSize]
                            3756:20100830:200206.687 End of zbx_mem_required_size(): size[8421519]
                            3756:20100830:200206.687 In zbx_mem_create(): descr[history cache] param[HistoryCacheSize] size[8421519]
                            3756:20100830:200206.687 valid user addresses: [0x7f56c55bf160, 0x7f56c5dc7088) total size: [8421160]
                            3756:20100830:200206.687 End of zbx_mem_create()
                            3756:20100830:200206.687 In zbx_mem_malloc(): size[160]
                            3756:20100830:200206.687 End of zbx_mem_malloc()
                            3756:20100830:200206.687 In zbx_mem_malloc(): size[8388608]
                            3756:20100830:200206.687 End of zbx_mem_malloc()
                            3756:20100830:200206.687 In zbx_mem_malloc(): size[32000]
                            3756:20100830:200206.687 End of zbx_mem_malloc()
                            3756:20100830:200206.687 In zbx_mem_malloc(): size[256]
                            3756:20100830:200206.687 End of zbx_mem_malloc()
                            3756:20100830:200206.687 In zbx_mem_required_size(): size[16777216] chunks_num[1] descr[history text cache] param[HistoryTextCacheSize]
                            3756:20100830:200206.687 End of zbx_mem_required_size(): size[16777627]
                            3756:20100830:200206.687 In zbx_mem_create(): descr[history text cache] param[HistoryTextCacheSize] size[16777627]
                            3756:20100830:200206.687 valid user addresses: [0x7f56c45be168, 0x7f56c55be190) total size: [16777256]
                            3756:20100830:200206.687 End of zbx_mem_create()
                            3756:20100830:200206.687 In zbx_mem_malloc(): size[16777216]
                            3756:20100830:200206.687 End of zbx_mem_malloc()
                            3756:20100830:200206.687 In zbx_mem_required_size(): size[4194304] chunks_num[1] descr[trend cache] param[TrendCacheSize]
                            3756:20100830:200206.687 End of zbx_mem_required_size(): size[4194702]
                            3756:20100830:200206.687 In zbx_mem_create(): descr[trend cache] param[TrendCacheSize] size[4194702]
                            3756:20100830:200206.687 valid user addresses: [0x7f56c41bd158, 0x7f56c45bd188) total size: [4194352]
                            3756:20100830:200206.687 End of zbx_mem_create()
                            3756:20100830:200206.687 In zbx_hashset_create()
                            3756:20100830:200206.687 In zbx_mem_malloc(): size[8072]
                            3756:20100830:200206.687 End of zbx_mem_malloc()
                            3756:20100830:200206.687 End of zbx_hashset_create()
                            3756:20100830:200206.687 End of init_database_cache()
                            3756:20100830:200206.687 In init_configuration_cache() size:8388608
                            3756:20100830:200206.687 In zbx_mem_create(): descr[configuration cache] param[CacheSize] size[7130317]
                            3756:20100830:200206.687 valid user addresses: [0x7f56c3af0160, 0x7f56c41bccc8) total size: [7129960]
                            3756:20100830:200206.687 End of zbx_mem_create()
                            3756:20100830:200206.688 In zbx_mem_malloc(): size[1120]
                            3756:20100830:200206.688 End of zbx_mem_malloc()
                            3756:20100830:200206.688 In zbx_hashset_create()
                            3756:20100830:200206.688 In zbx_mem_malloc(): size[8072]


                          • nocturn
                            Junior Member
                            • Nov 2008
                            • 16

                            #14
                            Checks

                            I actually see checks it got from the server like this:

                            Code:
                              3756:20100830:200206.931 In zbx_strpool_intern() str[proc.num[zarafa-gateway]]


                            • tchjts1
                              Senior Member
                              • May 2008
                              • 1605

                              #15
                              Yeah, I am a bit lost on what advice to give you next. I haven't played with 1.8.3 too much yet, but I do know they introduced a new "passive" mode for the proxy. Take a look at Administration --> DM; the second column over for the proxy is "Mode". Mine is set to "Active" on my 1.8.3 setup and all is working fine.

                              Outside of what I have listed in the previous steps, I don't know what else to tell you on this. Maybe someone else has a thought or maybe a dev can chime in with some steps to take.
