Problem with proxy to server communication

  • yulian
    Junior Member
    • Jan 2020
    • 3

    #1

    Problem with proxy to server communication

    Hi all,
    my name is Stefan and I'm running a large Zabbix 3.4 installation. I have one server for the PostgreSQL database, one for the Zabbix server and web frontend, and 9 proxies.
    Each proxy is behind a firewall and communicates with the server through it in one direction only (proxy -> server on port 10051).
    Each proxy serves 10-100 clients. Everything is up and running.

    But in zabbix_proxy.log I see lines like these:

    31274:20200124:144613.134 Unable to connect to the server [10.10.10.2]:10051 [cannot connect to [[10.10.10.2]:10051]: [110] Connection timed out]. Will retry every 5 second(s)
    31274:20200124:144613.138 Connection restored.


    This happens all 3 minutes. But the proxy works: it receives configuration data from the server and collects data over SNMPv3 from switches.
    Can anyone tell me what's wrong? The log is full of these lines on every one of my proxies.

    Thank you.
  • Markku
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
    • Sep 2018
    • 1781

    #2
    Hi Stefan, can you please rephrase this:

    This happens all 3 minutes.
    Markku


    • yulian
      Junior Member
      • Jan 2020
      • 3

      #3
      Hi Markku,
      thanks for your reply, and sorry for my terrible English.

      What I mean is that such a log entry appears in the log file every 3 minutes.


      • Markku
        Senior Member
        Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
        • Sep 2018
        • 1781

        #4
        Hi, no problem, thanks for the clarification. Based on the available data it looks like a network issue on the Zabbix server side, or a problem in the Zabbix server itself (because all proxies have the same problem connecting to the server). Have you checked that there are no load-related issues on the server?

        Are the problems occurring at exactly the same timestamp on each proxy? (NTP time syncing is a must in monitoring systems anyway, so it's good to check those settings as well.)
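One way to answer the timestamp question without reading the logs by eye is a small script. The following is only a sketch: it assumes the log prefix has the `PID:YYYYMMDD:HHMMSS.mmm` shape seen in the lines quoted in this thread, and the function names `failure_times` and `shared_failures` as well as the one-second tolerance are made up for illustration.

```python
import re
from datetime import datetime

# Log prefix as seen in this thread: PID:YYYYMMDD:HHMMSS.mmm message
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6}\.\d{3}) (.*)$")


def failure_times(log_lines):
    """Return the timestamps of all 'Unable to connect to the server' entries."""
    times = []
    for line in log_lines:
        m = LINE_RE.match(line)
        if m and m.group(4).startswith("Unable to connect to the server"):
            stamp = m.group(2) + m.group(3)
            times.append(datetime.strptime(stamp, "%Y%m%d%H%M%S.%f"))
    return times


def shared_failures(times_a, times_b, tolerance_s=1.0):
    """Pair up failures from two proxies that lie at most tolerance_s apart."""
    return [
        (a, b)
        for a in times_a
        for b in times_b
        if abs((a - b).total_seconds()) <= tolerance_s
    ]
```

Feed it the lines of zabbix_proxy.log from two proxies (for example via `open(path).readlines()`). If most failures pair up, that points toward the server side or a shared network path rather than the individual proxies.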

        If needed, you can run tcpdump on the server side to capture and inspect the actual packets received from the proxies. That would be something like "sudo tcpdump -w proxycapture.pcap host <proxy-ip> and port 10051" (for one proxy). You can also do the same on the proxy ("host <zabbix-server-ip> and port 10051") at the same time if needed. You can then open the pcap files with Wireshark and check what happened at the packet level.

        Also, maybe you could run some other traffic like ping at the same time to see if the pings fail at the same moments as the proxy traffic.
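As a variation on the ping idea, a plain TCP connect to the server port exercises the same firewall rule as the proxy traffic (ping uses ICMP, which the firewall may treat differently). This is a minimal sketch, not a Zabbix tool: `probe` and `watch` are hypothetical helper names, and the timeout, interval and count values are arbitrary assumptions.

```python
import socket
import time


def probe(host, port, timeout_s=3.0):
    """Try one TCP connect; return (ok, elapsed_seconds, error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)


def watch(host, port, interval_s=5.0, count=10):
    """Probe repeatedly and print one timestamped line per attempt."""
    for _ in range(count):
        ok, elapsed, err = probe(host, port)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        status = "ok" if ok else "FAIL"
        print(f"{stamp} {status} {elapsed * 1000:.1f} ms {err}")
        time.sleep(interval_s)


# Example, run from a proxy host (address taken from the log lines above):
# watch("10.10.10.2", 10051, interval_s=5.0, count=100)
```

If the probe fails at the same moments as the entries in zabbix_proxy.log, the problem is likely below Zabbix (network, firewall, or the server's TCP stack). If the probe never fails, the Zabbix server's connection handling becomes the more interesting suspect.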

        Markku


        • Markku
          Senior Member
          Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
          • Sep 2018
          • 1781

          #5
          FYI, if the proxy cannot send the gathered and timestamped data to the server, it always buffers it, by default for at most one hour. That is why you don't see any gaps in the graphs in the Zabbix frontend later. You can see the timer settings in https://www.zabbix.com/documentation...g/zabbix_proxy .

          Markku


          • yulian
            Junior Member
            • Jan 2020
            • 3

            #6
            Hi Markku,

            on all proxies NTP is running and working. By the way, all proxies are also NTP servers for their subnets, and their time is checked with Zabbix against the Zabbix server time. So I think incorrect time is not the problem.
            The proxy cache mechanism is well known to me, but thanks for the clarification.

            Is it possible that a DataSenderFrequency of 5 seconds is too low?
            What I noticed is that the timestamps of the connection timeout and the connection restored entries barely differ (only by milliseconds).
            I expected 5 seconds of difference because of this part of the entry: Will retry every 5 second(s)
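To check whether the gap is only milliseconds on every occurrence, and not just in the two lines quoted in this thread, the log can be scanned for each failure and the next restore. This is a rough sketch under the assumption that the log prefix always looks like `PID:YYYYMMDD:HHMMSS.mmm`; the function name `outage_gaps` is invented for this example.

```python
import re
from datetime import datetime

# Log prefix as seen in this thread: PID:YYYYMMDD:HHMMSS.mmm message
LINE_RE = re.compile(r"^\s*\d+:(\d{8}):(\d{6}\.\d{3}) (.*)$")


def outage_gaps(log_lines):
    """Seconds between each 'Unable to connect' entry and the next 'Connection restored'."""
    gaps = []
    failed_at = None
    for line in log_lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        when = datetime.strptime(m.group(1) + m.group(2), "%Y%m%d%H%M%S.%f")
        message = m.group(3)
        if message.startswith("Unable to connect to the server"):
            failed_at = when
        elif message.startswith("Connection restored") and failed_at is not None:
            gaps.append((when - failed_at).total_seconds())
            failed_at = None
    return gaps
```

For the two lines from the first post this yields a single gap of about 0.004 seconds. Running it over a whole day of zabbix_proxy.log shows whether that is the rule or the exception.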


            • Markku
              Senior Member
              Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
              • Sep 2018
              • 1781

              #7
              What I noticed is that the timestamps of the connection timeout and the connection restored entries barely differ (only by milliseconds).
              I don't know what to make of that. Maybe someone else can chime in?

              But anyway, you didn't answer if the problems occur at the same time on each proxy.

              Have you checked that there are no load-based issues in the server?

              If you are still unsure whether the problem is in the Zabbix server connection processing or in the network in general, maybe you could run some other traffic like ping at the same time to see if the pings fail at the same moments as the proxy traffic. Also, the traffic capture will give you more information about the problem.

              If you have colleagues who can look at this from the firewall and/or network point of view and tell you what could be going wrong every three minutes, that would be useful for you.

              Markku


              • LenR
                Senior Member
                • Sep 2009
                • 1005

                #8
                We had some problems in the mid 3.4.x versions with poor performance when writing data to the Zabbix database. It may have been this issue: https://support.zabbix.com/browse/ZBX-13343
