Flapping connection to Zabbix

  • WeSTMan
    Junior Member
    • Nov 2019
    • 20

    #1

    Flapping connection to Zabbix

    The Zabbix log keeps filling up with entries like these:
    Code:
    112056:20191217:041552.172 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191217:041552.172 active check data upload to [10.190.16.15:10051] is working again
    112056:20191218:011808.180 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191218:011908.022 active check configuration update from [10.190.16.15:10051] is working again
    112056:20191218:024520.093 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191218:024520.108 active check data upload to [10.190.16.15:10051] is working again
    112056:20191218:221126.731 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191218:221126.746 active check data upload to [10.190.16.15:10051] is working again
    112056:20191219:021525.287 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191219:021625.098 active check configuration update from [10.190.16.15:10051] is working again
    112056:20191219:040134.667 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191219:040134.683 active check data upload to [10.190.16.15:10051] is working again
    112056:20191220:041842.103 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191220:041942.959 active check configuration update from [10.190.16.15:10051] is working again
    112056:20191221:015015.671 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191221:015015.687 active check data upload to [10.190.16.15:10051] is working again
    112056:20191221:023250.905 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191221:023250.920 active check data upload to [10.190.16.15:10051] is working again
    112056:20191223:035814.541 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191223:035914.445 active check configuration update from [10.190.16.15:10051] is working again
    112056:20191223:175525.057 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191223:175625.866 active check configuration update from [10.190.16.15:10051] is working again
    112056:20191226:013445.286 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191226:013445.301 active check data upload to [10.190.16.15:10051] is working again
    112056:20191226:015259.470 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191226:015359.311 active check configuration update from [10.190.16.15:10051] is working again
    112056:20191226:041203.551 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191226:041303.330 active check configuration update from [10.190.16.15:10051] is working again
    112056:20191227:214316.207 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191227:214316.223 active check data upload to [10.190.16.15:10051] is working again
    112056:20191228:011723.556 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191228:011823.413 active check configuration update from [10.190.16.15:10051] is working again
    112056:20191228:022729.862 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191228:022729.877 active check data upload to [10.190.16.15:10051] is working again
    112056:20191228:023622.352 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191228:023622.368 active check data upload to [10.190.16.15:10051] is working again
    112056:20191228:060429.705 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191228:060529.531 active check configuration update from [10.190.16.15:10051] is working again
    112056:20191231:030811.695 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20191231:030911.817 active check configuration update from [10.190.16.15:10051] is working again
    112056:20200101:023129.744 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20200101:023129.759 active check data upload to [10.190.16.15:10051] is working again
    112056:20200104:015521.247 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20200104:015521.263 active check data upload to [10.190.16.15:10051] is working again
    112056:20200104:200029.307 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: A connection timeout occurred.)
    112056:20200104:200836.511 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20200109:083001.201 active check data upload to [10.190.16.15:10051] is working again
    112056:20200109:083001.482 active check configuration update from [10.190.16.15:10051] is working again
    112056:20200109:084204.963 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20200109:084207.974 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20200109:091511.024 active check data upload to [10.190.16.15:10051] is working again
    112056:20200109:091544.440 active check configuration update from [10.190.16.15:10051] is working again
    112056:20200109:220940.123 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20200109:220940.139 active check data upload to [10.190.16.15:10051] is working again
    112056:20200109:221444.463 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20200109:221444.479 active check data upload to [10.190.16.15:10051] is working again
    112056:20200111:003646.392 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20200111:003646.408 active check data upload to [10.190.16.15:10051] is working again
    112056:20200111:235857.687 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20200111:235858.701 active check data upload to [10.190.16.15:10051] is working again
    112056:20200113:011423.619 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20200113:011423.635 active check data upload to [10.190.16.15:10051] is working again
    112056:20200115:051412.048 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20200115:051412.063 active check data upload to [10.190.16.15:10051] is working again
    112056:20200115:140514.127 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20200115:140517.154 active check data upload to [10.190.16.15:10051] started to fail ([connect] cannot connect to [[10.190.16.15]:10051]: (null))
    112056:20200115:141612.588 active check data upload to [10.190.16.15:10051] is working again
    The server is an MS SQL database cluster (with many connections).

    Has anyone run into this? Or does anyone have any advice?
    The server has 3 IP addresses. Only one of them has access to Zabbix. That access is open.
    Last edited by WeSTMan; 17-01-2020, 09:40.
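One way to see the pattern in an excerpt like the one above is to pair each "started to fail" line with the next "is working again" line of the same check type and measure the gap. The sketch below assumes the agent log format shown in the post (`PID:YYYYMMDD:HHMMSS.mmm message`); the function and variable names are illustrative and are not part of Zabbix.

```python
import re
from datetime import datetime

# Matches the two message kinds seen in the posted agent log.
LINE_RE = re.compile(
    r"^\s*\d+:(?P<ts>\d{8}:\d{6}\.\d{3}) active check "
    r"(?P<kind>data upload to|configuration update from) \[[^\]]+\] "
    r"(?P<event>started to fail|is working again)"
)


def parse_ts(raw):
    """Convert a timestamp such as '20191218:011808.180' to a datetime."""
    return datetime.strptime(raw, "%Y%m%d:%H%M%S.%f")


def outage_durations(lines):
    """Return (kind, start, seconds) for every fail/recover pair found."""
    open_failures = {}  # check kind -> time the failure was first logged
    outages = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        kind = "data upload" if match["kind"].startswith("data") else "configuration update"
        ts = parse_ts(match["ts"])
        if match["event"] == "started to fail":
            open_failures.setdefault(kind, ts)
        elif kind in open_failures:
            start = open_failures.pop(kind)
            outages.append((kind, start, (ts - start).total_seconds()))
    return outages


if __name__ == "__main__":
    # Two lines copied from the posted log.
    sample = [
        "112056:20191218:011808.180 active check configuration update from [10.190.16.15:10051] started to fail (cannot connect to [[10.190.16.15]:10051]: (null))",
        "112056:20191218:011908.022 active check configuration update from [10.190.16.15:10051] is working again",
    ]
    for kind, start, seconds in outage_durations(sample):
        print(f"{start} {kind}: {seconds:.3f} s")
```

Applied to the posted excerpt, this kind of pairing shows many data upload failures clearing within about 15 ms, while configuration update failures tend to clear roughly 60 seconds later, which may reflect how often the agent retries rather than how long the server was unreachable. One outage runs from 4 to 9 January 2020.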
  • Hamardaban
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional
    • May 2019
    • 2713

    #2
    The one general piece of advice: for management and monitoring, use a network that is separate from the one carrying data.
    Maybe the SQL server is overloaded... Are there hiccups with SQL performance as well?


    • Alex_UUU
      Senior Member
      • Dec 2018
      • 541

      #3
      And what does this return?
      Code:
       
       nmap 10.190.16.15 -p10051
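A single port scan only shows the state at one moment, while the failures in the log are intermittent. A repeated probe may catch them. The sketch below uses only Python's standard library; `probe_port` and its parameters are hypothetical names chosen for this example, and the attempt count and interval are arbitrary.

```python
import socket
import time


def probe_port(host, port, attempts=5, interval=1.0, timeout=3.0):
    """Open and close a TCP connection several times.

    Returns a list of (elapsed_seconds, error) tuples, where error is None
    when the connection succeeded and the OS error text otherwise.
    """
    results = []
    for attempt in range(attempts):
        started = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                error = None
        except OSError as exc:
            error = str(exc)
        results.append((time.monotonic() - started, error))
        if attempt < attempts - 1:
            time.sleep(interval)
    return results
```

Running something like `probe_port("10.190.16.15", 10051, attempts=600, interval=1.0)` from the monitored host for a while, and comparing the failed attempts with the timestamps in the agent log, could help tell a network problem apart from an agent problem.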


      • Kos
        Senior Member
        Zabbix Certified Specialist, Zabbix Certified Professional
        • Aug 2015
        • 3404

        #4
        Do I understand the situation correctly: the quoted fragment is a piece of the log of the Zabbix agent running on the MS SQL server, which has several network interfaces, and the Zabbix server can be reached through only one of them?
        If so, it is advisable to explicitly set the "SourceIP=" parameter in that Zabbix agent's config to the IP address of the right interface, and then restart the agent.
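To check which local address can actually reach the Zabbix server, independently of the agent, a TCP connection can be bound to each local address in turn. This is a minimal sketch using Python's standard `socket.create_connection` with its `source_address` argument; `can_connect` and `probe_sources` are hypothetical helper names, and the list of local addresses has to be supplied by the reader because the thread does not state them.

```python
import socket


def can_connect(dest_host, dest_port, source_ip=None, timeout=3.0):
    """Try one TCP connection, optionally bound to a local source address.

    Returns (True, None) on success or (False, error_text) on failure.
    """
    # Port 0 lets the OS choose an ephemeral source port.
    source = (source_ip, 0) if source_ip else None
    try:
        with socket.create_connection(
            (dest_host, dest_port), timeout=timeout, source_address=source
        ):
            return True, None
    except OSError as exc:
        return False, str(exc)


def probe_sources(dest_host, dest_port, source_ips, timeout=3.0):
    """Test the destination once from each given local address."""
    return {
        ip: can_connect(dest_host, dest_port, source_ip=ip, timeout=timeout)
        for ip in source_ips
    }
```

Calling `probe_sources("10.190.16.15", 10051, [...])` with the host's three addresses should succeed only for the address that is allowed through. If the address configured in `SourceIP` is not that one, the agent would be expected to fail in the same way as the log shows.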


        • WeSTMan
          Junior Member
          • Nov 2019
          • 20

          #5
          Originally posted by Hamardaban
          The one general piece of advice: for management and monitoring, use a network that is separate from the one carrying data.
          Maybe the SQL server is overloaded... Are there hiccups with SQL performance as well?
          The SQL server is not under load.


          • WeSTMan
            Junior Member
            • Nov 2019
            • 20

            #6
            Originally posted by Alex_UUU
            And what does this return?
            Code:
            nmap 10.190.16.15 -p10051
            The port is reachable.


            • WeSTMan
              Junior Member
              • Nov 2019
              • 20

              #7
              Originally posted by Kos
              Do I understand the situation correctly: the quoted fragment is a piece of the log of the Zabbix agent running on the MS SQL server, which has several network interfaces, and the Zabbix server can be reached through only one of them?
              If so, it is advisable to explicitly set the "SourceIP=" parameter in that Zabbix agent's config to the IP address of the right interface, and then restart the agent.
              You are right. However, SourceIP was already set. The other interfaces cannot connect to the Zabbix server. The flapping goes away after a reboot, but the same errors keep pouring into the log anyway.
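Since SourceIP is reported as already set, one quick sanity check is to confirm what the config file read by the running agent actually contains. The sketch below parses the text of an agent config and reports a few connection-related parameters. `SourceIP`, `ServerActive` and `Timeout` are existing agent parameters; the function name is hypothetical, and the path to the file is left to the reader because the thread does not give it.

```python
def read_agent_settings(conf_text, keys=("SourceIP", "ServerActive", "Timeout")):
    """Extract selected key=value settings from agent config text.

    Blank lines and lines starting with '#' are ignored. If a key appears
    more than once, the last value wins in this sketch.
    """
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key in keys:
            settings[key] = value.strip()
    return settings
```

A commented-out `# SourceIP=` line, or a setting placed in a different file than the one the service was started with, would show up here as a missing key.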


              • WeSTMan
                Junior Member
                • Nov 2019
                • 20

                #8
                A bit more information: the Zabbix server is 4.0.13 and the Zabbix agent is 3.0.4.
