Zabbix 4.0.3 'first network error, wait for 15 seconds'

  • Florius
    Junior Member
    • Jan 2017
    • 15

    #1

    Zabbix 4.0.3 'first network error, wait for 15 seconds'

    Hi,

    I have a Zabbix server with 18 hosts, and 5000 items.
    I run Zabbix Server and Agent on all my hosts with version 4.0.3 on Debian 9.
    My Zabbix Server config:

    LogFile=/var/log/zabbix/zabbix_server.log
    LogFileSize=100
    PidFile=/var/run/zabbix/zabbix_server.pid
    SocketDir=/var/run/zabbix
    DBName=zabbix
    DBUser=zabbix
    StartPollers=20
    StartPollersUnreachable=3
    StartPingers=3
    SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
    CacheSize=32M
    Timeout=20
    AlertScriptsPath=/usr/lib/zabbix/alertscripts
    ExternalScripts=/usr/lib/zabbix/externalscripts
    FpingLocation=/usr/bin/fping
    Fping6Location=/usr/bin/fping6
    LogSlowQueries=3000
    AllowRoot=1


    For the past 2 days, since around 02:00 AM, 3 hosts have been flapping with "Zabbix Agent is unreachable for 5 minutes".
    I didn't change anything the day before, and at the same location I have 10 more hosts that are working perfectly fine.

    I noticed that my Zabbix server log is filled with
    "failed: first network error, wait for 15 seconds", followed seconds later by "connection restored".
    The items that time out are completely random.

    I see no packet loss over more than 4000 pings, both to and from the Zabbix Server.
    I restarted Zabbix server and Zabbix agent, rebooted several times, and removed and re-added the hosts on the Zabbix Server.
    There is no resource shortage on Zabbix Server, no high load or memory usage.
    All my pollers seem okay, no shortage of any.

    Any and all help is appreciated. I've been banging my head against this for 2 days now, with no result.
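
To check whether the errors really are random or cluster on particular hosts, the server log can be tallied per host. This is a minimal sketch, not something from the thread: the function name `count_first_errors` is hypothetical, and it assumes each error line contains `on host "NAME" failed: first network error`, which should be verified against the actual log.

```shell
#!/bin/sh
# Count "first network error" entries per host in a Zabbix server log.
# Assumption: error lines look like
#   ... item "KEY" on host "HOST" failed: first network error, wait for 15 seconds
count_first_errors() {
    # $1: path to the server log (the LogFile setting in the server config)
    sed -n 's/.* on host "\([^"]*\)" failed: first network error.*/\1/p' "$1" |
        awk '{ n[$0]++ } END { for (h in n) print n[h], h }' |
        sort -rn
}
```

With the configuration above, it could be called as `count_first_errors /var/log/zabbix/zabbix_server.log`. Each output line is a count followed by a host name, highest count first.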

  • dimir
    Zabbix developer
    • Apr 2011
    • 1080

    #2
    Do you think increasing StartAgents on the agent could help?


    • Florius
      Junior Member
      • Jan 2017
      • 15

      #3
      Originally posted by dimir
      Do you think increasing StartAgents on the agent could help?
      I thought the same and tried that, but no luck. The agent isn't busy either when I watch it with `watch -tn 0.2 'ps -fC zabbix_agentd'`.
      Also, I have hosts with more items than this one.

      Thank you!


      • dimir
        Zabbix developer
        • Apr 2011
        • 1080

        #4
        Those failing checks, are they UserParameters? Perhaps they take more than Timeout now to finish?
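
One way to test this hypothesis is to time the command behind a UserParameter directly on the monitored host and compare the slowest run with the `Timeout` value in the agent configuration. The sketch below is only an illustration. The helper name `slowest_run` is hypothetical, and it measures whole seconds only, so it will only catch commands that are clearly slow.

```shell
#!/bin/sh
# Run a command several times and print the slowest run in whole seconds.
slowest_run() {
    # $1: number of runs; remaining arguments: the command to time
    runs=$1
    shift
    max=0
    n=0
    while [ "$n" -lt "$runs" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1 || true   # ignore the command's own exit status
        elapsed=$(( $(date +%s) - start ))
        if [ "$elapsed" -gt "$max" ]; then
            max=$elapsed
        fi
        n=$((n + 1))
    done
    echo "$max"
}
```

For example, `slowest_run 20 grep -c pattern /some/file` (pattern and path are placeholders) prints the longest of 20 runs. A result near or above the agent's `Timeout` would support the idea that the checks themselves are too slow.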


        • Florius
          Junior Member
          • Jan 2017
          • 15

          #5
          Originally posted by dimir
          Those failing checks, are they UserParameters? Perhaps they take more than Timeout now to finish?
          Some of them are custom, but they do not execute scripts or anything heavy. Most of them are greps, which complete very quickly, if not instantly.

          EDIT: I wrote a script with a while loop that runs zabbix_get every 5 seconds.
          Every now and then it returns "zabbix_get [29101]: Timeout while executing operation".

          Any idea how to troubleshoot that?
          Last edited by Florius; 01-02-2019, 17:30.
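
The script itself was not posted. A minimal sketch of such a probe loop, with a timestamp and duration logged for each attempt so that timeouts can be lined up with the server log, might look like the following. The names `probe_agent` and `ZABBIX_GET` are hypothetical, and the sketch assumes that a failure shows up either as a non-zero exit status from `zabbix_get` or as the "Timeout while executing operation" message quoted above.

```shell
#!/bin/sh
# Query one agent item repeatedly and log the outcome of every attempt.
# ZABBIX_GET may be overridden; by default the zabbix_get binary on PATH is used.
ZABBIX_GET=${ZABBIX_GET:-zabbix_get}

probe_agent() {
    # $1: agent host, $2: item key, $3: number of attempts, $4: pause in seconds
    host=$1
    key=$2
    attempts=$3
    pause=$4
    i=0
    failed=0
    while [ "$i" -lt "$attempts" ]; do
        start=$(date +%s)
        status=ok
        out=$("$ZABBIX_GET" -s "$host" -k "$key" 2>&1) || status=FAIL
        # Assumption: this message always indicates a failed request.
        case $out in
            *"Timeout while executing operation"*) status=FAIL ;;
        esac
        elapsed=$(( $(date +%s) - start ))
        if [ "$status" = FAIL ]; then
            failed=$((failed + 1))
        fi
        printf '%s %s %ss %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status" "$elapsed" "$out"
        i=$((i + 1))
        if [ "$i" -lt "$attempts" ]; then
            sleep "$pause"
        fi
    done
    echo "failed=$failed/$attempts"
    [ "$failed" -eq 0 ]
}
```

A call such as `probe_agent 192.0.2.10 agent.ping 720 5` (the address is a placeholder) would probe every 5 seconds for about an hour. Running it from the Zabbix server and from another machine at the same time could help show whether the timeouts depend on the network path or on the agent.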
