Agent Unreachable Problem

  • mellis
    Senior Member
    • Oct 2017
    • 145

    #1

    Agent Unreachable Problem

    This is an ongoing problem that I have not been able to fix for several years, and I am at the point where I want to walk away from Zabbix. It pops up without any cause I can identify, goes away out of the blue for a couple of days or weeks, and then comes back. Currently it is showing up in one install.
    I am running Zabbix 4.2.5 as a VMware 5.5 VM with 20 GB RAM, a 500 GB disk, and 6 cores on an HP 360G6, 2.4 GHz.


    As you can see, almost every agent is unreachable.



    One thing I find odd is that Latest data shows Up (1), which I thought would mean the agent is OK.

    My item is set up to poll every 5 minutes.

    In addition, my trigger is set up for 30 minutes:
    {Template App Zabbix Agent:agent.ping.nodata(30m)}=1


    I set up an ICMP ping check for these same workstations, and it is not alerting at all.

    These are my zabbix_server.conf values
    StartPollers=96
    StartPreprocessors=10
    StartPollersunreachable=24
    StartTrappers=20
    StartHTTPPollers=2
    StartTimers=16
    StartEscalators=10
    HousekeepingFrequency=2
    HouseKeeperDelete=100000
    CacheSize=84M
    HistoryCacheSize=1536M
    HistoryIndexCacheSize=256M
    TrendCacheSize=128M
    ValueCacheSize=128M
    Timeout=30





  • mellis
    Senior Member
    • Oct 2017
    • 145

    #2
    Looking at the Latest data screen, the timestamps on the CPU processor load checks are about 15 minutes behind.



    I have reviewed the NTP settings. The Zabbix server gets its time from the Windows domain controllers, and running date shows the same date as the Windows systems.

    I have been looking in the logs and am not seeing anything that stands out, but I am not really sure what to look for.


    • mellis
      Senior Member
      • Oct 2017
      • 145

      #3
      It looks like the Windows time server is up and running and uses the same time source as the Zabbix server. I often see breaks in the graphs around the time the unreachable triggers start firing; the graphs seem to break a minute or two before the triggers kick off.
      In the queue, it backs up to over 500 items in the 10-minute column.

      I disabled all the workstations that seem to be the problem and the server recovers. When I add them back a few at a time, the problem seems to return once a little over 200 hosts are enabled.


      • Markku
        Senior Member
        Zabbix Certified SpecialistZabbix Certified ProfessionalZabbix Certified Expert
        • Sep 2018
        • 1782

        #4
        I see you are using passive agents, meaning that the Zabbix server polls each agent to get the data. The recommended way is to use active agents (ServerActive in the agent configuration and "Zabbix agent (active)" as the item type). Is there a specific reason why you are not using active agent items?

        Markku
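
For readers following along, here is a rough sketch of what the agent-side change suggested above might look like in zabbix_agentd.conf. The IP address and host name are placeholders (assumptions, not values from this thread), and the item type on each agent item would still need to be switched to "Zabbix agent (active)" in the frontend.

```
# zabbix_agentd.conf on a workstation (sketch only; values are hypothetical)

# Passive checks: the server connects to the agent. Keep this if any
# passive items remain on the host.
Server=192.0.2.10

# Active checks: the agent connects to the server, fetches its item list
# and pushes results, so server pollers are not tied up waiting on it.
ServerActive=192.0.2.10

# Must match the host name configured for this host in the Zabbix frontend,
# otherwise the server will not hand out any active checks.
Hostname=WORKSTATION-01
```

With active items the data arrives through the server's trapper processes (StartTrappers in the server configuration above) rather than the pollers, which is presumably why it is being suggested for a poller queue that backs up.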
