Required performance / Values processed

  • Lennyroquai
    Junior Member
    • Aug 2014
    • 8

    #1

    Required performance / Values processed

    Hi,

    Recently I started seeing strange behavior in my Zabbix setup:
    I get a lot of "xxx failed: first network error, wait for 15 seconds" messages,
    followed a few seconds later by: connection restored,
    and a small gap in the graph.

    For the moment, only agent and JMX checks are affected (SNMP is working fine).

    I have Zabbix 2.4.1, one server acting as Zabbix server, and another one as database.

    When I check Zabbix health, it says:
    - Required performance of Zabbix server, new values per second : 389.28
    - Values processed by Zabbix server per second : 326.47
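As a rough check, the two figures above can be compared directly. The sketch below only does the arithmetic on the numbers quoted in the post; the function names are made up for illustration, and the comparison is approximate because one figure is derived from configured intervals while the other is measured.

```python
# Figures quoted from the Zabbix status report in the post.
REQUIRED_NVPS = 389.28   # "Required performance of Zabbix server, new values per second"
PROCESSED_NVPS = 326.47  # "Values processed by Zabbix server per second"


def shortfall(required: float, processed: float) -> float:
    """Values per second that are expected but not being processed."""
    return required - processed


def processed_ratio(required: float, processed: float) -> float:
    """Fraction of the required throughput that is actually processed."""
    return processed / required


gap = shortfall(REQUIRED_NVPS, PROCESSED_NVPS)
ratio = processed_ratio(REQUIRED_NVPS, PROCESSED_NVPS)
print(f"shortfall: {gap:.2f} values/s, processed: {ratio:.1%} of required")
```

A persistent gap like this is a hint that some items are not being collected on schedule, which fits with the holes in the graphs.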

    All internal processes are under 5%
    Gathering processes : under 25%
    CPUs < 15% (on DB and Server)

    I am monitoring: 180 virtual machines, 16 physical servers, 16 switches, 8 routers, 32 Java servers, 8 PDUs...
    And my Zabbix_server.conf is heavily tuned (Cache, pollers, etc...)
    Of course, connectivity has been tested.

    And here's what the logs look like (debug level 3):


    Does anyone have a clue why this is happening?

    Many thanks
    Last edited by Lennyroquai; 27-10-2014, 19:26.
  • tchjts1
    Senior Member
    • May 2008
    • 1605

    #2
    You can try increasing the Timeout= value in zabbix_server.conf on your Zabbix server if it is still at the default of 3. Maybe try 15 or 20, then restart your Zabbix server process.
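To see which Timeout value is actually in effect before changing it, a small script can read it out of the config text. This is a minimal sketch, not a Zabbix tool: `parse_conf` and `effective_timeout` are hypothetical helpers, the sample text is an excerpt typed in by hand, and the fallback of 3 is the default mentioned above.

```python
DEFAULT_TIMEOUT = 3  # default mentioned in the post; used when Timeout= is absent


def parse_conf(text: str) -> dict:
    """Parse key=value lines, skipping blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def effective_timeout(text: str) -> int:
    """Return the configured Timeout, or the default if it is not set."""
    return int(parse_conf(text).get("Timeout", DEFAULT_TIMEOUT))


# Hand-typed excerpt for illustration; in practice, read the real file's contents.
sample = """
# Timeout=3
StartPollers=80
Timeout=15
"""
print("Timeout in effect:", effective_timeout(sample))
```

A commented-out `# Timeout=3` line is ignored here, which mirrors the common case where the shipped config only documents the default.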

    I know you say your internal processes are good, but I would be interested in seeing the 3 default graphs which show all those procs, with a 1 day (24 hour) view, as per the bottom of this post: https://www.zabbix.com/forum/showthread.php?t=41219


    • Lennyroquai
      Junior Member
      • Aug 2014
      • 8

      #3
      Hi tchjts1,

      Here's my zabbix.conf :
      StartPollers=80
      StartIPMIPollers=5
      StartPollersUnreachable=3
      StartTrappers=16
      StartPingers=12
      StartHTTPPollers=4
      StartTimers=4
      StartJavaPollers=6

      StartVMwareCollectors=4
      VMwareFrequency=30

      StartDBSyncers=4

      Timeout=30
      And here's the graph requested (sorry, I didn't see the post talking about it):

      Other info:
      My queue seems clean; the only strange thing is that 81 items sometimes get stuck in the queue.
      If I restart the server, they disappear, but they come back a few minutes later and stay in the "More than 10 minutes" column.

      My issue seems to affect only Windows machines (2.4.0 agent), and when I check the log, it's always the same machines looping in it (although they are on different networks).
      I restarted the agents yesterday, but nothing changed.
      I will install the new agent to see if it solves the issue.

      Thanks


      • tchjts1
        Senior Member
        • May 2008
        • 1605

        #4
        Your processes all look fine.

        On the agents where you are seeing that issue, zabbix_agentd.conf also has a Timeout= value. Try increasing it and restarting the agent.

        One other thing to check there: be sure you installed the correct agent, 32-bit or 64-bit.


        • Lennyroquai
          Junior Member
          • Aug 2014
          • 8

          #5
          Hi,

          I've tried:
          - Setting the Timeout value to 30s
          - Upgrading the agent to 2.4.1
          - Downgrading the agent to 2.2.1

          Nothing new...
          Currently, the agents that have the issue are all 32-bit (Windows 2003 virtual machines).


          • Lennyroquai
            Junior Member
            • Aug 2014
            • 8

            #6
            Damn... I finally found what was wrong...

            The interval of an item in my "Windows Template" was set to 1.

            I don't know why or when this happened, but once I set it back to 60, the issue stopped!
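To get a feel for why one template item can matter this much, here is a rough sketch of the load it adds. It assumes the usual approximation that each item contributes 1/interval new values per second on every host linked to the template. The host count is hypothetical (the thread does not say how many Windows hosts use the template), and the function names are made up for illustration.

```python
def required_nvps(item_intervals_s):
    """Approximate required new values per second for a list of item intervals."""
    return sum(1.0 / interval for interval in item_intervals_s if interval > 0)


def template_item_load(interval_s: float, host_count: int) -> float:
    """Values per second generated by one template item across all linked hosts."""
    return host_count / interval_s


HOSTS = 100  # hypothetical number of hosts linked to the template

load_at_1s = template_item_load(1, HOSTS)
load_at_60s = template_item_load(60, HOSTS)
print(f"interval 1s:  {load_at_1s:.2f} values/s")
print(f"interval 60s: {load_at_60s:.2f} values/s")
print(f"extra load from the 1s interval: {load_at_1s - load_at_60s:.2f} values/s")
```

With these assumed numbers, a single item polled every second costs as much as sixty items polled every minute, so it is worth checking template intervals whenever required performance looks higher than expected.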



            Anyway, thanks for the help.
