Make Zabbix Agent immune to OOM killer. Is this a good idea?

  • Heilig
    Senior Member
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Mar 2013
    • 366

    #1

    Make Zabbix Agent immune to OOM killer. Is this a good idea?

    It would be useful to make Zabbix Agent immune to the Linux oom_killer, as it would allow the agent to keep collecting and sending data to the server even when the monitored host is running out of memory. Do you agree?
    Poll (6 votes):
    Yes: 50.00% (3)
    No: 33.33% (2)
    I don't care: 16.67% (1)
  • richlv
    Senior Member
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Oct 2005
    • 3112

    #2
    for the record, a couple of reasons why this should not be the default:

    a) for most users, getting their precious production software killed while the agent keeps running would be a bad thing. better to get the agent killed and be notified about that than to interrupt an Important Process;
    b) if a memleak appeared in the agent (not very likely, but still), that would be disastrous...


    • volter
      Member
      Zabbix Certified Specialist
      • Dec 2011
      • 85

      #3
      Don't try to reinvent the wheel; use cgroups.
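One possible reading of the cgroups suggestion, as a rough sketch: give the agent its own memory cap, so that a leaking agent is killed inside its own cgroup once it hits the limit rather than pushing the whole host into OOM. The example assumes systemd with cgroup v2; the unit name `zabbix-agent.service`, the file name `memory.conf`, and the 128M limit are illustrative assumptions, not something stated in the thread.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that caps the memory of the agent's cgroup.
# Assumptions (not from the thread): systemd with cgroup v2, where MemoryMax=
# is the hard memory limit for a unit; the drop-in file name is arbitrary.
write_agent_dropin() {
    dropin_dir=$1   # e.g. /etc/systemd/system/zabbix-agent.service.d (assumed unit name)
    limit=$2        # e.g. 128M (arbitrary example value)
    mkdir -p "$dropin_dir" || return 1
    # A drop-in only needs the section header and the directive being overridden.
    printf '[Service]\nMemoryMax=%s\n' "$limit" > "$dropin_dir/memory.conf"
}
```

Run as root, this might be used as `write_agent_dropin /etc/systemd/system/zabbix-agent.service.d 128M`, followed by `systemctl daemon-reload` and a restart of the agent service. Note that a cap like this limits the agent; it does not by itself protect the agent from the host-wide OOM killer.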


      • Heilig
        Senior Member
        Zabbix Certified Trainer
        Zabbix Certified Specialist
        Zabbix Certified Professional
        • Mar 2013
        • 366

        #4
        Not all users set a high severity for the agent status, so they may not know that the server has a problem.
        The agent uses relatively little memory, so stopping it will not solve the problem.
        Having information about what is happening on the server can help prevent the issue or find its cause.

        How do you answer the big boss's question, "Why?" or "What's wrong with the server/application?", if the agent is not working?


        • BrandStorm
          Junior Member
          • May 2009
          • 7

          #5
          >How do you answer the big boss's question, "Why?" or "What's wrong with the server/application?", if the agent is not working?
          Only a deep analysis of the system/application logs and the Zabbix data can answer that question. You can exclude the Zabbix agent from the oom_killer by setting:
          Code:
          for pid in $(pidof zabbix_agentd); do echo "-17" >/proc/$pid/oom_adj; done
          But this does not guarantee that the Zabbix agent won't crash. In fact, most critical systems are set to reboot immediately when the oom_killer kicks in. So it is better to decide which matters more: keeping the system running under OOM, or rebooting immediately.
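On newer kernels `oom_adj` is deprecated in favor of `oom_score_adj`, where `-1000` plays the role that `-17` plays in `oom_adj`. A minimal sketch of the same exemption using that file follows; the function name `set_oom_score_adj` and the `proc_root` parameter are hypothetical, and the latter exists only so the function can be tried against a scratch directory instead of the real `/proc`.

```shell
#!/bin/sh
# Sketch: write an OOM score adjustment for a list of PIDs.
# Assumptions (not from the thread): the kernel exposes
# /proc/<pid>/oom_score_adj, and the caller is allowed to lower the value
# (normally this requires root).
set_oom_score_adj() {
    value=$1        # -1000 exempts a process from the OOM killer
    proc_root=$2    # normally /proc
    shift 2
    for pid in "$@"; do
        # Stop at the first PID that cannot be updated (e.g. process already gone).
        printf '%s\n' "$value" > "$proc_root/$pid/oom_score_adj" || return 1
    done
}
```

As root, something like `set_oom_score_adj -1000 /proc $(pidof zabbix_agentd)` would apply it to the running agent processes. The value is tied to the PIDs, so it has to be applied again after the agent restarts.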

          >Not all users set a high severity for the agent status, so they may not know that the server has a problem.
          By relying on the agent status alone you can end up misjudging the situation. There should be a method for locating the problem in each information system.

          So, in general, it makes no sense.
