It would be useful to make Zabbix Agent immune to the Linux oom_killer, as it would allow the agent to keep collecting and sending data to the server even when the monitored host is running out of memory. Do you agree?
Make Zabbix Agent immune to OOM killer. Is this a good idea?
-
For the record, a couple of reasons why this should not be the default:
a) for most users, getting their precious production software killed while the agent keeps running would be a bad thing. Better to get the agent killed and get notified about that than to interrupt an Important Process;
b) if a memory leak appeared in the agent (not very likely, but still), that would be disastrous...
-
Not all users set a high severity for the agent status, so they may not know that the server has a problem.
The agent uses relatively little memory, so killing it will not solve the problem.
Having information about what is happening on the server can help prevent the issue or find its cause.
How do you answer the big boss's question, "Why?" or "What's wrong with the server/application?", if the agent is not working?
-
>How do you answer the big boss's question, "Why?" or "What's wrong with the server/application?", if the agent is not working?
Only a deep analysis of system/application logs and Zabbix data can answer that question. You can exclude the Zabbix agent from the oom_killer by setting:
Code:
echo "-17" >/proc/<zabbix_pids>/oom_adj
But this gives you no guarantee against a Zabbix crash. In fact, most critical systems are set to reboot immediately when the oom_killer kicks in. So it is better to decide how important it is to keep the system running under OOM, or whether an immediate reboot is the right way.
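As a rough sketch of the command above applied to every agent process: on kernels from about 2.6.36 onward, /proc/<pid>/oom_adj is deprecated in favour of /proc/<pid>/oom_score_adj, where -1000 has the same "never kill" meaning as -17 in the old file. The function name protect_from_oom and the PROC_ROOT variable are made up for this example (PROC_ROOT only exists so the function can be tried against a scratch directory), and the process name zabbix_agentd in the usage comment is an assumption about your installation.

```shell
#!/bin/sh
# Sketch only: exempt the given PIDs from the OOM killer.
# PROC_ROOT is a hypothetical override for dry runs; it defaults to /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"

# protect_from_oom PID...
# Prefers the newer oom_score_adj file and falls back to the legacy oom_adj.
protect_from_oom() {
    for pid in "$@"; do
        if [ -e "$PROC_ROOT/$pid/oom_score_adj" ]; then
            # -1000 disables OOM killing for this process
            printf '%s\n' -1000 > "$PROC_ROOT/$pid/oom_score_adj"
        elif [ -e "$PROC_ROOT/$pid/oom_adj" ]; then
            # legacy interface: -17 is the "disable" value
            printf '%s\n' -17 > "$PROC_ROOT/$pid/oom_adj"
        else
            echo "no OOM control file found for PID $pid" >&2
        fi
    done
}

# Assumed usage, as root:
#   protect_from_oom $(pidof zabbix_agentd)
```

The value is stored per process, so it is lost when the agent restarts and would have to be applied again after each start.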
>Not all users set a high severity for the agent status, so they may not know that the server has a problem.
By relying on agent status alone you can misjudge the situation. There should be a technique for locating the problem in each information system.
So, in general, it makes no sense.