I have a Zabbix server, configured with proxies and monitoring 100+ hosts.
Each day (or night) 4 to 5 agents stop working; the log doesn't show anything suspicious (currently at debug level 4).
If I manually restart the agent, everything works fine (until it dies again another day).
This is the last chunk of one agent's log:
8858:20190322:075841.283 In zbx_execute_threaded_metric() key:'vfs.fs.size'
4901:20190322:075841.284 executing in data process for key:'vfs.fs.size'
8858:20190322:075841.285 End of zbx_execute_threaded_metric():SYSINFO_SUCCEED ''
8858:20190322:075841.285 Sending back [87549898752]
8858:20190322:075841.285 __zbx_zbx_setproctitle() title:'listener #2 [waiting for connection]'
8856:20190322:075841.349 __zbx_zbx_setproctitle() title:'collector [processing data]'
8856:20190322:075841.349 In update_cpustats()
8856:20190322:075841.349 End of update_cpustats()
8856:20190322:075841.349 __zbx_zbx_setproctitle() title:'collector [idle 1 sec]'
[END]
Agent version: 4.0.5-6.1
Server version: 4.0.5-1.el7.x86_64
Proxy version: 4.0.5-1.el7.x86_64
As shown, the last entry in the log is at 07:58:41; the server fired the trigger for the dead agent at 08:03.
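To see which agent process went quiet first, the log can be reduced to the last line each PID wrote. Below is a minimal Python sketch, assuming the line layout seen in the excerpt above (`PID:YYYYMMDD:HHMMSS.mmm message`); the function name `last_seen_per_pid` and the sample list are only illustrative and not part of Zabbix.

```python
import re
from datetime import datetime

# Assumed layout, taken from the excerpt: PID:YYYYMMDD:HHMMSS.mmm message
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6})\.(\d{3})\s+(.*)$")

def last_seen_per_pid(lines):
    """Return {pid: (timestamp, message)} for the last entry each process wrote."""
    last = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip markers such as [END] or unrelated lines
        pid, date, clock, ms, msg = m.groups()
        ts = datetime.strptime(f"{date} {clock}.{ms}", "%Y%m%d %H%M%S.%f")
        last[int(pid)] = (ts, msg)  # later lines overwrite earlier ones
    return last

# Sample input: a few lines copied from the log chunk above.
sample = [
    "8858:20190322:075841.283 In zbx_execute_threaded_metric() key:'vfs.fs.size'",
    "4901:20190322:075841.284 executing in data process for key:'vfs.fs.size'",
    "8858:20190322:075841.285 Sending back [87549898752]",
    "8856:20190322:075841.349 __zbx_zbx_setproctitle() title:'collector [idle 1 sec]'",
    "[END]",
]

for pid, (ts, msg) in sorted(last_seen_per_pid(sample).items()):
    print(pid, ts.isoformat(timespec="milliseconds"), msg)
```

Run against the full agent log (for example `last_seen_per_pid(open(path))`, where `path` is wherever the agent's `LogFile` points), this would show whether the listeners, the collector, or the active-check process stopped logging first.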
The changes from the default zabbix_agentd.conf:
Server=PROXY_ADDRESS
ServerActive=PROXY_ADDRESS
EnableRemoteCommands=1
RefreshActiveChecks=60
Timeout=30
AllowRoot=1
DebugLevel=4