Hello,
I have a strange problem with an active agent that had been working fine for some weeks. It is configured with PSK encryption.
Since this morning it has been shown as "not available" in the GUI, and I cannot work out how to fix it.
The odd part is that the agent is running and I can query it with zabbix_get (e.g. zabbix_get -s **redacted** -k "system.cpu.load[all,avg1]" --tls-connect=psk --tls-psk-identity="ZBX-AGENT-PSK-ID" --tls-psk-file=/etc/zabbix/zabbix_agent.psk), and it returns the correct values.
If I raise the DebugLevel for the agent, I can see successful TLS connections:
62932:20210614:105359.617 In zbx_tls_connect(): psk_identity:"ZBX-AGENT-PSK-ID"
62932:20210614:105359.617 zbx_psk_client_cb() requested PSK identity "ZBX-AGENT-PSK-ID"
62932:20210614:105359.618 End of zbx_tls_connect():SUCCEED (established TLSv1.2 PSK-AES128-CBC-SHA)
62932:20210614:105359.618 JSON before sending [{"request":"agent data","session":"4995aa78ba164fc7cddac4a2a6c7d6c0","data":[{"host":"**redacted**","key":"net.if.in["eth0",dropped]","value":"169","id":1,"clock":1623660829,"ns":605890716}],"clock":1623660839,"ns":618254293}]
62932:20210614:105359.618 JSON back [{"response":"success","info":"processed: 1; failed: 0; total: 1; seconds spent: 0.000034"}]
62932:20210614:105359.618 In check_response() response:'{"response":"success","info":"processed: 1; failed: 0; total: 1; seconds spent: 0.000034"}'
62932:20210614:105359.618 info from server: 'processed: 1; failed: 0; total: 1; seconds spent: 0.000034'
62932:20210614:105359.618 End of check_response():SUCCEED
62932:20210614:105359.618 OK
62932:20210614:105359.618 End of send_buffer():SUCCEED
In the Zabbix server log I can also see items becoming supported or not supported:
1960140:20210614:104819.284 item "**redacted**:vfs.dev.read.await[sda]" became supported
1960140:20210614:104821.290 item "**redacted**:vfs.dev.write.await[sda]" became supported
So I guess the communication is basically working.
The Zabbix GUI shows a red availability icon, and the tooltip reads "Get value from agent failed: SSL_read() timed out".
Deleting the host and setting it up again leads to the same problem. Re-installing the agent did not help either.
I am just puzzled how this can happen after weeks of it working flawlessly. Last week I installed some updates on the servers, but no SSL package was touched.
Can someone give me a hint on how to debug this problem further?
Thanks!
Stefan