Zabbix China Summit 2022

3 Passive and active agent checks

Overview

This section provides details on the passive and active checks performed by Zabbix agent.

Zabbix uses a JSON-based communication protocol to communicate with Zabbix agent.

The following definitions are used in the protocol details below:

<HEADER> - "ZBXD\x01" (5 bytes)
<DATALEN> - data length (8 bytes). 1 will be formatted as 01/00/00/00/00/00/00/00 (eight bytes in HEX, 64 bit number)

To avoid exhausting memory, Zabbix server accepts at most 128 MB per connection when using the Zabbix protocol.
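The header and length definitions above can be illustrated with a short Python sketch. It is not taken from the Zabbix sources: `pack_frame`, `unpack_frame` and `MAX_DATALEN` are hypothetical names, and reading "128M" as 128 MiB is an assumption. The little-endian byte order follows from the `01/00/00/00/00/00/00/00` example.

```python
import struct

HEADER = b"ZBXD\x01"              # 5-byte <HEADER> from the definition above
MAX_DATALEN = 128 * 1024 * 1024   # assumption: "128M" read as 128 MiB


def pack_frame(data: bytes) -> bytes:
    """Prefix data with <HEADER> and an 8-byte little-endian <DATALEN>."""
    return HEADER + struct.pack("<Q", len(data)) + data


def unpack_frame(frame: bytes) -> bytes:
    """Check <HEADER> and <DATALEN>, then return the payload bytes."""
    if frame[:5] != HEADER:
        raise ValueError("bad header")
    if len(frame) < 13:
        raise ValueError("truncated frame")
    (datalen,) = struct.unpack("<Q", frame[5:13])
    if datalen > MAX_DATALEN:
        # Reject oversized payloads before trying to buffer them.
        raise ValueError("payload exceeds the per-connection limit")
    if len(frame) - 13 != datalen:
        raise ValueError("length mismatch")
    return frame[13:]
```

Checking the declared length before reading the payload is what keeps a malformed or hostile peer from forcing a large allocation.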

Passive checks

A passive check is a simple data request. Zabbix server or proxy asks for some data (for example, CPU load) and Zabbix agent sends the result back to the server.

Server request

<item key>\n

Agent response

<HEADER><DATALEN><DATA>[\0<ERROR>]

Above, the part in square brackets is optional and is sent only for not supported items.

For example, for supported items:

  1. Server opens a TCP connection
  2. Server sends agent.ping\n
  3. Agent reads the request and responds with <HEADER><DATALEN>1
  4. Server processes data to get the value, '1' in our case
  5. TCP connection is closed

For not supported items:

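  1. Server opens a TCP connection
  2. Server sends vfs.fs.size[/nono]\n
  3. Agent reads the request and responds with <HEADER><DATALEN>ZBX_NOTSUPPORTED\0Cannot obtain filesystem information: [2] No such file or directory
  4. Server processes data, changes item state to not supported and shows the specified error message
  5. TCP connection is closed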
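The exchange above can be sketched in Python. This is a minimal sketch, assuming the request is the bare item key followed by a newline, as shown in this section. `recv_exact` and `passive_check` are hypothetical helper names, and the socket is expected to be already connected to the agent.

```python
import socket
import struct


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or fail if the peer closes the connection."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed early")
        buf += chunk
    return buf


def passive_check(sock: socket.socket, key: str) -> bytes:
    """Send '<item key>\\n' and return the <DATA> part of the agent response."""
    sock.sendall(key.encode("utf-8") + b"\n")
    if recv_exact(sock, 5) != b"ZBXD\x01":
        raise ValueError("bad header")
    (datalen,) = struct.unpack("<Q", recv_exact(sock, 8))
    return recv_exact(sock, datalen)
```

The caller opens and closes the TCP connection (steps 1 and 5); the function covers steps 2 and 3 and leaves interpreting the value to the caller.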

Active checks

Active checks require more complex processing. The agent must first retrieve from the server a list of items for independent processing.

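The servers to get the active checks from are listed in the 'ServerActive' parameter of the agent configuration file. The frequency of asking for these checks is set by the 'RefreshActiveChecks' parameter in the same configuration file. However, if refreshing active checks fails, it is retried after 60 seconds.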

The agent then periodically sends the new values to the server.

Getting the list of items

Agent request

<HEADER><DATALEN>{
           "request":"active checks",
           "host":"<hostname>"
       }

Server response

<HEADER><DATALEN>{
           "response":"success",
           "data":[
               {
                   "key":"log[/home/zabbix/logs/zabbix_agentd.log]",
                   "delay":30,
                   "lastlogsize":0,
                   "mtime":0
               },
               {
                   "key":"agent.version",
                   "delay":600,
                   "lastlogsize":0,
                   "mtime":0
               },
               {
                   "key":"vfs.fs.size[/nono]",
                   "delay":600,
                   "lastlogsize":0,
                   "mtime":0
               }
           ]
       }

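The server must respond with success. For each returned item, the properties key, delay, lastlogsize and mtime must all exist, regardless of whether the item is a log item or not.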

For example:

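  1. Agent opens a TCP connection
  2. Agent asks for the list of checks
  3. Server responds with a list of items (item key, delay)
  4. Agent parses the response
  5. TCP connection is closed
  6. Agent starts periodic collection of data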

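Note that when active checks are used, configuration data is available to anyone who can reach the Zabbix server trapper port. This is possible because anyone can pretend to be an active agent and request item configuration data; no authentication takes place unless you use the encryption options.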

Sending in collected data

Agent sends

<HEADER><DATALEN>{
           "request":"agent data",
           "data":[
               {
                   "host":"<hostname>",
                   "key":"agent.version",
                   "value":"2.4.0",
                   "clock":1400675595,            
                   "ns":76808644
               },
               {
                   "host":"<hostname>",
                   "key":"log[/home/zabbix/logs/zabbix_agentd.log]",
                   "lastlogsize":112,
                   "value":" 19845:20140621:141708.521 Starting Zabbix Agent [<hostname>]. Zabbix 2.4.0 (revision 50000).",
                   "clock":1400675595,            
                   "ns":77053975
               },
               {
                   "host":"<hostname>",
                   "key":"vfs.fs.size[/nono]",
                   "state":1,
                   "value":"Cannot obtain filesystem information: [2] No such file or directory",
                   "clock":1400675595,            
                   "ns":78154128
               }
           ],
           "clock": 1400675595,
           "ns": 78211329
       }

Server response

<HEADER><DATALEN>{
           "response":"success",
           "info":"processed: 3; failed: 0; total: 3; seconds spent: 0.003534"
       }

If sending some values fails on the server (for example, because the host or item has been disabled or deleted), the agent will not retry sending those values.

For example:

  1. Agent opens a TCP connection
  2. Agent sends a list of values
  3. Server processes the data and sends the status back
  4. TCP connection is closed

Note how in the example above the not supported status of vfs.fs.size[/nono] is indicated by the "state" value of 1 and the error message in "value".

On the server side, the error message will be trimmed to 2048 symbols.

Older XML protocol

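Zabbix will accept up to 16 MB of XML Base64-encoded data, but a single decoded value should be no longer than 64 KB; otherwise it will be truncated to 64 KB while decoding.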

See also

  1. More details on the Zabbix agent protocol

3 Passive and active agent checks

Overview

This section provides details on passive and active checks performed by Zabbix agent.

Zabbix uses a JSON-based communication protocol for communicating with Zabbix agent.

For the definitions of the header and data length, please refer to protocol details.

Passive checks

A passive check is a simple data request. Zabbix server or proxy asks for some data (for example, CPU load) and Zabbix agent sends back the result to the server.

Server request

<HEADER><DATALEN><item key>

Agent response

<HEADER><DATALEN><DATA>[\0<ERROR>]

Above, the part in square brackets is optional and is only sent for not supported items.

For example, for supported items:

  1. Server opens a TCP connection
  2. Server sends <HEADER><DATALEN>agent.ping
  3. Agent reads the request and responds with <HEADER><DATALEN>1
  4. Server processes data to get the value, '1' in our case
  5. TCP connection is closed

For not supported items:

  1. Server opens a TCP connection
  2. Server sends <HEADER><DATALEN>vfs.fs.size[/nono]
  3. Agent reads the request and responds with <HEADER><DATALEN>ZBX_NOTSUPPORTED\0Cannot obtain filesystem information: [2] No such file or directory
  4. Server processes data, changes item state to not supported with the specified error message
  5. TCP connection is closed
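Step 4 of both examples comes down to checking whether the data starts with ZBX_NOTSUPPORTED followed by a NUL byte. A minimal Python sketch of that check follows; `parse_passive_data` is a hypothetical name, and UTF-8 decoding is an assumption.

```python
from typing import Tuple

NOTSUPPORTED = b"ZBX_NOTSUPPORTED"


def parse_passive_data(data: bytes) -> Tuple[bool, str]:
    """Return (supported, text): the value, or the error for a not supported item."""
    marker, _, error = data.partition(b"\x00")
    if marker == NOTSUPPORTED:
        # The optional part after the NUL byte carries the error message.
        return False, error.decode("utf-8")
    return True, data.decode("utf-8")
```

For the two examples above this yields `(True, "1")` for agent.ping and `(False, "Cannot obtain filesystem information: [2] No such file or directory")` for vfs.fs.size[/nono].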

Active checks

Active checks require more complex processing. The agent must first retrieve from the server(s) a list of items for independent processing.

The servers to get the active checks from are listed in the 'ServerActive' parameter of the agent configuration file. The frequency of asking for these checks is set by the 'RefreshActiveChecks' parameter in the same configuration file. However, if refreshing active checks fails, it is retried after a hardcoded 60 seconds.
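The refresh rule described above can be written as a small Python function. This is only an illustration of the rule as stated here, not the agent's actual scheduler; `next_refresh_delay` is a hypothetical name.

```python
RETRY_AFTER_FAILURE = 60  # seconds; the hardcoded retry interval named above


def next_refresh_delay(last_refresh_ok: bool, refresh_active_checks: int) -> int:
    """Seconds to wait before asking the server for the list of checks again."""
    if last_refresh_ok:
        return refresh_active_checks  # value of the 'RefreshActiveChecks' parameter
    return RETRY_AFTER_FAILURE
```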

The agent then periodically sends the new values to the server(s).

Getting the list of items

Agent request

<HEADER><DATALEN>{
           "request":"active checks",
           "host":"<hostname>"
       }
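The agent request can be assembled as in the following Python sketch. It assumes the framing defined for this protocol, a "ZBXD\x01" header followed by an 8-byte little-endian data length; `active_checks_request` is a hypothetical helper name.

```python
import json
import struct


def active_checks_request(hostname: str) -> bytes:
    """Build the framed 'active checks' request for the given host name."""
    body = json.dumps({"request": "active checks", "host": hostname}).encode("utf-8")
    # <HEADER> + <DATALEN> + JSON body
    return b"ZBXD\x01" + struct.pack("<Q", len(body)) + body
```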

Server response

<HEADER><DATALEN>{
           "response":"success",
           "data":[
               {
                   "key":"log[/home/zabbix/logs/zabbix_agentd.log]",
                   "delay":30,
                   "lastlogsize":0,
                   "mtime":0
               },
               {
                   "key":"agent.version",
                   "delay":600,
                   "lastlogsize":0,
                   "mtime":0
               },
               {
                   "key":"vfs.fs.size[/nono]",
                   "delay":600,
                   "lastlogsize":0,
                   "mtime":0
               }
           ]
       }

The server must respond with success. For each returned item, all properties key, delay, lastlogsize and mtime must exist, regardless of whether the item is a log item or not.
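The two requirements above can be checked on the agent side, for example as in this Python sketch. `parse_active_checks` and `REQUIRED` are hypothetical names, and the input is the JSON body with the header and data length already stripped.

```python
import json

REQUIRED = ("key", "delay", "lastlogsize", "mtime")


def parse_active_checks(body: str) -> list:
    """Return the item list, or raise if the response breaks the rules above."""
    reply = json.loads(body)
    if reply.get("response") != "success":
        raise ValueError("server did not respond with success")
    items = reply.get("data", [])
    for item in items:
        missing = [name for name in REQUIRED if name not in item]
        if missing:
            raise ValueError(f"item is missing properties: {missing}")
    return items
```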

For example:

  1. Agent opens a TCP connection
  2. Agent asks for the list of checks
  3. Server responds with a list of items (item key, delay)
  4. Agent parses the response
  5. TCP connection is closed
  6. Agent starts periodic collection of data

Note that (sensitive) configuration data may become available to parties having access to the Zabbix server trapper port when using an active check. This is possible because anyone may pretend to be an active agent and request item configuration data; authentication does not take place unless you use encryption options.

Sending in collected data

Agent sends

<HEADER><DATALEN>{
           "request":"agent data",
           "data":[
               {
                   "host":"<hostname>",
                   "key":"agent.version",
                   "value":"2.4.0",
                   "clock":1400675595,            
                   "ns":76808644
               },
               {
                   "host":"<hostname>",
                   "key":"log[/home/zabbix/logs/zabbix_agentd.log]",
                   "lastlogsize":112,
                   "value":" 19845:20140621:141708.521 Starting Zabbix Agent [<hostname>]. Zabbix 2.4.0 (revision 50000).",
                   "clock":1400675595,            
                   "ns":77053975
               },
               {
                   "host":"<hostname>",
                   "key":"vfs.fs.size[/nono]",
                   "state":1,
                   "value":"Cannot obtain filesystem information: [2] No such file or directory",
                   "clock":1400675595,            
                   "ns":78154128
               }
           ],
           "clock": 1400675595,
           "ns": 78211329
       }

Server response

<HEADER><DATALEN>{
           "response":"success",
           "info":"processed: 3; failed: 0; total: 3; seconds spent: 0.003534"
       }
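The counters in the "info" string can be extracted with a few lines of Python. This sketch assumes the string always follows the "name: value; name: value" layout seen in the example, which is the only form shown here; `parse_info` is a hypothetical name.

```python
def parse_info(info: str) -> dict:
    """Split 'processed: 3; failed: 0; ...' into a dict of numbers."""
    stats = {}
    for part in info.split(";"):
        name, _, value = part.partition(":")
        value = value.strip()
        # Counters are integers; "seconds spent" is a decimal fraction.
        stats[name.strip()] = int(value) if value.isdigit() else float(value)
    return stats
```

A non-zero "failed" counter is the only hint the agent gets that some values were rejected.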

If sending some values fails on the server (for example, because the host or item has been disabled or deleted), the agent will not retry sending those values.

For example:

  1. Agent opens a TCP connection
  2. Agent sends a list of values
  3. Server processes the data and sends the status back
  4. TCP connection is closed

Note how in the example above the not supported status for vfs.fs.size[/nono] is indicated by the "state" value of 1 and the error message in the "value" property.
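A Python sketch of how such entries could be assembled is given below. `value_entry` and `agent_data` are hypothetical helper names, and splitting a nanosecond timestamp into "clock" and "ns" is an assumption based on the field values in the example.

```python
import json
import time

NS_PER_SECOND = 1_000_000_000


def value_entry(host, key, value, supported=True, now_ns=None):
    """One element of "data"; a not supported item gets "state": 1."""
    now_ns = time.time_ns() if now_ns is None else now_ns
    clock, ns = divmod(now_ns, NS_PER_SECOND)
    entry = {"host": host, "key": key, "value": value, "clock": clock, "ns": ns}
    if not supported:
        entry["state"] = 1  # "value" then carries the error message
    return entry


def agent_data(entries, now_ns=None):
    """JSON body of the 'agent data' request (without header and data length)."""
    now_ns = time.time_ns() if now_ns is None else now_ns
    clock, ns = divmod(now_ns, NS_PER_SECOND)
    return json.dumps({"request": "agent data", "data": entries, "clock": clock, "ns": ns})
```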

The error message will be trimmed to 2048 symbols on the server side.

Older XML protocol

Zabbix will accept up to 16 MB of XML Base64-encoded data, but a single decoded value should be no longer than 64 KB; otherwise it will be truncated to 64 KB while decoding.