I have a Zabbix server running locally, and a Zabbix proxy and agents (passive mode) installed in my Kubernetes cluster using the Helm chart (everything is version 7.2).
Currently, about 5000 items are collected through the proxy.
The problem is that the proxy queue keeps filling up after some time, so I stop getting any values from the cluster. When I restart the proxy pod, the problem is gone for 18 - 48 hours, then the queue starts filling again. I have checked the poller utilization and the cache sizes, but I cannot find the problem. Everything else on the Zabbix server works fine; the queue only fills up for the proxy.
I also checked the server and proxy logs, but found nothing.
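For completeness, the restart workaround and the log check are nothing special, just plain kubectl against the proxy pod (namespace and pod names are placeholders):
Code:
kubectl -n <namespace> delete pod <zabbix-proxy-pod>        # workaround: the queue drains again once the new pod is up
kubectl -n <namespace> logs <zabbix-proxy-pod> --tail=500   # nothing suspicious here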
Here are some proxy statistics from the moment the queue started filling:
Utilization of agent poller data collector processes: 1.2%
Utilization of browser poller data collector processes: 0%
Utilization of http agent poller data collector processes: 0.17%
Utilization of http poller data collector processes: 0.17%
Utilization of internal poller data collector processes: 0.02%
Utilization of ODBC poller data collector processes: 0%
Utilization of poller data collector processes: 0.002%
Utilization of snmp poller data collector processes: 0%
Utilization of unreachable poller data collector processes: 0%
Configuration cache: 16.6%
History index cache: 0.36%
History write cache: 0%
Utilization of trapper data collector processes: 1.5%
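These numbers come from the usual internal items on the proxy host, roughly:
Code:
zabbix[process,agent poller,avg,busy]
zabbix[process,trapper,avg,busy]
zabbix[rcache,buffer,pused]    # configuration cache
zabbix[wcache,index,pused]     # history index cache
zabbix[wcache,history,pused]   # history write cache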
And some statistics from the server:
Utilization of agent poller data collector processes: 17.67%
Utilization of browser poller data collector processes: 0%
Utilization of history poller internal processes: 0.14%
Utilization of http agent poller data collector processes: 6.2%
Utilization of http poller data collector processes: 8.7%
Utilization of internal poller data collector processes: 16%
Utilization of ODBC poller data collector processes: 0%
Utilization of poller data collector processes: 0.1%
Utilization of proxy poller data collector processes: 12.7%
Utilization of snmp poller data collector processes: 0%
Utilization of unreachable poller data collector processes: 0%
Utilization of trapper data collector processes: 0.02%
Configuration cache: 58.8%
History index cache: 10.5%
History write cache: 5.5%
Trend function cache: 0%
Trend write cache: 0%
Value cache: 28.7%
Value cache misses: 0
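When the queue is growing again I can also dump internal diagnostics directly from the proxy pod (pod/namespace are placeholders; this assumes the default config path inside the image):
Code:
kubectl -n <namespace> exec <zabbix-proxy-pod> -- zabbix_proxy -R diaginfo
kubectl -n <namespace> exec <zabbix-proxy-pod> -- zabbix_proxy -R log_level_increase=trapper   # temporarily more verbose trapper logging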
My values.yaml looks like this:
Code:
kube-state-metrics:
  enabled: false
zabbixAgent:
  image:
    repository: zabbix/zabbix-agent2
    tag: alpine-7.2.0
  extraVolumeMounts:
    - name: <hidden>
      mountPath: /var/lib/zabbix/enc
      readOnly: true
  extraVolumes:
    - name: <hidden>
      secret:
        secretName: <hidden>
  env:
    - name: ZBX_PASSIVE_ALLOW
      value: "true"
    - name: ZBX_PASSIVESERVERS
      value: 0.0.0.0/0
    - name: ZBX_ACTIVE_ALLOW
      value: "false"
    - name: ZBX_TLSCONNECT
      value: cert
    - name: ZBX_TLSACCEPT
      value: cert
    - name: ZBX_TLSCAFILE
      value: <hidden>
    - name: ZBX_TLSSERVERCERTISSUER
      value: <hidden>
    - name: ZBX_TLSSERVERCERTSUBJECT
      value: <hidden>
    - name: ZBX_TLSCERTFILE
      value: <hidden>
    - name: ZBX_TLSKEYFILE
      value: <hidden>
zabbixProxy:
  enabled: true
  resources: {}
  image:
    repository: zabbix/zabbix-proxy-sqlite3
    tag: alpine-7.2.0
  service:
    type: LoadBalancer
    annotations:
      <hidden>
  extraVolumeMounts:
    - name: <hidden>
      mountPath: /var/lib/zabbix/enc
      readOnly: true
  extraVolumes:
    - name: <hidden>
      secret:
        secretName: <hidden>
  env:
    - name: ZBX_HOSTNAME
      value: zabbix-proxy
    - name: ZBX_PROXYMODE
      value: "1" # passive proxy
    - name: ZBX_SERVER_HOST
      value: <hidden>
    - name: ZBX_TLSCONNECT
      value: cert
    - name: ZBX_TLSACCEPT
      value: cert
    - name: ZBX_TLSCAFILE
      value: <hidden>
    - name: ZBX_TLSSERVERCERTISSUER
      value: <hidden>
    - name: ZBX_TLSSERVERCERTSUBJECT
      value: <hidden>
    - name: ZBX_TLSCERTFILE
      value: <hidden>
    - name: ZBX_TLSKEYFILE
      value: <hidden>
    - name: ZBX_STARTPOLLERS
      value: "10"
    - name: ZBX_STARTHTTPPOLLERS
      value: "25"
    - name: ZBX_STARTHISTORYPOLLERS
      value: "15"
    - name: ZBX_STARTPOLLERSUNREACHABLE
      value: "2"
    - name: ZBX_CACHESIZE
      value: 32M
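If it helps, the effective proxy configuration (what the entrypoint writes from those ZBX_* variables) can be checked from inside the pod; names are placeholders and I am assuming the image's default config path:
Code:
kubectl -n <namespace> exec <zabbix-proxy-pod> -- grep -Ev '^(#|$)' /etc/zabbix/zabbix_proxy.conf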