Zabbix not writing data to PostgreSQL DB, queue keeps growing
That's what I meant: whether this attribute is set for the SNMP interfaces.
For diagnostics, try increasing the logging level of the running server, and after a while check the log for any errors or problems.
Code:
zabbix_server -R log_level_increase=poller
From the zabbix_server and psql config excerpts it follows that the IP stack and the localhost address are used to talk to the database... Maybe it makes sense to switch to communication over a UNIX socket?
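A minimal sketch of what that switch could look like, assuming a DB user named zabbix and the distribution's default socket directory (both are assumptions, not taken from this thread):
Code:
# /etc/zabbix/zabbix_server.conf
# for PostgreSQL, an empty DBHost makes the server connect over the local UNIX socket
DBHost=
DBName=zabbix
DBUser=zabbix

# pg_hba.conf (path depends on the distribution)
# allow the zabbix role to authenticate over the socket
local   zabbix   zabbix   md5
Both zabbix_server and PostgreSQL need a restart/reload afterwards.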
P.S. I will never believe that there are no errors in the log with such a queue! I'm sure there are reports of hosts being unavailable.
And by the way, based on what information did you conclude that the database performance is insufficient?
I run
/usr/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf -R log_level_decrease="unreachable poller"
/usr/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf -R log_level_decrease="poller"
/usr/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf -R log_level_decrease="discoverer"
I have a high busy % on "unreachable poller", "poller" and "discoverer", but no CPU load.
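For reference, those busy percentages can also be tracked with standard Zabbix internal items on the server (a sketch; the averaging mode is just a suggestion):
Code:
zabbix[process,poller,avg,busy]
zabbix[process,unreachable poller,avg,busy]
zabbix[process,discoverer,avg,busy]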
LOG FILE: 200Mb https://drive.google.com/open?id=1QS...C57p7kORAa7vEO
Please help me find the bottleneck,
because I can't see it :-(
If TimescaleDB is used - disable housekeeping!
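If you decide to do that, a minimal way to switch the housekeeper off is the server config (this stops the housekeeper process entirely; overriding the item history/trend periods in the frontend is a separate step and an assumption about your setup):
Code:
# /etc/zabbix/zabbix_server.conf
# 0 disables automatic housekeeping; it can then only be triggered manually
HousekeepingFrequency=0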
Indeed, there are slow queries - that is not normal.
And the worst part is the messages "unreachable poller #XX [got Y values in Z sec...". That is what needs to be dealt with... I advise you to disable debugging and enable it only for one poller, then look at its errors in more depth.
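Something like this should do it with the runtime control option already used above (the process number 1 is only an example):
Code:
zabbix_server -R log_level_decrease="unreachable poller"
zabbix_server -R log_level_increase="unreachable poller,1"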
UPD: don't disable HK - it works nicely with TSDB now!
I tried to execute the query that runs slowly on your system on my own system, but I could not, because the table schema is different! You had a message about updating the system to 4.4.5 - so what version is it now?
And the general recommendation is to upgrade to the current branch - 4.4...
Our slow query is described in a bug report;
in our case this query returns 683679 rows.
"You had a message about updating the system to 4.4.5 - and now what version is it?" alredy upgrated to 4.4.6 :-(
zabbix=> explain analyze
zabbix-> select i.itemid,i.hostid,i.status,i.type,i.value_type,i.key_,i.snmp_community,i.snmp_oid,i.port,i.snmpv3_securityname,i.snmpv3_securitylevel,i.snmpv3_authpassphrase,i.snmpv3_privpassphrase,i.ipmi_sensor,i.delay,i.trapper_hosts,i.logtimefmt,i.params,ir.state,i.authtype,i.username,i.password,i.publickey,i.privatekey,i.flags,i.interfaceid,i.snmpv3_authprotocol,i.snmpv3_privprotocol,i.snmpv3_contextname,ir.lastlogsize,ir.mtime,i.history,i.trends,i.inventory_link,i.valuemapid,i.units,ir.error,i.jmx_endpoint,i.master_itemid,i.timeout,i.url,i.query_fields,i.posts,i.status_codes,i.follow_redirects,i.post_type,i.http_proxy,i.headers,i.retrieve_mode,i.request_method,i.output_format,i.ssl_cert_file,i.ssl_key_file,i.ssl_key_password,i.verify_peer,i.verify_host,i.allow_traps,i.templateid,id.parent_itemid from items i inner join hosts h on i.hostid=h.hostid left join item_discovery id on i.itemid=id.itemid join item_rtdata ir on i.itemid=ir.itemid where h.status in (0,1) and i.flags<>2;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=110.35..108569.35 rows=595249 width=252) (actual time=0.755..2047.580 rows=683653 loops=1)
Hash Cond: (i.hostid = h.hostid)
-> Merge Left Join (cost=1.54..106733.79 rows=656006 width=252) (actual time=0.061..1757.813 rows=683653 loops=1)
Merge Cond: (i.itemid = id.itemid)
-> Merge Join (cost=1.11..69969.33 rows=656006 width=244) (actual time=0.034..1023.994 rows=683653 loops=1)
Merge Cond: (i.itemid = ir.itemid)
-> Index Scan using items_pkey on items i (cost=0.42..43460.18 rows=686284 width=230) (actual time=0.013..357.460 rows=685703 loops=1)
Filter: (flags <> 2)
Rows Removed by Filter: 33354
-> Index Scan using item_rtdata_pkey on item_rtdata ir (cost=0.42..17714.01 rows=689695 width=22) (actual time=0.007..132.160 rows=683653 loops=1)
-> Index Only Scan using item_discovery_1 on item_discovery id (cost=0.42..29648.88 rows=704873 width=16) (actual time=0.024..357.422 rows=704873 loops=1)
Heap Fetches: 699022
-> Hash (cost=91.30..91.30 rows=1401 width=8) (actual time=0.670..0.670 rows=1401 loops=1)
Buckets: 2048 Batches: 1 Memory Usage: 71kB
-> Seq Scan on hosts h (cost=0.00..91.30 rows=1401 width=8) (actual time=0.011..0.429 rows=1401 loops=1)
Filter: (status = ANY ('{0,1}'::integer[]))
Rows Removed by Filter: 143
Planning Time: 2.355 ms
Execution Time: 2074.530 ms
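One detail that stands out in this plan is Heap Fetches: 699022 on the index-only scan over item_discovery, which usually means the table's visibility map is stale; whether vacuuming actually helps here is an assumption, not something confirmed in the thread:
Code:
VACUUM (VERBOSE, ANALYZE) item_discovery;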
Number of processed not supported values per second: 130.4489 (26.02.2020 15:42:58)
Not supported items: 59020
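If you want to double-check that not-supported count straight from the database, here is a hedged sketch against the 4.4 schema (state = 1 meaning "not supported" is an assumption taken from the Zabbix item state constants):
Code:
-- count items currently flagged as not supported
select count(*) from item_rtdata where state = 1;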
The queue for SNMP requests is the difference between the number of requested item values and the number of responses received. A large queue means a lot of data was requested and little was received.
Why is that? That is what we are trying to figure out... The network may be poorly designed, or the network stack may be overloaded on the server itself.
Here are the specific steps I advise you to take (a config sketch follows the list):
1) switch the database connection to sockets
2) disable housekeeping
3) increase the number of poller and unreachable poller processes
4) try to move some devices to polling through a proxy
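A hedged sketch of what steps 1 and 3 could look like in zabbix_server.conf; the numbers are placeholders, not sized recommendations:
Code:
# step 1: empty DBHost = connect to PostgreSQL over the UNIX socket
DBHost=
# step 3: more pollers for regular and unreachable hosts (placeholder values)
StartPollers=100
StartPollersUnreachable=20
The server has to be restarted for these parameters to take effect.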
Hello all,
A few months ago I installed a large Zabbix setup (3 application servers with Zabbix v5.0, 1 database server with PostgreSQL v11 and 3 proxies).
The system generally runs at about 500 nvps and the proxies at almost 70-80 nvps.
Unfortunately I have almost the same problem. The proxy queues are growing and I can't find the reason.
For each of the proxies at least 1000 to 2000 waiting items are showing up.
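One way to see whether the values are piling up on the proxy side is the standard internal item on each proxy, which reports how many collected values are still waiting to be uploaded to the server (monitoring it on every proxy is only a suggestion):
Code:
zabbix[proxy_history]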