No discoveries run : LLD worker at 100% and LLD queue growing indefinitely

  • william_
    Junior Member
    • Dec 2024
    • 6

    #1

    Hello,

    Since Thursday, we’ve noticed that our discovery rules are no longer creating new items, triggers, or hosts. While looking for a cause, we saw that the item “Utilization of LLD worker internal processes” was at 100%. Moreover, the LLD queue was huge, around 30K, and growing rapidly.

    To mitigate the issue, we changed the update interval of all the discovery rules (many of them were 5 minutes or less) to 1 hour.
    Over the weekend, the queue kept growing, and on Monday it was over 80K. We restarted the Zabbix server daemon and the queue dropped to 0, but the LLD queue and LLD worker utilization climbed again.
    As I write this, the LLD queue is over 35K and the LLD worker internal process is still at 100%.
    At the same time, we saw a recurring slow query in the logs, but it might not be related to this issue:
    Code:
    2781419:20241203:074001.408 slow query: 3.169667 sec, "update ids set nextid=nextid+107 where table_name='items' and field_name='itemid'"
    2781437:20241203:074005.565 slow query: 3.838815 sec, "update ids set nextid=nextid+107 where table_name='items' and field_name='itemid'"
    2781448:20241203:074218.986 slow query: 3.254259 sec, "update ids set nextid=nextid+107 where table_name='items' and field_name='itemid'"
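    To see how often this statement shows up and how slow it gets, the log can be summarized with a short script. This is only a minimal sketch: it assumes every slow-query line follows the layout of the three lines above (PID, date, time, duration in seconds, quoted SQL), and the names SLOW_QUERY_RE and summarize_slow_queries are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed line layout, taken from the excerpt above:
#   <pid>:<YYYYMMDD>:<HHMMSS.mmm> slow query: <seconds> sec, "<sql>"
SLOW_QUERY_RE = re.compile(
    r'^\s*(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6}\.\d{3}) '
    r'slow query: (?P<seconds>\d+\.\d+) sec, "(?P<query>.*)"\s*$'
)


def summarize_slow_queries(lines):
    """Group slow-query lines by SQL text.

    Returns {sql: (count, total_seconds, max_seconds)}.
    Lines that do not match the assumed layout are skipped.
    """
    stats = defaultdict(lambda: [0, 0.0, 0.0])
    for line in lines:
        match = SLOW_QUERY_RE.match(line)
        if match is None:
            continue
        seconds = float(match.group("seconds"))
        entry = stats[match.group("query")]
        entry[0] += 1                      # occurrences
        entry[1] += seconds                # cumulative duration
        entry[2] = max(entry[2], seconds)  # worst single duration
    return {query: tuple(values) for query, values in stats.items()}
```

    The function accepts any iterable of lines, so an open handle on the server log can be passed directly. A count that keeps climbing for the same statement would suggest the slow update is tied to the LLD backlog rather than incidental.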
    Do you have any ideas on this?
    Don’t hesitate to ask if you need more details.
    Thank you for your time.
    Regards.
    --
    Tech info :
    OS (srv and DB) : RHEL 8.7
    Zabbix : 6.4.18
    PostgreSQL 14.5

    /etc/zabbix/zabbix_server.conf :
    Code:
    StartLLDProcessors=80
  • william_
    Junior Member
    • Dec 2024
    • 6

    #2
    Someone rebooted the entire server overnight, and the problem seems to be resolved.
    LLD queue is now OK (avg 0.7, max 21), worker is mostly idle and the slow queries are gone.

    I can't explain what happened, but rebooting the whole server, rather than just restarting the Zabbix server daemon, seems to have cleared something (DB connections? ghost process cache?).

    Should I mark the thread as [Resolved]?
