Hi guys,
First time poster here. I've spent about a week trying to figure this out. Long story short: every 12 hours, Zabbix locks up while a massive number of reads hit the database. I've been unable to determine what causes this. My first suspicion was the housekeeper, since from what I've read that's a common culprit, although it seemed odd that this doesn't happen every time the housekeeper runs, only once every 12 hours. Anyway, I temporarily disabled the housekeeper for history and trends as a test, and the problem continues.
I'll try to post everything that may be relevant, but feel free to ask for anything else I can provide.
CentOS 8.3.2011 VM on ESXi
4 CPU cores
8 GB RAM
200 GB drive on SSD backed SAN Datastore
I'm attaching various graphs from Zabbix that show what's going on, both over the course of a couple of days and in detail during the issue. Roughly every 12 hours there's a massive spike in read I/O for 5-10 minutes that causes nodata() triggers to fire and the web UI to become unresponsive. The Zabbix logs also show slow insert operations to the database. The Zabbix process that is pegged during this is the history syncer. From my understanding, though, the history syncer should be writing history to the database, not reading.
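For anyone who wants to check the timing against the log rather than the graphs, here is a rough sketch of how the slow query entries could be counted per hour. It assumes the server log lines look like `<pid>:<YYYYMMDD>:<HHMMSS.mmm> slow query: <seconds> sec, "..."`; that shape and the sample lines are my assumptions for illustration, not copied from the attached log, and the function name is made up.

```python
import re
from collections import Counter

# Assumed line shape (not verified against the attached log):
#   "<pid>:<YYYYMMDD>:<HHMMSS.mmm> slow query: <seconds> sec, ..."
SLOW_QUERY = re.compile(
    r"^\s*\d+:(\d{8}):(\d{2})\d{4}\.\d{3} slow query: ([\d.]+) sec"
)


def slow_queries_per_hour(lines):
    """Count slow-query log lines per (date, hour) bucket."""
    counts = Counter()
    for line in lines:
        match = SLOW_QUERY.match(line)
        if match:
            date, hour, _seconds = match.groups()
            counts[(date, hour)] += 1
    return counts


# Made-up sample lines for illustration only.
sample = [
    ' 1201:20210125:060112.345 slow query: 12.402113 sec, "insert into history ..."',
    ' 1201:20210125:060145.101 slow query: 9.870001 sec, "insert into history_uint ..."',
    ' 1202:20210125:180107.220 slow query: 15.000321 sec, "insert into history ..."',
    ' 1199:20210125:180200.000 some unrelated log message',
]

for (date, hour), count in sorted(slow_queries_per_hour(sample).items()):
    print(f"{date} {hour}:00  {count} slow queries")
```

To run it on a real log, pass an open file object instead of `sample`. If the busy hours come out 12 hours apart, that would at least confirm the interval independently of the graphs.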
Performance is absolutely fine except during these events. Does the history syncer do something every 12 hours that it isn't doing all the time? I was considering partitioning the database, but we'd like to keep history for different items for different periods, which I understand is not an option with partitioning. Since the same problem occurs with the housekeeper disabled, I'm skeptical that it's the cause anyway.
Zabbix/MySQL Versions:
Code:
$ dnf list installed | grep zabbix
fping.x86_64 3.16-1.el8 @zabbix-non-supported
zabbix-agent.x86_64 5.2.4-1.el8 @zabbix
zabbix-apache-conf.noarch 5.2.4-1.el8 @zabbix
zabbix-release.noarch 5.2-1.el8 @System
zabbix-server-mysql.x86_64 5.2.4-1.el8 @zabbix
zabbix-web.noarch 5.2.4-1.el8 @zabbix
zabbix-web-deps.noarch 5.2.4-1.el8 @zabbix
zabbix-web-mysql.noarch 5.2.4-1.el8 @zabbix
$ dnf list installed | grep mariadb
mariadb.x86_64 3:10.3.27-3.module_el8.3.0+599+c587b2e7 @appstream
mariadb-backup.x86_64 3:10.3.27-3.module_el8.3.0+599+c587b2e7 @appstream
mariadb-common.x86_64 3:10.3.27-3.module_el8.3.0+599+c587b2e7 @appstream
mariadb-connector-c.x86_64 3.1.11-2.el8_3 @appstream
mariadb-connector-c-config.noarch 3.1.11-2.el8_3 @appstream
mariadb-errmsg.x86_64 3:10.3.27-3.module_el8.3.0+599+c587b2e7 @appstream
mariadb-gssapi-server.x86_64 3:10.3.27-3.module_el8.3.0+599+c587b2e7 @appstream
mariadb-server.x86_64 3:10.3.27-3.module_el8.3.0+599+c587b2e7 @appstream
mariadb-server-utils.x86_64 3:10.3.27-3.module_el8.3.0+599+c587b2e7 @appstream
The database is 27 G
Code:
$ sudo du -sh * /var/lib/mysql | grep zabbix
27G zabbix
I don't think it's relevant, but just in case, this started happening on Jan 18th, which was after an upgrade of various packages (see attached "OS Upgrade on 18th.txt" for dnf/yum history of update).
Also attaching MySQL log, which definitely shows contention during the event. Zabbix is the only database present.
I would appreciate any help or insight on things to try.
Thanks