Ad Widget

**tim.mooney** · 16-07-2020, 21:50

I haven't upgraded our site to Zabbix 5.x yet, mainly because any major upgrade always has unexpected issues and I wanted to wait for other people to find and report those. I initially thought I would wait for a 5.0.3 or 5.0.4 release before updating, but I may wait a little longer than that.

I have however already upgraded our Zabbix 4.4.x install to use MariaDB 10.2.x (on RHEL 7.x), as part of my prep-work for getting ready for Zabbix 5.x. I also fixed up some issues with the schema that had been introduced by product updates over the years, and I converted all of the tables to use the new row_format=DYNAMIC. At least for my site's relatively small Zabbix install, the move from MariaDB 5.5.x to 10.2.x hasn't introduced any performance issues that I've noticed. It's possible that I just haven't noticed them because my site's DB is lightly loaded, but I think it's more likely that 10.2.x is working just fine with Zabbix 4.4.x.

There are certainly big changes between MariaDB 5.5 and MariaDB 10.x, but most of the reports I've seen, including comments from Zabbix developers, are that MariaDB 10.x has some beneficial performance enhancements over MariaDB 5.x. There can always be outliers, but the general impression I've gotten from the research I've done indicates that MariaDB 10.x performance should be the same or slightly better in many cases, and potentially much better in a few very specific cases.

Since 5.x was released, there have been a few reports on these forums about performance issues with 5.x. I don't know this for certain (again: not running Zabbix 5.x yet), but there have been enough reports here that I'm suspicious that there are some query issues that are new with 5.x. If my memory serves, I think some of the posts about performance issues were with MySQL/MariaDB as a backend, and some were with PostgreSQL as a backend, so if that's true, that furthers my guess that there may be some query optimization or changes that are still needed for the new release.

It's totally possible that the issues you're seeing are indeed something specific to your install or some bit of misconfiguration with your MariaDB install, but my speculation at this point is that the issue is something specific to Zabbix 5.x.

**jameskirsop** · 17-07-2020, 01:14

Thanks, Tim. Your whole post is rather reassuring. It's good to know that you're on 4.4 with MDB 10.2 without any issues.

Originally posted by tim.mooney

I have however already upgraded our Zabbix 4.4.x install to use MariaDB 10.2.x (on RHEL 7.x), as part of my prep-work for getting ready for Zabbix 5.x. I also fixed up some issues with the schema that had been introduced by product updates over the years, and I converted all of the tables to use the new row_format=DYNAMIC.

Do you have a series of specific steps / queries that you ran to make these changes?? That might help in getting some of my issues resolved and make Zabbix more efficient.

One thing I forgot to mention is the collation of my tables doesn't match the one supported by Zabbix, but I've not had any prior performance problems because of that so I've not been too concerned to address it until now. The specific error message is:

Code:

53633:20200716:134608.250 Zabbix supports only "utf8_bin" collation. Database "zabbix_prod" has default collation "utf8_general_ci"
53633:20200716:134608.293 character set name or collation name that is not supported by Zabbix found in 421 column(s) of database "zabbix_prod"
53633:20200716:134608.294 only character set "utf8" and collation "utf8_bin" should be used in database

I've just also noticed some 'query failed: [1205] Lock wait timeout exceeded' errors in the Zabbix log (often happening on the hosts table and early in the morning, so not when I've noticed issues when at my desk). Given the times of the errors don't usually correlate to when I'm seeing other issues, I'm not convinced they're related.

**tim.mooney** · 17-07-2020, 03:14

Originally posted by jameskirsop

One thing I forgot to mention is the collation of my tables doesn't match the one supported by Zabbix, but I've not had any prior performance problems because of that so I've not been too concerned to address it until now. The specific error message is:

Code:

53633:20200716:134608.250 Zabbix supports only "utf8_bin" collation. Database "zabbix_prod" has default collation "utf8_general_ci"
53633:20200716:134608.293 character set name or collation name that is not supported by Zabbix found in 421 column(s) of database "zabbix_prod"
53633:20200716:134608.294 only character set "utf8" and collation "utf8_bin" should be used in database

The collation requirements were more strictly spelled out starting with the 3.2 upgrade notes (see the section about "Case Sensitive MySQL"), and the software started warning about incorrect collation late in the 4.4.x series. It was when I was preparing to go from 3.0 through 3.2 to 4.2 that I started the process of fixing the collation of my install. I used an earlier version of the procedure from https://support.zabbix.com/browse/ZBX-17357 . Unfortunately, the early versions of that procedure had some serious problems (loss of default values for a bunch of columns because of an error in the process), and I discovered that after upgrading to 4.2.x. That's what got me looking closely at my database schema, and it's when I discovered that because my schema had been upgraded many times over the years, it had accumulated some "cruft".

I think the procedure described at the end of the ZBX-17357 is now pretty good for fixing an existing collation, though last I heard, there were still some private or hidden comments on that issue that were causing some people to wonder if there might still be problems with the procedure.

Originally posted by jameskirsop

Do you have a series of specific steps / queries that you ran to make these changes?? That might help in getting some of my issues resolved and make Zabbix more efficient.

I'm not suggesting that any of the schema differences my site had or that your site might have are causing the performance issues you're seeing. I think that's kind of unlikely, though not impossible. I did my investigation and cleanup to make my site's database as close to "pristine" as possible, with the belief that it would probably improve my chances of a successful 4.4.x to 5.x upgrade, and that it would also potentially improve the success rate for my future Zabbix upgrades too.

I've posted a general overview of what I did to clean up my DB schema in a couple of posts,

We’ll be back soon!

https://www.zabbix.com/forum/zabbix-troubleshooting-and-problems/404693-database-upgrade-fail-4-2-3-to-4-4-10-or-5-0-1-same-result?p=404835#post404835

We’ll be back soon!

https://www.zabbix.com/forum/zabbix-troubleshooting-and-problems/404273-zabbix-server-4-4-8-to-5-0-database-migration-failure?p=404322#post404322

We’ll be back soon!

https://www.zabbix.com/forum/zabbix-troubleshooting-and-problems/403139-upgrade-database-failed-after-ugrade-to-zabbix-5?p=403201#post403201

It's basically "create fresh schema of the exact same version on a different host and then diff them" followed by "fix any differences".

Note that the absolute easiest way to ensure a correct schema would be to create a new, fresh database (with the correct charset and collation) and then import your data (making sure to not import any schema statements that would recreate the old schema) from your existing database. I didn't take that approach because I believed by fixing the existing database I would have less downtime. That probably turned out to be true, but it was probably closer in downtime to a full dump & reload than I was expecting. Probably the biggest benefit to the route I chose is that I learned quite a bit about recent MariaDB and I improved my knowledge of the Zabbix schema. It wasn't a "simple" procedure, but I learned a bunch of stuff in the process that may be generally useful for me with other projects, so I'm still happy with the choice.

**jameskirsop** · 20-07-2020, 08:59

Originally posted by tim.mooney

Since 5.x was released, there have been a few reports on these forums about performance issues with 5.x. I don't know this for certain (again: not running Zabbix 5.x yet), but there have been enough reports here that I'm suspicious that there are some query issues that are new with 5.x. If my memory serves, I think some of the posts about performance issues were with MySQL/MariaDB as a backend, and some were with PostgreSQL as a backend, so if that's true, that furthers my guess that there may be some query optimization or changes that are still needed for the new release.

I'm now wondering why Zabbix is trying to run multiple queries several times. This row in the mariadb process list shows up 5 times,
| 251673 | zabbix_server | localhost | zabbix_prod | Query | 26 | updating | DELETE FROM sessions WHERE status='1' AND userid='5' | 0.000 |

I'm also seeing periods where there are multiple identical updates / inserts into the sessions table when I try and log in.

Next step is to try and find some ways of reliably reproducing and logging a bug (or several...)

**jameskirsop** · 24-07-2020, 03:36

tim.mooney do you happen to remember what schema differences you found when doing your clean up?

Apart from the COMPACT vs DYNAMIC row types, and the collation changes, I've done a quick scan through mysqldump'ed schema and haven't been able to find any discrepancies. Unfortunately a diff won't work because the outputs of mysql dump and the 5.0.2 schema generated via a zcat create.sql.gz > filename.sql don't resemble each other particularly well (constraints are in a different location in the file, line breaks are different etc)

In my scan I took a look at the larger tables in my instance (history_uint/history_text/trends_uint etc) and their constraints + some other tables such as users and groups.

**tim.mooney** · 24-07-2020, 05:01

Originally posted by jameskirsop

tim.mooney do you happen to remember what schema differences you found when doing your clean up?

The process started because I had to fix up all the columns that had lost their DEFAULT value because of the wayward procedure to convert the COLLATIONs. You won't have that issue because you haven't done the conversion yet and presumably the procedure you'll use will work much better.

The other difference was duplicate indexes (KEYs) in some tables that did not appear in a freshly-created schema. I'm not certain whether these were causing any issues or not, but they were present in my much-upgraded schema and not present in a freshly-created schema, so I got rid of them. I still had some CONSTRAINTs in my schema that were dumped in a different order than the same CONSTRAINT on a new table, but that's not a difference that's important.

For example, here's how a freshly-created `drules` table (at 4.4.7, the version I'm currently running) looks:

Code:

DROP TABLE IF EXISTS `drules`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `drules` (
`druleid` bigint(20) unsigned NOT NULL,
`proxy_hostid` bigint(20) unsigned DEFAULT NULL,
`name` varchar(255) COLLATE utf8_bin NOT NULL DEFAULT '',
`iprange` varchar(2048) COLLATE utf8_bin NOT NULL DEFAULT '',
`delay` varchar(255) COLLATE utf8_bin NOT NULL DEFAULT '1h',
`nextcheck` int(11) NOT NULL DEFAULT 0,
`status` int(11) NOT NULL DEFAULT 0,
PRIMARY KEY (`druleid`),
UNIQUE KEY `drules_2` (`name`),
KEY `drules_1` (`proxy_hostid`),
CONSTRAINT `c_drules_1` FOREIGN KEY (`proxy_hostid`) REFERENCES `hosts` (`hostid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
/*!40101 SET character_set_client = @saved_cs_client */;

Here's how my much-upgraded `drules` looked, before I got rid of the extra indexes:

Code:

DROP TABLE IF EXISTS `drules`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `drules` (
`druleid` bigint(20) unsigned NOT NULL,
`proxy_hostid` bigint(20) unsigned DEFAULT NULL,
`name` varchar(255) COLLATE utf8_bin NOT NULL,
`iprange` varchar(2048) COLLATE utf8_bin NOT NULL,
`delay` varchar(255) COLLATE utf8_bin NOT NULL DEFAULT '1h',
`nextcheck` int(11) NOT NULL DEFAULT 0,
`status` int(11) NOT NULL DEFAULT 0,
PRIMARY KEY (`druleid`),
UNIQUE KEY `drules_2` (`name`),
KEY `c_drules_1` (`proxy_hostid`),
KEY `drules_1` (`proxy_hostid`),
CONSTRAINT `c_drules_1` FOREIGN KEY (`proxy_hostid`) REFERENCES `hosts` (`hostid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
/*!40101 SET character_set_client = @saved_cs_client */;

Note the extra "c_drules_1" KEY in mine, vs. the freshly created? That matches up with the FOREIGN KEY CONSTRAINT `c_drules_1`, and indeed FOREIGN KEY CONSTRAINTs must be indexed, but I had 2 indexes (KEYs) for the same column. I think the "c_drules_1" was an earlier auto-generated KEY. Either way, it's not present in the fresh schema, so I got rid of it.

Across all the tables in my database, there were 42 cases of these duplicate KEY (index) statements that didn't appear in a fresh schema.

**kitadisa** · 26-07-2020, 12:23

Originally posted by jameskirsop

tim.mooney do you happen to remember what schema differences you found when doing your clean up?

Apart from the COMPACT vs DYNAMIC row types, and the collation changes, I've done a quick scan through mysqldump'ed schema and haven't been able to find any discrepancies. Unfortunately a diff won't work because the outputs of mysql dump and the 5.0.2 schema generated via a zcat create.sql.gz > filename.sql don't resemble each other particularly well (constraints are in a different location in the file, line breaks are different etc)

In my scan I took a look at the larger tables in my instance (history_uint/history_text/trends_uint etc) and their constraints + some other tables such as users and groups.

Thanks, Tim. Your whole post is rather reassuring. It's good to know that you're on 4.4 with MDB 10.2 without any issues.

**jameskirsop** · 31-07-2020, 02:12

An update, I've managed to isolate the long running queries that I mentioned in my original post to an issue with the Problems widget. This has been reported here: https://support.zabbix.com/browse/ZBX-17866

Sadly the issue is marked as 'trivial' even though it actually severely degrades the performance of my Zabbix instance rendering it unstable at times.

**tim.mooney** · 31-07-2020, 03:48

Thank you for doing the work to isolate it and report it!

Hopefully in this case they've used "trivial" to mean it's easy to fix, rather than that they don't consider it a serious problem.

Ad Widget

Performance issues after Zabbix 5.0 and MariaDB Upgrade

Performance issues after Zabbix 5.0 and MariaDB Upgrade

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment