Errors after upgrading Postgres and TimescaleDB

  • cstackpole
    Senior Member
    Zabbix Certified Specialist
    • Oct 2006
    • 225

    #1

    Errors after upgrading Postgres and TimescaleDB

    Greetings,

    I had a working Zabbix 5.0.2 system with PostgreSQL 9.2 on Scientific Linux 7. However, after the upgrade to Zabbix 5, server performance TANKED. It was nearly unusable for anything but basic queries. This is mostly because it runs on my tiny home system, where I do silly things before I try to do silly things at work. :-D Seriously, it has a lot of history that I don't want to lose, but it is just a quad-core Atom with 4GB of memory. So yeah, it's underpowered. But it's a home lab. :-)

    Anyway...

    I've been wanting to upgrade from 9.2 for a while, and there's potential to upgrade to 11 at work, so I thought I would give it a go at home first. The upgrade to PostgreSQL 11 was really smooth, and things seemed to work better. But if I went to Monitoring -> Latest data and tried to show all my hosts, it would absolutely time out. I could do one host at a time, or two of the hosts with fewer items, but nothing more than that. I was looking for ways to increase performance when I found this...

    https://blog.zabbix.com/upgrading-za...scaledb/11015/

    Hooray! The steps that the author, Alexander Petrov-Gavrilov, uses to upgrade to PostgreSQL 12 are pretty much the exact steps I used to upgrade to 11. Since I was already running 5.0.2, I skipped that bit. Then I followed the instructions to install TimescaleDB. Things appeared to have gone smoothly and I thought all was good... But it has now been running for a few hours and there are issues.

    1. "Database history tables upgraded -> No."
    Um...Why? Yes, I've stopped, started, and restarted the zabbix-server multiple times since this process began. I'm not sure why it won't update the tables and change to yes.
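    If I understand that flag right (and this is my guess, not something I've confirmed), it is about the new 5.0 extended-range float columns rather than about TimescaleDB itself, so the thing to look at is the data type of the value columns:

    Code:
    zabbix=# SELECT table_name, column_name, data_type
    zabbix-#   FROM information_schema.columns
    zabbix-#  WHERE table_name IN ('history', 'trends') AND column_name LIKE 'value%';
    -- 'numeric' here would mean the optional extended-range (float64) upgrade has
    -- not been applied to these tables yet; 'double precision' would mean it has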

    2. History items aren't updating. If I stop and then start zabbix-server, they will all update ONCE. After that, I get loads of errors like this in my zabbix-server log:

    Code:
    22337:20200814:181840.856 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR: "history" is not a hypertable or a continuous aggregate view
    HINT: It is only possible to drop chunks from a hypertable or continuous aggregate view
    [SELECT drop_chunks(1589671120,'history')]
    22337:20200814:181840.856 cannot drop chunks for history
    22337:20200814:181840.863 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR: "history_uint" is not a hypertable or a continuous aggregate view
    HINT: It is only possible to drop chunks from a hypertable or continuous aggregate view
    [SELECT drop_chunks(1589671120,'history_uint')]
    22337:20200814:181840.864 cannot drop chunks for history_uint
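    Those drop_chunks failures imply TimescaleDB does not think history and history_uint are hypertables at all. A quick way to confirm what it does consider hypertables is something like this (view name as it is on my TimescaleDB 1.x install; it may differ on other versions):

    Code:
    -- list every table TimescaleDB currently treats as a hypertable
    zabbix=# SELECT table_schema, table_name FROM timescaledb_information.hypertable;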
    Well... that's interesting: TimescaleDB thinks history and history_uint aren't hypertables at all. So I tried to rerun the conversion.
    Code:
    $ sudo systemctl stop zabbix-server
    $ cd /usr/share/doc/zabbix-server-pgsql-5.0.2/
    $ zcat timescaledb.sql | sudo -u zabbix psql zabbix
    NOTICE: migrating data to chunks
    DETAIL: Migration might take a while depending on the amount of data.
    ERROR: NULL value in column "clock" violates not-null constraint
    HINT: Columns used for time partitioning cannot be NULL
    NOTICE: migrating data to chunks
    DETAIL: Migration might take a while depending on the amount of data.
    ERROR: NULL value in column "clock" violates not-null constraint
    HINT: Columns used for time partitioning cannot be NULL
    ERROR: table "history_log" is already a hypertable
    ERROR: table "history_text" is already a hypertable
    ERROR: table "history_str" is already a hypertable
    ERROR: table "trends" is already a hypertable
    ERROR: table "trends_uint" is already a hypertable
    UPDATE 1
    UPDATE 1
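    If I'm reading those first two errors right, history and history_uint are the migrations choking on NULL clock values, so something like this should show whether rows are blocking the conversion (assuming the NULLs really are in the clock column):

    Code:
    zabbix=# SELECT count(*) FROM history WHERE clock IS NULL;
    zabbix=# SELECT count(*) FROM history_uint WHERE clock IS NULL;
    -- any non-zero count has to be cleaned out before
    -- create_hypertable(..., migrate_data => true) can partition on clock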
    I don't see history or history_uint among those "already a hypertable" errors, which makes me wonder whether the data in history is even intact...
    Code:
    zabbix=# select * from history;
    ERROR: compressed data is corrupted
    Errr.... that's not good. :-/

    3. Pick an item in Latest data, something that updates every minute or two, and go to "View as: 500 latest values". It should be updating, right? Watch the graph... it is updating... so why are the latest values 10, 15, 30, 45, or more minutes old? Keep refreshing... open it in a completely different browser... nope, still the same stale data... Refresh... Refresh... Refresh... OH! There are ALL the values that should have been there.... WTF?

    And I can probably guess one of the first responses: a quad-core, 4GB system? Psshhh, that can't handle the load! Yeah, well, I previously might not have argued with that... but since the jump to PostgreSQL 11 and TimescaleDB the load is trivial... it isn't breaking a sweat.

    I'm going to keep poking at it for a while, but I thought I'd toss this out and see if anyone has suggestions. I may just take the hit on the data for the past few hours and try to roll back to the known working 9.2 data and see if I can do this process again. Unsure at this time.

    Thoughts?
    Thanks


  • cstackpole
    Senior Member
    Zabbix Certified Specialist
    • Oct 2006
    • 225

    #2
    Greetings,

    Well, after a few hours of digging into this issue, I've got a few updates.

    1. Still broken. Thoughts/advice appreciated.

    2. From the errors, it looks like the pg_upgrade process had issues that it didn't flag. I found several reports of people all moving from 9.x to 11 who had similar errors.

    history_uint was completely shot. Only the first 1654 entries were good; the rest were just broken. I spent an hour cleaning bad rows one by one and still had fewer than 1700 good rows out of an unknown total (I couldn't even count the entries because the table was so broken), so I just took the hit and dumped the contents of the entire table. The primary error I was hitting here was "MultiXactId XXXXXXXXX has not been created yet -- apparent wraparound", which required me to find the bad row with an offset, then look up its ctid to delete it. Some of them I couldn't delete because the row supposedly didn't exist, so I had to force an entry in and then delete it. The process was slow and painful and not worth it for me on this home system.
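    For anyone hitting the same wraparound error, the hunt looked roughly like this; the offset and the ctid below are made-up placeholders, and you have to creep up on the broken tuple by hand:

    Code:
    -- read one row at a time until the SELECT blows up; the last offset that still
    -- works tells you roughly where the bad tuple sits (1654 happened to be mine)
    zabbix=# SELECT ctid, itemid, clock FROM history_uint OFFSET 1654 LIMIT 1;
    -- then remove the broken tuple directly by the ctid you worked out
    -- ('(4242,7)' is a placeholder, not a value from my table)
    zabbix=# DELETE FROM history_uint WHERE ctid = '(4242,7)';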

    history also had corruption. However, I was able to use the chk function from https://dba.stackexchange.com/questi...corrupt-errors to find and delete all the bad ctid entries. THEN I was getting errors about NULL values: there were 10 entries in history with NULLs that shouldn't have been there (probably further corruption). THEN I got "ERROR: row is too big: size 51792, maximum size 8160", so I sorted by clock and found a bunch of entries whose clock values were negative, or 0, or 1, or whatever. I went and found what looked like the start of the real clock values (a big 8-digit number) and deleted everything smaller than it. That cleaned up 100,000,000 entries! (And that was still only a small portion of the table.) THEN I got errors about indexes. GRRR!!! But the suggestion I found online was to run a full vacuum. It kept finding bad entries, which was annoying because it died after each one... so after EIGHT long, slow attempts where I just deleted the offending entry and reran it, it finally finished. FINALLY!!
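    Condensed down, the rest of the history cleanup was roughly this (the clock cutoff is a placeholder for the 8-digit value I eyeballed):

    Code:
    -- rows with a NULL clock can't be partitioned and had to go
    zabbix=# DELETE FROM history WHERE clock IS NULL;
    -- garbage rows with impossible timestamps; use whatever your first sane clock
    -- value actually is, the cutoff here is only a placeholder
    zabbix=# DELETE FROM history WHERE clock < 10000000;
    -- rewrite the table and rebuild the indexes; I had to rerun this after
    -- deleting each bad tuple it tripped over
    zabbix=# VACUUM FULL VERBOSE history;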

    Now that both of those tables are "clean", I reran create_hypertable on both (from the Zabbix-provided script). AND IT WORKED! Whoo!!!
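    For reference, the relevant calls look roughly like this; the chunk interval is my assumption, so lift the exact lines from the timescaledb.sql you ran earlier rather than trusting my memory:

    Code:
    -- recreate the two missing hypertables the same way the Zabbix script does,
    -- keeping the existing rows (the chunk_time_interval value is an assumption)
    zabbix=# SELECT create_hypertable('history', 'clock', chunk_time_interval => 86400, migrate_data => true);
    zabbix=# SELECT create_hypertable('history_uint', 'clock', chunk_time_interval => 86400, migrate_data => true);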

    Now all the errors in zabbix_server.log are gone. Whew!

    3. With all that cleaned up, this odd behavior is gone. However, some items aren't updating at all except on a restart of zabbix-server. Still digging into this.

    I still have issues I'd like thoughts on. However, this is just further reinforcement to have a clean backup before you start this process (I have one; I just wanted the learning experience of fighting through it), and to verify data integrity AFTER upgrading the DB and BEFORE starting to add new things to it.
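    On that last point, the cheapest integrity check I know of right after a pg_upgrade is to dump the database to nowhere, since pg_dump has to read every row and will error out on corrupted tuples:

    Code:
    # any "ERROR: ..." output here means a table still has broken tuples
    $ sudo -u postgres pg_dump zabbix > /dev/null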

    Thanks!
