postgresql log file shows error "invalid page in block" in history_uint table

  • tvtue
    Member
    • Sep 2012
    • 71

    #1

    Dear Zabbix Users,

    we are having a problem with our Zabbix database. The log file of the postgresql db shows

    Code:
    2019-04-26 23:14:38.697 CEST [29676] ERROR:  invalid page in block 1211023 of relation base/16386/81870
    2019-04-26 23:14:38.697 CEST [29676] STATEMENT:  COPY public.history_uint (itemid, clock, value, ns) TO stdout;
    The error appears whenever the cron job runs that does a pg_dump of the Zabbix DB for our regular backup.
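
    The log line names a block number and a relation path, and it can help to know where on disk that block lives. Below is a minimal sketch, assuming PostgreSQL's default build settings of 8 kB pages and 1 GB segment files (both are compile-time options, so other values are possible). The function name locate_block is made up for this illustration.

```python
# Sketch: map a PostgreSQL block number to its segment file and byte offset.
# Assumption: default build settings, i.e. 8 kB pages and 1 GB segment files.
BLOCK_SIZE = 8192
SEGMENT_SIZE = 1024 ** 3
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE  # 131072 blocks per segment


def locate_block(relation_path, block_number):
    """Return (segment file path, byte offset of the block inside that file)."""
    segment, block_in_segment = divmod(block_number, BLOCKS_PER_SEGMENT)
    # The first segment has no suffix; later ones are named <path>.1, <path>.2, ...
    file_path = relation_path if segment == 0 else f"{relation_path}.{segment}"
    return file_path, block_in_segment * BLOCK_SIZE


if __name__ == "__main__":
    # Values taken from the error message above.
    path, offset = locate_block("base/16386/81870", 1211023)
    print(path, offset)
```

    The path is relative to the PostgreSQL data directory. In base/16386/81870, 16386 is the database OID and 81870 is the file node of the relation; running SELECT pg_filenode_relation(0, 81870); in that database should confirm which table or index it belongs to.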

    I have searched around and found people saying that this error normally points to hardware problems. In this case Zabbix runs as a virtual machine (KVM) and gets its disks as raw image files from the hypervisor. I checked the file system within the VM (XFS) and it had no problems. I also checked the underlying file system on the hypervisor (ZFS), which is OK, too. A weekly scrub job runs there and reports no errors.

    I tried to fix the problem by zeroing out the broken pages in PostgreSQL.

    Code:
    psql zabbix
    SET zero_damaged_pages = on;
    VACUUM FULL VERBOSE ANALYZE history_uint;
    REINDEX TABLE history_uint;
    These commands corrected the problem. Afterwards I was able to do a clean pg_dump again.

    Then after a few days the problem recurred. I ran the VACUUM FULL commands again, and I also created a new disk image for the VM, formatted it with ext4 and copied over the DB file system. I also used new mount options (nobarrier, noatime).
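
    A side note on the mount options: nobarrier turns off write barriers on ext4, so after a crash or power loss data sitting in a volatile write cache may never reach the disk. Whether that plays any role here is not known. The following is only a sketch for spotting such mounts in a mount table in /proc/mounts format; the function name risky_mounts and the device name in the sample line are hypothetical.

```python
# Sketch: flag mounts that have write barriers disabled.
# Assumption: input uses the /proc/mounts layout
#   device mount_point fs_type options dump pass
RISKY_OPTIONS = {"nobarrier", "barrier=0"}


def risky_mounts(mount_table):
    """Return (mount_point, fs_type, flagged_options) for each risky mount."""
    findings = []
    for line in mount_table.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        options = fields[3].split(",")
        flagged = sorted(RISKY_OPTIONS.intersection(options))
        if flagged:
            findings.append((mount_point, fs_type, flagged))
    return findings


if __name__ == "__main__":
    # Hypothetical sample line; on a real system read the text from /proc/mounts.
    sample = "/dev/vdb1 /var/lib/pgsql ext4 rw,noatime,nobarrier 0 0\n"
    for mount_point, fs_type, flagged in risky_mounts(sample):
        print(mount_point, fs_type, ",".join(flagged))
```

    On the VM itself the same check could be fed with the contents of /proc/mounts instead of the sample string.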

    Now, after another few days, the problem is back. Does anyone have a hint on how to solve this?


    It's PostgreSQL version 11 on a CentOS 7 system (repo from postgresql.org).
    /var/lib/pgsql is on a separate ext4 file system, 200 GB with 63 GB used.
    Zabbix Server 4.0.7 (repo from zabbix.com)
    Hypervisor: CentOS 7; the disk images live on a ZFS pool, raidz1 with 6 SSDs.
    The hypervisor has 96 GB of ECC RAM.


    Cheers and many thanks
    Timo
  • benoitf
    Junior Member
    • Nov 2019
    • 2

    #2
    Hello,

    I recently encountered the same error on a very different system: Zabbix 4.0.4 with PostgreSQL 11.5, but on a Raspberry Pi 3B+ (ARMv7, 1 GB RAM, 16 GB SD card).
    I had to restore from backup.
    I first thought the SD card had physical errors, so I moved the data to USB storage. The issue keeps coming back.
    I truncated history* and trends*, but that did not help either.

    I cannot say when this issue appeared, but my system worked like a charm for many months before that.

    Any hint would be appreciated.

    Thanks,
    Benoit

    • benoitf
      Junior Member
      • Nov 2019
      • 2

      #3
      Hello,
      For the record, a full reinstall of the Raspberry Pi solved my issue: Zabbix and PostgreSQL run smoothly again.

      Benoit
