Fix infinite wait when reading a partially written WAL record
author Alexander Korotkov <akorotkov@postgresql.org>
Sat, 19 Jul 2025 10:44:01 +0000 (13:44 +0300)
committer Alexander Korotkov <akorotkov@postgresql.org>
Sat, 19 Jul 2025 22:29:14 +0000 (01:29 +0300)
commit c9f4e7520603836b924a218a34633f545c0173f3
tree 2ecae9e310be5fdf3697949fc76da88a61eef9c9
parent fd39c3cf28396d1fbb8b2a2cdb9fe66b6ad87964

If a crash occurs while writing a WAL record that spans multiple pages, the
recovery process marks the page with the XLP_FIRST_IS_OVERWRITE_CONTRECORD
flag.  However, logical decoding currently attempts to read the full WAL
record based on its expected size before checking this flag, which can lead
to an infinite wait if the remaining data is never written (e.g., no activity
after crash).

This patch updates the logic to first read the page header and check for
the XLP_FIRST_IS_OVERWRITE_CONTRECORD flag before attempting to reconstruct
the full WAL record.  If the flag is set, decoding correctly identifies
the record as incomplete and avoids waiting for WAL data that will never
arrive.
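
The ordering change described above can be illustrated with a small,
self-contained sketch.  This is not the code from xlogreader.c: the
SketchPage struct, the SketchReadResult enum, both read_continuation_*
functions, and the SKETCH_* sizes are hypothetical names made up for
illustration, and the flag values are assumed to match xlog_internal.h.
The sketch only models "how many bytes of the next page must be available
before the flag is examined".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits; values assumed to match xlog_internal.h. */
#define XLP_FIRST_IS_CONTRECORD            0x0001
#define XLP_FIRST_IS_OVERWRITE_CONTRECORD  0x0008

/* Hypothetical sizes used only by this sketch. */
#define SKETCH_BLCKSZ    8192
#define SKETCH_HDR_SIZE  24

/* Hypothetical model of the page that should continue a record. */
typedef struct SketchPage
{
    uint16_t info;           /* page header flag bits */
    size_t   bytes_written;  /* how much of the page exists in WAL so far */
} SketchPage;

typedef enum
{
    READ_OK,          /* continuation data is available */
    READ_INCOMPLETE,  /* record was abandoned; stop reading it */
    READ_WOULD_WAIT   /* reader would block waiting for more WAL */
} SketchReadResult;

/* Bytes of the page needed to finish a record with 'remaining' bytes left. */
static size_t
bytes_needed(size_t remaining)
{
    size_t want = remaining + SKETCH_HDR_SIZE;

    return want < SKETCH_BLCKSZ ? want : SKETCH_BLCKSZ;
}

/* Old order: demand the full continuation first, then look at the flag. */
static SketchReadResult
read_continuation_old(const SketchPage *page, size_t remaining)
{
    assert(page != NULL);
    if (page->bytes_written < bytes_needed(remaining))
        return READ_WOULD_WAIT;     /* may never be satisfied */
    if (page->info & XLP_FIRST_IS_OVERWRITE_CONTRECORD)
        return READ_INCOMPLETE;
    return READ_OK;
}

/* New order: read only the header, check the flag, then read the rest. */
static SketchReadResult
read_continuation_new(const SketchPage *page, size_t remaining)
{
    assert(page != NULL);
    if (page->bytes_written < SKETCH_HDR_SIZE)
        return READ_WOULD_WAIT;     /* header itself is not there yet */
    if (page->info & XLP_FIRST_IS_OVERWRITE_CONTRECORD)
        return READ_INCOMPLETE;     /* decided without waiting for data */
    if (page->bytes_written < bytes_needed(remaining))
        return READ_WOULD_WAIT;
    return READ_OK;
}
```

With a page that carries the overwrite flag but holds only a few dozen
bytes, and a record that still expects several kilobytes, the old ordering
in this model reports that it would wait, while the new ordering reports
the record as incomplete right after reading the header.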

Discussion: https://postgr.es/m/CAAKRu_ZCOzQpEumLFgG_%2Biw3FTa%2BhJ4SRpxzaQBYxxM_ZAzWcA%40mail.gmail.com
Discussion: https://postgr.es/m/CALDaNm34m36PDHzsU_GdcNXU0gLTfFY5rzh9GSQv%3Dw6B%2BQVNRQ%40mail.gmail.com
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Backpatch-through: 13
src/backend/access/transam/xlogreader.c