git.postgresql.org Git - users/andresfreund/postgres.git/commit

author	Tom Lane <tgl@sss.pgh.pa.us>
	Thu, 3 Sep 2020 20:52:09 +0000 (16:52 -0400)
committer	Tom Lane <tgl@sss.pgh.pa.us>
	Thu, 3 Sep 2020 20:52:09 +0000 (16:52 -0400)
commit	be4b0c0077e6a1f7be0965f8d93696e0e0eadb52
tree	0e563f9eeafc6351bd02b3bd84b9976f02a70d48
parent	8f8154a503c71a18ad72c64f4aefb9d847c45b86

Avoid lockup of a parallel worker when reporting a long error message.

Because jumping back to the sigsetjmp() point restores the initial state,
in which signals are blocked, the code path in bgworker.c for reporting an
error and exiting would run with signals blocked. Usually this is fairly
harmless; but if a parallel worker had an error message exceeding the
shared-memory communication buffer size (16K) it would lock up, because it
would wait for a resume-sending signal from its parallel leader, and with
signals blocked it would never detect that signal.

To fix, just unblock signals at the appropriate point.
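
The mechanism can be reproduced outside PostgreSQL with plain POSIX calls.
The following is a minimal sketch, not the actual bgworker.c change: the
function name sigusr1_blocked_in_error_path() and the choice of SIGUSR1 are
assumptions made for illustration. It blocks a signal, saves that mask with
sigsetjmp(), unblocks the signal for "normal" work, and then simulates an
error with siglongjmp(). On arrival in the recovery path the signal is
blocked again unless the code explicitly unblocks it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of PostgreSQL).
 *
 * Returns 1 if SIGUSR1 is blocked while the error-recovery path runs,
 * 0 if it is not, and -1 if a signal-mask call fails.
 */
int
sigusr1_blocked_in_error_path(int unblock_after_jump)
{
    sigjmp_buf  recovery_point;
    sigset_t    block_set;
    sigset_t    empty_set;
    sigset_t    saved_mask;
    sigset_t    current_mask;
    int         result;

    sigemptyset(&empty_set);
    sigemptyset(&block_set);
    sigaddset(&block_set, SIGUSR1);

    /* Startup state: the signal is blocked; remember the caller's mask. */
    if (sigprocmask(SIG_BLOCK, &block_set, &saved_mask) != 0)
        return -1;

    /* A nonzero second argument makes sigsetjmp() save the signal mask. */
    if (sigsetjmp(recovery_point, 1) != 0)
    {
        /* Error path: the mask saved above (SIGUSR1 blocked) is back. */
        if (unblock_after_jump &&
            sigprocmask(SIG_SETMASK, &empty_set, NULL) != 0)
            return -1;

        /* Passing NULL as the new set only queries the current mask. */
        if (sigprocmask(SIG_SETMASK, NULL, &current_mask) != 0)
            return -1;
        result = sigismember(&current_mask, SIGUSR1);

        /* Put the caller's original mask back before returning. */
        sigprocmask(SIG_SETMASK, &saved_mask, NULL);
        return result;
    }

    /* Normal path: unblock signals, then simulate an error. */
    if (sigprocmask(SIG_SETMASK, &empty_set, NULL) != 0)
        return -1;
    siglongjmp(recovery_point, 1);

    return -1;                  /* not reached */
}
```

Calling the helper with 0 models the behavior before the fix (the recovery
path runs with the signal blocked); calling it with 1 models unblocking
signals at the start of the recovery path.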

This can be shown to fail back to 9.6. The lack of parallel query
infrastructure makes it difficult to provide a simple test case for
9.5; but I'm pretty sure the issue exists in some form there as well,
so apply the code change there too.

Vignesh C, reviewed by Bharath Rupireddy, Robert Haas, and myself

Discussion: https://postgr.es/m/CALDaNm1d1hHPZUg3xU4XjtWBOLCrA+-2cJcLpw-cePZ=GgDVfA@mail.gmail.com

src/backend/postmaster/bgworker.c
src/test/regress/expected/select_parallel.out
src/test/regress/sql/select_parallel.sql