From 3c09ac1d40315e9c67b73355073533e17d30ce81 Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii
Date: Thu, 9 Jul 2020 09:11:03 +0900
Subject: [PATCH] Fix pgpool hang in a corner case.

It is possible that an "out of band" message from a backend has already
been read into the buffer by the time a ready for query message is
processed. If the message comes from all backends, there is no problem
because ProcessBackendResponse() will read the messages from all
backends by using read_kind_from_backend(). However, there is a corner
case:

1) the message comes from only one of the backends (this can happen on
   a recovery conflict, when a backend receives SIGTERM, and so on), and

2) the message is already in the backend read buffer.

In this case pgpool will hang in pool_read() called by
read_kind_from_backend() at either:

1) read_kind_from_one_backend(frontend, backend, (char *) &kind,
   MASTER_NODE_ID) (the message is not coming from the master backend)
   or

2) pool_read(CONNECTION(backend, i), &kind, 1) (the message is not
   coming from a backend other than the master).

Note that if the message is not in the buffer, there is no problem,
since read_packets_and_process() will take care of such "out of band"
messages.

The solution is to read and discard such a message in ReadyForQuery(),
emitting a log message, so that the read buffer is empty after
returning from ReadyForQuery(). (Remember that until the ready for
query message is returned to the frontend, the frontend will not issue
the next query, so there should be no response from the backend except
out of band messages.) If the message was FATAL, the backend will
disconnect from pgpool, so the next time pgpool should notice that the
connection is closed anyway.

For the master branch, we should probably treat that kind of FATAL
message in the same way as read_packets_and_process() already does.
This requires some code refactoring, and I would like to keep that work
separate from this commit.
---
 src/protocol/pool_proto_modules.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/src/protocol/pool_proto_modules.c b/src/protocol/pool_proto_modules.c
index da9addcbe..beb2056ea 100644
--- a/src/protocol/pool_proto_modules.c
+++ b/src/protocol/pool_proto_modules.c
@@ -2073,6 +2073,27 @@ ReadyForQuery(POOL_CONNECTION * frontend,
 		}
 	}
 
+	/*
+	 * Make sure that no message remains in the backend buffer. If something
+	 * remains, it could be an "out of band" ERROR or FATAL error, or a NOTICE
+	 * message, which was generated by backend itself for some reasons like
+	 * recovery conflict or SIGTERM received. If so, let's consume it and emit
+	 * a log message so that next read_kind_from_backend() will not hang in
+	 * trying to read from backend which may have not produced such a message.
+	 */
+	if (pool_is_query_in_progress())
+	{
+		for (i = 0; i < NUM_BACKENDS; i++)
+		{
+			if (!VALID_BACKEND(i))
+				continue;
+			if (!pool_read_buffer_is_empty(CONNECTION(backend, i)))
+				per_node_error_log(backend, i,
+								   "(out of band message)",
+								   "ReadyForQuery: Error or notice message from backend: ", false);
+		}
+	}
+
 	if (send_ready)
 	{
 		pool_write(frontend, "Z", 1);
-- 
2.39.5
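
The following is a minimal, standalone sketch of the idea behind the
hunk above, not pgpool code. It assumes a simplified model in which each
backend connection is just a flag plus a byte buffer, and in which
"consuming" a leftover message means logging it and clearing the buffer.
All names (SketchConn, sketch_buffer_message, sketch_drain_before_ready,
SKETCH_NUM_BACKENDS) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_NUM_BACKENDS 3	/* hypothetical cluster size */
#define SKETCH_BUF_SIZE 256

/* Hypothetical stand-in for one backend connection's read buffer. */
typedef struct
{
	int		valid;					/* 0 if the node is down or detached */
	size_t	len;					/* number of pending, unread bytes */
	char	buf[SKETCH_BUF_SIZE];
} SketchConn;

/* Simulate a message that was read ahead into one node's buffer. */
static void
sketch_buffer_message(SketchConn *conn, const char *msg)
{
	size_t	n = strlen(msg);

	assert(n < SKETCH_BUF_SIZE);
	memcpy(conn->buf, msg, n);
	conn->len = n;
}

/*
 * Before "ready for query" is forwarded, log and discard whatever is
 * still buffered on each valid node, so that a later read which expects
 * a message from every node does not find data on only one of them.
 * Returns the number of nodes that had a leftover message.
 */
static int
sketch_drain_before_ready(SketchConn conns[], int nconns)
{
	int		drained = 0;

	for (int i = 0; i < nconns; i++)
	{
		if (!conns[i].valid)
			continue;			/* skip nodes that are not in use */
		if (conns[i].len == 0)
			continue;			/* nothing left over on this node */

		fprintf(stderr, "node %d: out of band message discarded: %.*s\n",
				i, (int) conns[i].len, conns[i].buf);
		conns[i].len = 0;		/* the message is now consumed */
		drained++;
	}
	return drained;
}
```

In this model, buffering a FATAL message on a single valid node and then
calling sketch_drain_before_ready() leaves every valid buffer empty,
which is the state the patch wants to reach before the ready for query
message is sent to the frontend.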