pgpool2.git
3 years agoRevert "Fix pgpool child process to obtain process information."
Tatsuo Ishii [Sat, 9 Apr 2022 06:05:26 +0000 (15:05 +0900)]
Revert "Fix pgpool child process to obtain process information."

This reverts commit 06f69d19030deb1d72230ce489c5a4d800ad593c.

3 years agoFix pgpool child process to obtain process information.
Tatsuo Ishii [Sat, 9 Apr 2022 00:08:34 +0000 (09:08 +0900)]
Fix pgpool child process to obtain process information.

ProcessInfo was obtained by using pool_get_process_info(), but this API
is not suitable for child processes because:

- it does an inefficient linear search over all ProcessInfo slots (there
  are num_init_children slots).

- due to a race condition, the search key pid might not be set yet, or
  might already be removed, in the slot. I think it is possible that by
  the time the child process starts execution, the pid is not yet set in
  the slot in shared memory. Also, when a child process is killed by the
  parent process, the parent may set pid to 0 before the child process
  receives the kill signal.

So add a new API pool_get_process_info_by_process_id(), which just
returns the slot by using the global variable my_proc_id as a key, and
let child processes use it.  my_proc_id is set by the parent process
when the child process is spawned.

The call to pool_get_process_info() in child.c was added in v4.3. So
back patch to V4_3_STABLE.
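
The difference between the two lookups described above can be sketched
as follows. This is only an illustration, not pgpool code: SlotInfo,
NUM_SLOTS, find_slot_by_pid() and find_slot_by_index() are hypothetical
names standing in for ProcessInfo, num_init_children and the two APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

#define NUM_SLOTS 4                 /* hypothetical; stands in for num_init_children */

/* Hypothetical, much smaller stand-in for a ProcessInfo slot. */
typedef struct
{
    pid_t pid;                      /* 0 means "not set" or "already cleared" */
} SlotInfo;

SlotInfo slots[NUM_SLOTS];

/*
 * Lookup keyed by pid: walks every slot, and returns NULL if the pid has
 * not been stored yet or has already been reset to 0.
 */
SlotInfo *find_slot_by_pid(pid_t pid)
{
    for (int i = 0; i < NUM_SLOTS; i++)
        if (slots[i].pid == pid)
            return &slots[i];
    return NULL;
}

/*
 * Lookup keyed by a slot index handed to the child when it was spawned:
 * constant time, and independent of what is currently stored in pid.
 */
SlotInfo *find_slot_by_index(int proc_id)
{
    if (proc_id < 0 || proc_id >= NUM_SLOTS)
        return NULL;
    return &slots[proc_id];
}
```

With the index-based variant a child can still find its own slot even
while the pid field is 0.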

3 years agoFix shared memory allocation function.
Tatsuo Ishii [Wed, 6 Apr 2022 07:30:35 +0000 (16:30 +0900)]
Fix shared memory allocation function.

pool_shared_memory_segment_get_chunk(), which is responsible for shared
memory allocation, failed to consider request size alignment. If the
requested size is not a multiple of MAXALIGN (typically 8) bytes, it
could overrun the shared memory area. Probably harmless in the wild but
better to fix.
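
A minimal sketch of the kind of rounding involved, assuming an alignment
of 8 bytes. SKETCH_MAXALIGN, align_up(), Arena and arena_get_chunk() are
hypothetical names for illustration and are not the pgpool
implementation.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAXALIGN 8           /* assumption: must be a power of two */

/* Round size up to the next multiple of SKETCH_MAXALIGN. */
size_t align_up(size_t size)
{
    return (size + (SKETCH_MAXALIGN - 1)) & ~((size_t) (SKETCH_MAXALIGN - 1));
}

/* Hypothetical bump allocator over a fixed memory area. */
typedef struct
{
    char   *base;
    size_t  capacity;
    size_t  used;                   /* invariant: used <= capacity */
} Arena;

/*
 * Hand out a chunk whose accounted size is the aligned size, so that the
 * bookkeeping and the space actually consumed agree. Returns NULL when
 * the aligned request does not fit.
 */
void *arena_get_chunk(Arena *a, size_t size)
{
    size_t need = align_up(size);

    if (need < size || need > a->capacity - a->used)
        return NULL;                /* overflow while rounding, or no room */

    void *p = a->base + a->used;
    a->used += need;
    return p;
}
```

Checking the aligned size rather than the raw request against the
remaining capacity is what keeps the last chunk inside the area.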

3 years agoFix possible null pointer dereference per Coverity.
Tatsuo Ishii [Wed, 6 Apr 2022 05:50:56 +0000 (14:50 +0900)]
Fix possible null pointer dereference per Coverity.

3 years agoRevert "Prevent hang in terminate_all_childrens()."
Tatsuo Ishii [Mon, 4 Apr 2022 07:23:44 +0000 (16:23 +0900)]
Revert "Prevent hang in terminate_all_childrens()."

This reverts commit a5e2e0411dc3ef91aafc8f4c5e1d7369e7eb3b46.

3 years agoPrevent hang in terminate_all_childrens().
Tatsuo Ishii [Fri, 1 Apr 2022 10:54:49 +0000 (19:54 +0900)]
Prevent hang in terminate_all_childrens().

waitpid() was used in the function without WNOHANG being set.
This could cause a hang in waitpid().
Also fix a typo: rename terminate_all_childrens to terminate_all_children.
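
For illustration, a minimal sketch of non-blocking reaping with WNOHANG.
reap_exited_children() is a hypothetical helper, not the function touched
by this commit.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Reap every child that has already exited, without ever blocking.
 * Returns the number of children reaped.
 */
int reap_exited_children(void)
{
    int reaped = 0;

    for (;;)
    {
        int   status;
        pid_t pid = waitpid(-1, &status, WNOHANG);

        if (pid > 0)
        {
            reaped++;               /* one exited child collected */
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */

        /*
         * pid == 0: children exist but none has exited yet.
         * pid == -1: no children at all (ECHILD) or another error.
         * Either way, return instead of waiting.
         */
        break;
    }
    return reaped;
}
```

Without WNOHANG the same call would sleep until some child changes
state, which is the hang the message refers to.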

3 years agoFix logging for disabled pool_passwd
Muhammad Usama [Tue, 29 Mar 2022 14:31:49 +0000 (19:31 +0500)]
Fix logging for disabled pool_passwd

Refrain from emitting 'password file descriptor is NULL' warning
and error messages when pool_passwd is disabled.

3 years agoAdd pending signal check in check_requests().
Tatsuo Ishii [Sat, 26 Mar 2022 07:32:59 +0000 (16:32 +0900)]
Add pending signal check in check_requests().

Still investigating why the shutdown signal is not delivered. To help
with this, add sigpending() before releasing the signal mask.
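
A small sketch of how sigpending() can be used to see whether a signal
arrived while it was blocked. signal_is_pending() is a hypothetical
helper and not the code added to check_requests().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/*
 * Returns 1 if signo was raised while blocked and is still pending,
 * 0 if it is not pending, and -1 if sigpending() itself failed.
 */
int signal_is_pending(int signo)
{
    sigset_t pending;

    if (sigpending(&pending) != 0)
        return -1;
    return sigismember(&pending, signo);
}
```

Calling such a check just before the mask is released shows whether the
signal was generated at all or was lost before reaching the process.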

3 years agoAllow shutdown interrupt while processing SIGCHILD in pgpool main.
Tatsuo Ishii [Sat, 19 Mar 2022 09:25:15 +0000 (18:25 +0900)]
Allow shutdown interrupt while processing SIGCHILD in pgpool main.

Currently most signals are blocked in the pgpool main loop. In some
situations the SIGCHLD handler (reaper()) takes a long time or blocks
in a wait system call. I suspect that this could cause occasional
timeouts in some regression tests. So allow interrupts while executing
reaper(). Also re-implement the CHECK_REQUEST macro as a function.
There's no point in implementing CHECK_REQUEST as a macro.

3 years agoFix bug with pg_enc and pg_md5.
Tatsuo Ishii [Fri, 18 Mar 2022 11:52:07 +0000 (20:52 +0900)]
Fix bug with pg_enc and pg_md5.

When these commands are invoked with the "-i" option (read
username/password pairs from a file), they did not create proper entries
in pool_passwd. This bug was introduced by the commit:
https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=441bde41767fe3bccad513735f946dd2dec5059b

Bug reported in https://www.pgpool.net/mantisbt/view.php?id=747

3 years agoEnhance 077.invalid_failover_node regression test.
Tatsuo Ishii [Tue, 15 Mar 2022 08:29:23 +0000 (17:29 +0900)]
Enhance 077.invalid_failover_node regression test.

It seems the timeout for pcp_promote_node was not long enough.
Increase it from 30 seconds to 60 seconds.

3 years agoEnhance error message while processing parse message.
Tatsuo Ishii [Sat, 12 Mar 2022 04:33:42 +0000 (13:33 +0900)]
Enhance error message while processing parse message.

In non-streaming replication mode, a sync message is sent to the backend
after a parse message is sent, expecting to get a ready for query
message.  If a different message is returned, pgpool complains about
it. This commit adds more information to the error: the message kind
returned and the backend node id.

3 years agoCleanup pgpool main process logging.
Tatsuo Ishii [Fri, 11 Mar 2022 04:40:28 +0000 (13:40 +0900)]
Cleanup pgpool main process logging.

Downgrade some logs for forking processes so that they do not flood the
log file. Also emphasize important events such as failover starting.

3 years agoDowngrade log level of ParameterStatus message.
Tatsuo Ishii [Fri, 11 Mar 2022 01:38:08 +0000 (10:38 +0900)]
Downgrade log level of ParameterStatus message.

In commit 4bcba5258130c3cd9f855157a4359aad2fa7acfc the log level used
when a ParameterStatus message arrives from the backend was changed from
DEBUG5 to LOG. There are multiple complaints about the change because
the event is more frequent than I thought. So revert the log level back
to DEBUG5.

Discussion: https://www.pgpool.net/pipermail/pgpool-general/2022-March/008101.html

3 years agoFix not to include garbage in "%m" log_line_prefix.
Tatsuo Ishii [Thu, 10 Mar 2022 10:55:10 +0000 (19:55 +0900)]
Fix not to include garbage in "%m" log_line_prefix.

When "%m" (milliseconds) is used in log_line_prefix, the millisecond
part was copied to log_line_prefix string without null termination. As
a result, sometimes garbage was included after the milliseconds part.
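
A sketch of formatting the millisecond part so that the buffer is always
terminated. format_millis() is a hypothetical helper and is not the
function pgpool uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write ".mmm" (milliseconds derived from a non-negative microsecond
 * value) into buf. snprintf() always terminates the output when buflen
 * is greater than 0, so no stale bytes can follow the digits.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int format_millis(char *buf, size_t buflen, long usec)
{
    int n = snprintf(buf, buflen, ".%03ld", (usec / 1000) % 1000);

    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

Copying the digits with a fixed-length memcpy() or strncpy() instead
would leave whatever was previously in the buffer after them.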

3 years agoImplement comma separated multiple pcp listen addresses.
Tatsuo Ishii [Sat, 5 Mar 2022 02:59:08 +0000 (11:59 +0900)]
Implement comma separated multiple pcp listen addresses.

Pgpool-II only allowed a single hostname, IP address or '*' in the
pcp_listen_addresses parameter. Now multiple listen addresses can be set
in the parameter, as with listen_addresses.  Note that the
documentation for pcp_listen_addresses is shamelessly stolen from
PostgreSQL.

Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2022-February/004131.html

3 years agoFix main process exiting while performing finding primary node.
Tatsuo Ishii [Fri, 4 Mar 2022 11:00:41 +0000 (20:00 +0900)]
Fix main process exiting while performing finding primary node.

The Pgpool-II main process tries to find the primary node whenever the
cluster status is changed by failover/failback. While doing so, if a
backend is failing or shutting down, a socket write to the backend could
fail. Unfortunately, in that case do_query() throws a FATAL error, which
makes the Pgpool-II main process die like this:

2022-03-04 18:13:12.711: main pid 795826: WARNING:  write on backend 1 failed with error :"Broken pipe"
2022-03-04 18:13:12.711: main pid 795826: DETAIL:  while trying to write data from offset: 0 wlen: 32
2022-03-04 18:13:12.711: main pid 795826: LOG:  notice_backend_error: called from pgpool main. ignored.
2022-03-04 18:13:12.711: main pid 795826: LOG:  unable to flush data to backend
2022-03-04 18:13:12.711: main pid 795826: DETAIL:  do not failover because I am the main process
2022-03-04 18:13:12.711: main pid 795826: FATAL:  Backend throw an error message
2022-03-04 18:13:12.711: main pid 795826: DETAIL:  Exiting current session because of an error from backend
2022-03-04 18:13:12.711: main pid 795826: HINT:  BACKEND Error: "terminating connection due to administrator command"
2022-03-04 18:13:12.715: main pid 795826: LOG:  shutting down

To prevent it, change ereport(FATAL) to ereport(ERROR) in do_query().

3 years agoFix: [pgpool-general: 8030] ... segfaults on CentOS 8
Muhammad Usama [Thu, 3 Mar 2022 10:25:31 +0000 (15:25 +0500)]
Fix: [pgpool-general: 8030] ... segfaults on CentOS 8

Event names array used by debug messages had a missing
entry for WD_EVENT_I_AM_APPEARING_FOUND
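
One way to catch this class of bug at compile time is to tie the size of
the name table to the enum. This is only a sketch: SketchEvent and its
members are hypothetical and are not the watchdog's real event list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event list; EV_COUNT must stay last. */
typedef enum
{
    EV_STARTED,
    EV_NODE_FOUND,
    EV_NODE_LOST,
    EV_COUNT
} SketchEvent;

/* One name per event, in the same order as the enum. */
static const char *const event_names[] = {
    "STARTED",
    "NODE FOUND",
    "NODE LOST",
};

/* Fails to compile if an event is added without a matching name. */
_Static_assert(sizeof(event_names) / sizeof(event_names[0]) == EV_COUNT,
               "event_names must have one entry per SketchEvent");

/* Bounds-checked lookup, so a bad value cannot index past the array. */
const char *event_name(SketchEvent ev)
{
    if ((int) ev < 0 || (int) ev >= EV_COUNT)
        return "UNKNOWN";
    return event_names[ev];
}
```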

3 years agoEnhance regression test 077.invalid_failover_node.
Tatsuo Ishii [Tue, 1 Mar 2022 08:35:54 +0000 (17:35 +0900)]
Enhance regression test 077.invalid_failover_node.

Print starting time to see if timeout is suitable for this test.
Also print relative time from the start of retry loop.

3 years agoChange the default value of pcp_listen_addresses from '*' to 'localhost'.
Tatsuo Ishii [Mon, 28 Feb 2022 07:16:18 +0000 (16:16 +0900)]
Change the default value of pcp_listen_addresses from '*' to 'localhost'.

Sync with compiled default.

3 years agoChange the default value of pcp_listen_addresses from '*' to 'localhost'.
Tatsuo Ishii [Mon, 28 Feb 2022 06:52:11 +0000 (15:52 +0900)]
Change the default value of pcp_listen_addresses from '*' to 'localhost'.

'*' was not very secure. Also the default value of listen_addresses in
both Pgpool-II and PostgreSQL is 'localhost'.

3 years agoMore tweak regression test 077.invalid_failover_node/test.sh script.
Tatsuo Ishii [Sun, 27 Feb 2022 23:50:56 +0000 (08:50 +0900)]
More tweak regression test 077.invalid_failover_node/test.sh script.

The previous commit was not quite correct and the timeout was not
actually increased.

3 years agoDoc: update copyright year.
Bo Peng [Sun, 27 Feb 2022 03:36:16 +0000 (12:36 +0900)]
Doc: update copyright year.

3 years agoMore tweak regression test 077.invalid_failover_node/test.sh script.
Tatsuo Ishii [Fri, 25 Feb 2022 23:19:34 +0000 (08:19 +0900)]
More tweak regression test 077.invalid_failover_node/test.sh script.

Build farm complained that failover on node 1 did not complete within 20
seconds.  Let's increase the timeout to 30 seconds and see what the
build farm says.

3 years agoFixed follow_primary.sh.sample script to check the status of PostgreSQL using pg_isready.
Bo Peng [Fri, 25 Feb 2022 05:29:02 +0000 (14:29 +0900)]
Fixed follow_primary.sh.sample script to check the status of PostgreSQL using pg_isready.

3 years agoMore tweak regression test 077.invalid_failover_node/test.sh script.
Tatsuo Ishii [Thu, 24 Feb 2022 23:45:14 +0000 (08:45 +0900)]
More tweak regression test 077.invalid_failover_node/test.sh script.

Build farm complained that failover on node 1 did not complete within 5
seconds.  Let's increase the timeout to 20 seconds and see what the
build farm says.

3 years agoAllow to specify duplicated entry in listen_addresses.
Tatsuo Ishii [Thu, 24 Feb 2022 08:27:15 +0000 (17:27 +0900)]
Allow to specify duplicated entry in listen_addresses.

If a duplicated entry exists in listen_addresses, pgpool did not
start. On the other hand, PostgreSQL just complains and ignores the
duplicated entry.  So this commit follows the behavior of PostgreSQL.
PostgreSQL also ignores a wrong host name or IP address, so pgpool does
the same.

Also fix the comment for listen_addresses in pool_config_variable.c so
that it uses the plural (i.e. hostname(s)).
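
The "complain and ignore" handling of duplicates in a comma separated
list can be sketched like this. split_listen_addresses(), MAX_ADDRS and
MAX_ADDR_LEN are hypothetical, and the warning text is made up for the
example; none of this is taken from the pgpool source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ADDRS    16             /* hypothetical limits for the sketch */
#define MAX_ADDR_LEN 64

/*
 * Split value on commas, trim surrounding spaces, and copy each distinct
 * entry into out. Empty entries are skipped; duplicates are reported on
 * stderr and skipped. Returns the number of entries stored.
 */
int split_listen_addresses(const char *value, char out[][MAX_ADDR_LEN], int max)
{
    int         count = 0;
    const char *p = value;

    while (*p != '\0' && count < max)
    {
        size_t      len = strcspn(p, ",");
        const char *start = p;
        const char *end = p + len;

        while (start < end && *start == ' ')
            start++;
        while (end > start && end[-1] == ' ')
            end--;

        size_t n = (size_t) (end - start);

        if (n > 0 && n < MAX_ADDR_LEN)
        {
            int dup = 0;

            for (int i = 0; i < count; i++)
                if (strlen(out[i]) == n && strncmp(out[i], start, n) == 0)
                    dup = 1;

            if (dup)
                fprintf(stderr, "duplicate listen address \"%.*s\" ignored\n",
                        (int) n, start);
            else
            {
                memcpy(out[count], start, n);
                out[count][n] = '\0';
                count++;
            }
        }

        p += len;
        if (*p == ',')
            p++;                    /* step over the separator */
    }
    return count;
}
```

For the input "localhost, 192.168.0.10,localhost" this stores two
entries and warns once about the repeated "localhost".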

3 years agoTweak regression test 077.invalid_failover_node/test.sh script.
Tatsuo Ishii [Wed, 23 Feb 2022 12:01:51 +0000 (21:01 +0900)]
Tweak regression test 077.invalid_failover_node/test.sh script.

Build farm complained that pcp_promote_node did not complete within 5
seconds.  Let's increase the timeout to 20 seconds and see what the
build farm says.

3 years agoImplement comma separated multiple listen addresses.
Tatsuo Ishii [Wed, 23 Feb 2022 03:29:44 +0000 (12:29 +0900)]
Implement comma separated multiple listen addresses.

Pgpool-II only allowed a single hostname, IP address or '*' in the
listen_addresses parameter. Now multiple listen addresses can be set in
the parameter, as PostgreSQL already allows.  Note that the
documentation for listen_addresses is shamelessly stolen from
PostgreSQL.

Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2022-February/004131.html

3 years agoAdd regression test case for testing "invalid degenerate backend request".
Tatsuo Ishii [Tue, 22 Feb 2022 02:04:42 +0000 (11:04 +0900)]
Add regression test case for testing "invalid degenerate backend request".

This is the test for commit: 214e0017

3 years agoAdd patch to enable parameters related to logging_collector.
Bo Peng [Tue, 22 Feb 2022 04:38:51 +0000 (13:38 +0900)]
Add patch to enable parameters related to logging_collector.

3 years agoFix invalid degenerate backend request problem.
Tatsuo Ishii [Tue, 22 Feb 2022 01:42:09 +0000 (10:42 +0900)]
Fix invalid degenerate backend request problem.

When the health check process sends a failover request, it fails in rare
cases with the message: "invalid degenerate backend request , node id : 2
status: [2] is not valid for failover". This happens if the backend node
status managed in private_backend_status and the one in the shared
memory area do not agree. private_backend_status is initialized when the
process starts and is not updated during the process's life cycle.
Usually this is ok, but consider, for example, the following scenario:

(1) When pgpool starts, node 1 is down in pgpool_status. So health
check process did not start for node 1.

(2) pcp_promote_node --switchover gets called. Health check process
for node 1 starts and private_backend_status for node 1 remains down.

(3) Node 1 is brought back online by the follow master command.

(4) Node 1 is shut down.

(5) The health check process detects that node 1 is down and requests
failover. But since private_backend_status is down, the request is
refused with the message above.

To fix this, we can simply delete the call to
pool_initialize_private_backend_status() at process startup. Originally
the intention of private_backend_status was that a pgpool child process
is not bothered by status changes in the middle of processing. This is
not necessary for health check and streaming replication check.

Note that I was not able to find a scenario for versions prior to 4.3.
Once I find one, I will back patch this to pre-4.3 branches.

Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2022-February/004128.html

3 years agoAdd release notes.
Masaya Kawamoto [Thu, 17 Feb 2022 00:09:10 +0000 (00:09 +0000)]
Add release notes.

3 years agoFixed the streaming replication check process not to retry if it cannot connect to...
Bo Peng [Thu, 10 Feb 2022 02:02:38 +0000 (11:02 +0900)]
Fixed the streaming replication check process not to retry if it cannot connect to the backend.

If the backend takes too long to respond and the connection times out,
the streaming replication check process keeps retrying.
These retries make failover take a long time.

3 years agoAdd validations of wd_lifecheck_password and recovery_password format
Masaya Kawamoto [Thu, 10 Feb 2022 01:39:52 +0000 (01:39 +0000)]
Add validations of wd_lifecheck_password and recovery_password format

This feature was once reverted by
dea2fbf65fdb3250f825e20f20fc3081779d8a3e due to a regression test
failure.

3 years agoFix missed static declaration.
Tatsuo Ishii [Wed, 9 Feb 2022 23:45:43 +0000 (08:45 +0900)]
Fix missed static declaration.

The static declaration of fork_follow_child() was missing. Also fix
typo in a comment.

3 years agoRemove ifdef out code.
Tatsuo Ishii [Tue, 8 Feb 2022 05:56:13 +0000 (14:56 +0900)]
Remove ifdef out code.

The ancient health check code no longer needs to be preserved.

3 years agoRefactor failover().
Tatsuo Ishii [Tue, 8 Feb 2022 05:26:29 +0000 (14:26 +0900)]
Refactor failover().

failover() was too large and hard to maintain. By refactoring it, the
size is reduced from 798 lines to 215 lines.  It is now split into the
following subroutines, which failover() just calls.

static int handle_failback_request(FAILOVER_CONTEXT *failover_context, int node_id);
static int handle_failover_request(FAILOVER_CONTEXT *failover_context, int node_id);
static void kill_failover_children(FAILOVER_CONTEXT *failover_context, int node_id);
static void exec_failover_command(FAILOVER_CONTEXT *failover_context, int new_main_node_id, int promote_node_id);
static int determine_new_primary_node(FAILOVER_CONTEXT *failover_context, int node_id);
static int exec_follow_primary_command(FAILOVER_CONTEXT *failover_context, int node_id, int new_primary_node_id);
static void save_node_info(FAILOVER_CONTEXT *failover_context, int new_primary_node_id, int new_main_node_id);
static void exec_child_restart(FAILOVER_CONTEXT *failover_context, int node_id);
static void exec_notice_pcp_child(FAILOVER_CONTEXT *failover_context);

3 years agoFixed mistakes introduced in the previous commit.
Bo Peng [Mon, 7 Feb 2022 03:19:01 +0000 (12:19 +0900)]
Fixed mistakes introduced in the previous commit.

3 years agoFixed sample failover script.
Bo Peng [Mon, 7 Feb 2022 03:06:45 +0000 (12:06 +0900)]
Fixed sample failover script.

This script did not consider the case when the old primary node id is "-1".

3 years agoFix failover() to deal with the case when no former primary node exists.
Tatsuo Ishii [Sun, 6 Feb 2022 08:11:52 +0000 (17:11 +0900)]
Fix failover() to deal with the case when no former primary node exists.

Consider a case where no primary node exists when Pgpool-II starts
up. In this case Req_info->primary_node_id is -1. failover() did not
consider this and skipped calling find_primary_node_repeatedly().
Also, follow_master_command was not executed if
Req_info->primary_node_id is -1.

This commit fixes the bug above.

Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2022-February/004114.html

3 years agoFix pgpool_setup in failover scrip creation.
Tatsuo Ishii [Sun, 6 Feb 2022 07:23:46 +0000 (16:23 +0900)]
Fix pgpool_setup in failover scrip creation.

When pgpool_setup creates failover.sh, it did not consider the case when
no primary server existed.

3 years agoAdd restriction about set_config.
Tatsuo Ishii [Wed, 2 Feb 2022 06:46:50 +0000 (15:46 +0900)]
Add restriction about set_config.

3 years agoFix integer overflow problem in streaming delay check worker process.
Tatsuo Ishii [Wed, 2 Feb 2022 02:18:55 +0000 (11:18 +0900)]
Fix integer overflow problem in streaming delay check worker process.

Per Coverity. Also use uint64 and UINT64_FORMAT for consistency.

3 years agoFix memory leak pointed out by Coverity.
Tatsuo Ishii [Wed, 2 Feb 2022 01:19:26 +0000 (10:19 +0900)]
Fix memory leak pointed out by Coverity.

Actually it's a false positive.

3 years agoFix health check process issues pointed out by Coverity.
Tatsuo Ishii [Wed, 2 Feb 2022 01:04:06 +0000 (10:04 +0900)]
Fix health check process issues pointed out by Coverity.

Fix a possible missing NUL terminator and a memory leak when running in test mode.

3 years agoAdjusting the field name in pcp_watchdog_info.
Muhammad Usama [Tue, 1 Feb 2022 13:57:13 +0000 (18:57 +0500)]
Adjusting the field name in pcp_watchdog_info.

Details in: https://www.pgpool.net/pipermail/pgpool-hackers/2021-December/004070.html

3 years agoEnhance parameter status handling.
Tatsuo Ishii [Mon, 31 Jan 2022 08:11:02 +0000 (17:11 +0900)]
Enhance parameter status handling.

When a parameter status message arrived from the backend, Pgpool-II
memorized it but did not forward it to the frontend.  This commit allows
forwarding a parameter status message to the frontend.

3 years agoRevert changes accidentally included in commit f9521fe4.
Tatsuo Ishii [Mon, 31 Jan 2022 07:54:44 +0000 (16:54 +0900)]
Revert changes accidentally included in commit f9521fe4.

Commit f9521fe4 (Implement flush tracking feature) accidentally
included changes for Parameter status change fix patch proposed in
[pgpool-hackers: 4103] Re: What to do with ParamterStatus?
https://www.pgpool.net/pipermail/pgpool-hackers/2022-January/004104.html

A commit should include only a single change. Unrelated changes should
not be brought in together. So revert that part (the Parameter status
change fix).

3 years agoFix long standing bug with pcp_node_info.
Tatsuo Ishii [Mon, 31 Jan 2022 02:45:33 +0000 (11:45 +0900)]
Fix long standing bug with pcp_node_info.

It appears that occasionally pcp_node_info shows the backend_status
field as "quarantine" when it should be "down". To show the status,
pcp_node_info first checks the backend_status member in the BackendInfo
struct. If it is 3, it then checks the quarantine member. If that is
other than 0, the backend_status field is shown as "quarantine". So if
garbage remains in the quarantine member, the status is shown as
"quarantine". The BackendInfo struct is transferred from the pcp_worker
process to the pcp frontend client. Unfortunately, when the quarantine
member was added by commit 54af632c, pcp_worker.c and pcp_frontend.c
were not modified to transfer the "quarantine" member.

The fix needs to be back patched to 3.7, where the "quarantine" member
was first added.
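
The display rule described above can be sketched as follows. NodeView,
SKETCH_STATUS_DOWN and status_label() are hypothetical names; only the
value 3 and the quarantine check come from the message.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_STATUS_DOWN 3        /* the value 3 mentioned in the message */

/* Hypothetical subset of the fields the client receives. */
typedef struct
{
    int backend_status;
    int quarantine;                 /* must be transferred, not left as garbage */
} NodeView;

/* Label for a node whose status is 3; other statuses are out of scope. */
const char *status_label(const NodeView *node)
{
    if (node->backend_status == SKETCH_STATUS_DOWN)
        return node->quarantine != 0 ? "quarantine" : "down";
    return "other";
}
```

Because any non-zero value selects "quarantine", a field that is never
filled in on the receiving side can produce the wrong label.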

Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2022-January/004110.html

3 years agoFix comment in libpcp_ext.h.
Tatsuo Ishii [Sun, 30 Jan 2022 09:58:20 +0000 (18:58 +0900)]
Fix comment in libpcp_ext.h.

Fix the comment for standby_delay_by_time: the unit used in
standby_delay is microseconds, not milliseconds, when
standby_delay_by_time is true.

3 years agoChange the way to obtain replication delay when delay_threshold_by_time is specified.
Tatsuo Ishii [Sat, 29 Jan 2022 07:59:26 +0000 (16:59 +0900)]
Change the way to obtain replication delay when delay_threshold_by_time is specified.

Use pg_stat_replication.replay_lag.  This makes the code much simpler
and yields a more precise replication delay. The only downside is that
pg_stat_replication.replay_lag is only available in PostgreSQL 10 or
later (the previous method can be used in 9.5 or later). I think
supporting the older versions is not worth the trouble and we should use
pg_stat_replication.replay_lag.

Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2022-January/004109.html

3 years agoAllow to specify replication delay by time.
Tatsuo Ishii [Fri, 28 Jan 2022 01:39:21 +0000 (10:39 +0900)]
Allow to specify replication delay by time.

delay_threshold specifies the replication delay upper limit in bytes.
Add a similar parameter called delay_threshold_by_time so that the limit
can be specified by time (seconds). The new parameter is effective if it
is greater than 0 and track_commit_timestamp (available in PostgreSQL
9.5 or later) is enabled. In this case "show pool_nodes" and
pcp_node_info display the replication delay in seconds. If the
parameter is set to 0 or track_commit_timestamp is not enabled,
delay_threshold_by_time is ignored and pgpool falls back to
delay_threshold mode.

For this purpose a new member standby_delay_by_time is added to the
shared memory data BackendInfo to distinguish whether replication delay
is measured in bytes (standby_delay_by_time == false) or seconds
(standby_delay_by_time == true). If standby_delay_by_time is true,
standby_delay is measured in seconds * 1000000, so that the precision
is 6 digits after the decimal point.
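
A sketch of the two interpretations of standby_delay described above.
seconds_to_standby_delay() and format_delay() are hypothetical helpers,
and the "%.6f" output format is an assumption made for the example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Store a delay given in seconds as seconds * 1000000 (microseconds). */
uint64_t seconds_to_standby_delay(double seconds)
{
    return (uint64_t) (seconds * 1000000.0 + 0.5);
}

/*
 * Render standby_delay either as seconds with 6 decimal digits (when
 * by_time is true) or as a plain byte count (when by_time is false).
 * Returns the snprintf() result.
 */
int format_delay(char *buf, size_t len, uint64_t standby_delay, bool by_time)
{
    if (by_time)
        return snprintf(buf, len, "%.6f", (double) standby_delay / 1000000.0);
    return snprintf(buf, len, "%" PRIu64, standby_delay);
}
```

The same stored number, 1500000, reads as 1.5 seconds in one mode and as
1500000 bytes in the other, which is why the flag has to travel with it.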

Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2021-December/004081.html

3 years agoFix pcp_node_info hang when pgpool cannot connect to backend.
Tatsuo Ishii [Wed, 26 Jan 2022 23:12:09 +0000 (08:12 +0900)]
Fix pcp_node_info hang when pgpool cannot connect to backend.

Since 4.3 pcp_node_info (and show pool_nodes) try to connect to all
backends to obtain the real-time, actual backend status.  In certain
cases connect(2) fails and keeps on retrying, and the command does not
complete. To fix this, the following modifications are made:

1) db_node_status(), which is responsible for probing whether a backend
is alive, connects to the backend with the "connect_timeout" parameter
enabled. The timeout is taken from the "connect_timeout" parameter of
pgpool.conf. If connect_timeout in pgpool.conf is disabled (i.e. set to
0), the timeout parameter is not set. In this case the command will
not complete.

2) db_node_role(), which is responsible for fetching the backend role
(primary or standby), does not retry connecting to the backend, by using
make_persistent_db_connection_noerror().

3) inform_node_info(), which is the workhorse of pcp_node_info, tries
to fetch info (thus calling db_node_status() and db_node_role()) only
for the specified backend. Previously it unconditionally accessed all
backends.

Problem reported and patch reviewed by Emond Papegaaij.

Discussion: https://www.pgpool.net/pipermail/pgpool-general/2022-January/008042.html

3 years agoAdd an extended query protocol test for flush tracking.
Tatsuo Ishii [Tue, 18 Jan 2022 05:52:47 +0000 (14:52 +0900)]
Add an extended query protocol test for flush tracking.

3 years agoImplement flush tracking feature.
Tatsuo Ishii [Tue, 18 Jan 2022 05:44:49 +0000 (14:44 +0900)]
Implement flush tracking feature.

When a flush message arrives from the frontend, any pending message from
the backend should be flushed and sent to the frontend. In order to do
that, this commit implements a "flush tracking" feature, i.e. when a
flush message arrives, pgpool sets a "flush pending" flag in each
pending message. If the response message from the backend corresponds to
a pending message with the flush pending flag set, the message is
immediately flushed to the frontend, rather than buffered.
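
The bookkeeping described above can be sketched with a small queue of
pending messages. PendingMsg, PendingQueue and the three functions are
hypothetical and much simpler than pgpool's real pending message
handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_PENDING 8               /* hypothetical queue size */

typedef struct
{
    char kind;                      /* e.g. 'P' parse, 'B' bind, 'E' execute */
    bool flush_pending;             /* set when a Flush arrived after it */
} PendingMsg;

typedef struct
{
    PendingMsg msgs[MAX_PENDING];
    int        count;
} PendingQueue;

/* Record a message sent to the backend. Returns -1 if the queue is full. */
int push_pending(PendingQueue *q, char kind)
{
    if (q->count >= MAX_PENDING)
        return -1;
    q->msgs[q->count].kind = kind;
    q->msgs[q->count].flush_pending = false;
    q->count++;
    return 0;
}

/* A Flush arrived from the frontend: mark everything queued so far. */
void mark_flush_pending(PendingQueue *q)
{
    for (int i = 0; i < q->count; i++)
        q->msgs[i].flush_pending = true;
}

/*
 * A response for the oldest pending message arrived. Remove it and report
 * whether the response should be flushed to the frontend immediately.
 */
bool pop_response_needs_flush(PendingQueue *q)
{
    if (q->count == 0)
        return false;

    bool flush = q->msgs[0].flush_pending;

    q->count--;
    memmove(&q->msgs[0], &q->msgs[1], (size_t) q->count * sizeof(PendingMsg));
    return flush;
}
```

Messages queued after the Flush keep their flag cleared, so their
responses are buffered as before.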

Discussion: https://www.pgpool.net/pipermail/pgpool-general/2022-January/008026.html

3 years agoEnhance pgproto.
Tatsuo Ishii [Mon, 17 Jan 2022 23:58:38 +0000 (08:58 +0900)]
Enhance pgproto.

Allow to show ParameterStatus's parameter name and value.

3 years agoChange the default value for log_line_prefix.
Tatsuo Ishii [Fri, 14 Jan 2022 07:35:31 +0000 (16:35 +0900)]
Change the default value for log_line_prefix.

Currently the default value for log_line_prefix is "%t pid: %p", which
is the long standing default. Since 4.2, %a (application name) is also
available, and since 4.3, %m (timestamp with milliseconds).

This commit changes "%t" (timestamp without milliseconds) to "%m"
because PostgreSQL's default is already %m.  Also add %a so that admins
can easily distinguish a particular process's log.

In summary the default is changed from:
"%t pid: %p"
to:
"%m %a pid: %p"

This change will be applied not only to the master branch but also to
4.3 stable. There are not many 4.3 users yet and changing the default
will not affect existing 4.3 users. Only new 4.3 installations will get
the new default.

Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2022-January/004098.html

3 years agoFix for a small mistake in pgpool-recovery SQL script
Muhammad Usama [Thu, 13 Jan 2022 07:59:24 +0000 (12:59 +0500)]
Fix for a small mistake in pgpool-recovery SQL script

3 years agoFix bug of wd_no_show_node_removal_timeout.
Bo Peng [Thu, 13 Jan 2022 06:00:26 +0000 (15:00 +0900)]
Fix bug of wd_no_show_node_removal_timeout.

Rename "wd_initial_node_showup_time" to "wd_no_show_node_removal_timeout".

3 years agoTest: fix pgpool_setup and watchdog_setup binary PATH in noinstall mode.
Bo Peng [Thu, 13 Jan 2022 05:34:22 +0000 (14:34 +0900)]
Test: fix pgpool_setup and watchdog_setup binary PATH in noinstall mode.

3 years agoUpdate Makefile.in
Bo Peng [Wed, 12 Jan 2022 10:54:07 +0000 (19:54 +0900)]
Update Makefile.in

3 years agoFix regression test 075.
Tatsuo Ishii [Wed, 12 Jan 2022 10:42:01 +0000 (19:42 +0900)]
Fix regression test 075.

The test reported success even if pgpool does not start up.
Problem reported and patch provided by Qiang Lingjie.
Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2022-January/004086.html

3 years agoFix bug of pgpool_remote_start.sample.
Bo Peng [Wed, 12 Jan 2022 10:15:00 +0000 (19:15 +0900)]
Fix bug of pgpool_remote_start.sample.

3 years agoDoc: enhance pgproto.
Tatsuo Ishii [Wed, 12 Jan 2022 05:07:05 +0000 (14:07 +0900)]
Doc: enhance pgproto.

Add documents for the previous commit.

3 years agoEnhance pgproto.
Tatsuo Ishii [Wed, 12 Jan 2022 05:02:27 +0000 (14:02 +0900)]
Enhance pgproto.

Add new command 'z'. This is similar to 'y' except that 'z' reads only 1
message and does not wait for "ready for query" to arrive (or times out
if no message arrives within 1 second).

3 years agoFix compiler warning.
Tatsuo Ishii [Sat, 8 Jan 2022 03:59:49 +0000 (12:59 +0900)]
Fix compiler warning.

In the commit "Suppress message length log for in_hot_standby." I
forgot to push a modification to
src/include/protocol/pool_proto_modules.h which caused a compiler
warning.

3 years agoDoc: add restriction regarding ParameterStatus and in_hot_standby parameter.
Tatsuo Ishii [Fri, 7 Jan 2022 01:07:55 +0000 (10:07 +0900)]
Doc: add restriction regarding ParameterStatus and in_hot_standby parameter.

3 years agoSuppress message length log for in_hot_standby.
Tatsuo Ishii [Wed, 5 Jan 2022 04:50:59 +0000 (13:50 +0900)]
Suppress message length log for in_hot_standby.

PostgreSQL 14 introduced a new config parameter: in_hot_standby
https://www.postgresql.org/docs/14/runtime-config-preset.html
The value is either "on" for standby servers or "off" for primary
servers. As a result the pgpool log is flooded by the messages:

2021-12-16 10:40:34.855: psql pid 366965: LOG:  reading message length
2021-12-16 10:40:34.855: psql pid 366965: DETAIL:  message length (22) in slot 1 does not match with slot 0(23)

To avoid this, only complain if the parameter name is not in_hot_standby.
Also the message is enhanced to show the parameter name.

2022-01-05 13:05:15.993: psql pid 642877: LOG:  ParameterStatus "TimeZone": node 1 message length 30 is different from main node message length 24
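
The rule "only complain if the parameter name is not in_hot_standby" can
be sketched as below. should_report_length_mismatch() is a hypothetical
helper, not the function in the pgpool source.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Decide whether a ParameterStatus length difference between the main
 * node and another node is worth logging. in_hot_standby is expected to
 * differ between primary ("off") and standby ("on"), so it is skipped.
 */
bool should_report_length_mismatch(const char *param_name,
                                   int main_len, int node_len)
{
    if (main_len == node_len)
        return false;
    return strcmp(param_name, "in_hot_standby") != 0;
}
```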

Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2021-December/004077.html

3 years agoDoc: add "exclude" settings to /etc/yum.repos.d/pgdg-redhat-all.repo so that Pgpool...
Bo Peng [Tue, 4 Jan 2022 07:43:44 +0000 (16:43 +0900)]
Doc: add "exclude" settings to /etc/yum.repos.d/pgdg-redhat-all.repo so that Pgpool-II is not installed from PostgreSQL YUM repository.

3 years agoDoc: fix documentation typos.
pengbo [Tue, 4 Jan 2022 05:34:19 +0000 (14:34 +0900)]
Doc: fix documentation typos.

Patch is created by Umar Hayat.

3 years agoDoc: fix release notes.
Masaya Kawamoto [Thu, 23 Dec 2021 07:36:30 +0000 (07:36 +0000)]
Doc: fix release notes.

3 years agoAdd release notes.
Masaya Kawamoto [Wed, 22 Dec 2021 08:57:53 +0000 (08:57 +0000)]
Add release notes.

3 years agoAllow to run regression test against existing installation without recompiling.
Tatsuo Ishii [Wed, 22 Dec 2021 01:13:20 +0000 (10:13 +0900)]
Allow to run regression test against existing installation without recompiling.

It is possible to run regression test using existing installation.

regress.sh -m noinstall

However, some of the tests fail in this case because they require
pgpool to be compiled with the variable HEALTHCHECK_DEBUG set. This is
only possible with the following procedure.

make clean
cd src/test/regression
./regress.sh

To run the regression test against an existing installation, a new
config variable "health_check_test" is added. The source code is always
compiled as if HEALTHCHECK_DEBUG is set. The test facility is not
activated unless health_check_test is set to on.

For now I push to only the master branch. After some tests, I am going
to push to all supported branches. I know adding a new parameter to
stable branches is unusual, but the feature is for enhancing tests and
it is not visible to ordinary users. So I think my plan is justified.

Discussion: https://www.pgpool.net/pipermail/pgpool-hackers/2021-December/004078.html

3 years agoRevert "Add validations of wd_lifecheck_password and recovery_password format"
Tatsuo Ishii [Fri, 10 Dec 2021 23:51:27 +0000 (08:51 +0900)]
Revert "Add validations of wd_lifecheck_password and recovery_password format"

This reverts commit 0f587d1741f20140ea8a2293b40946de27aec736.

This commit caused failure in regression test due to pcp_recovery_node_error:
recovery node 1...ERROR:  invalid password format for recovery_user: t-ishii
DETAIL:  md5 hashed password is not allowed here

3 years agoAdd validations of wd_lifecheck_password and recovery_password format
Masaya Kawamoto [Fri, 10 Dec 2021 04:26:45 +0000 (04:26 +0000)]
Add validations of wd_lifecheck_password and recovery_password format

wd_lifecheck_password and recovery_password must not be in md5 hashed
password format, but pgpool did not check this.
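
As an illustration only, a format check of this kind could look like the
sketch below. It relies on the PostgreSQL convention that an md5 hashed
password is the literal prefix "md5" followed by 32 hexadecimal digits.
The function name is hypothetical, and whether the commit tested the
format in exactly this way is an assumption.

```python
import re

# "md5" followed by 32 lowercase hex digits, the PostgreSQL md5 hash layout.
_MD5_HASHED = re.compile(r"md5[0-9a-f]{32}")


def looks_md5_hashed(password: str) -> bool:
    """Return True if the whole string has the md5 hashed password layout.

    Hypothetical helper: a caller validating wd_lifecheck_password or
    recovery_password would reject the value when this returns True.
    """
    return _MD5_HASHED.fullmatch(password) is not None
```

A plain text password that merely starts with "md5" is not flagged,
because the check requires the full 35-character layout.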

3 years agoDoc: sync with English manual.
Tatsuo Ishii [Thu, 9 Dec 2021 23:25:29 +0000 (08:25 +0900)]
Doc: sync with English manual.

3 years agoDoc: fix typo in pcp_watchdog_info manual.
Tatsuo Ishii [Thu, 9 Dec 2021 23:24:27 +0000 (08:24 +0900)]
Doc: fix typo in pcp_watchdog_info manual.

3 years agoDoc: fix typos
Bo Peng [Mon, 6 Dec 2021 12:00:24 +0000 (21:00 +0900)]
Doc: fix typos

Patch created by Lu Chenyang.

3 years agoDoc: update japanese configuration example for SR mode.
Masaya Kawamoto [Mon, 6 Dec 2021 11:29:47 +0000 (11:29 +0000)]
Doc: update japanese configuration example for SR mode.

3 years agoDoc: update doc version.
Bo Peng [Mon, 6 Dec 2021 05:46:08 +0000 (14:46 +0900)]
Doc: update doc version.

3 years agoUpdate Makefile.
Bo Peng [Mon, 6 Dec 2021 02:35:44 +0000 (11:35 +0900)]
Update Makefile.

3 years agoUpdate Makefile.
Bo Peng [Mon, 6 Dec 2021 01:49:04 +0000 (10:49 +0900)]
Update Makefile.

3 years agoDoc: update docs.
Bo Peng [Mon, 6 Dec 2021 01:32:50 +0000 (10:32 +0900)]
Doc: update docs.

3 years agoDoc: add mention about add native/SI mode example to 4.3 release note.
Tatsuo Ishii [Mon, 6 Dec 2021 00:18:43 +0000 (09:18 +0900)]
Doc: add mention about add native/SI mode example to 4.3 release note.

3 years agoRename recovery_2nd_stage.sample to replication_mode_recovery_2nd_stage.sample
Bo Peng [Sun, 5 Dec 2021 17:35:03 +0000 (02:35 +0900)]
Rename recovery_2nd_stage.sample to replication_mode_recovery_2nd_stage.sample

3 years agoUpdate pgpool.spec.
Bo Peng [Sun, 5 Dec 2021 17:33:29 +0000 (02:33 +0900)]
Update pgpool.spec.

3 years agoDoc: add new configuration example for replication mode and si mode.
Bo Peng [Sun, 5 Dec 2021 17:26:41 +0000 (02:26 +0900)]
Doc: add new configuration example for replication mode and si mode.

3 years agoDoc: update documentation "Pgpool-II + Watchdog Setup Example".
Bo Peng [Sun, 5 Dec 2021 16:55:34 +0000 (01:55 +0900)]
Doc: update documentation "Pgpool-II + Watchdog Setup Example".

3 years agoSuppress bison warnings regarding yacc incompatibility.
Tatsuo Ishii [Fri, 3 Dec 2021 05:40:23 +0000 (14:40 +0900)]
Suppress bison warnings regarding yacc incompatibility.

Running bison without yacc compatibility may introduce some risk, so just suppress the warnings.

3 years agoDoc: fix typos.
Bo Peng [Thu, 25 Nov 2021 06:19:45 +0000 (15:19 +0900)]
Doc: fix typos.

3 years agoAdd 4.3RC1 release notes.
Bo Peng [Wed, 24 Nov 2021 13:43:41 +0000 (22:43 +0900)]
Add 4.3RC1 release notes.

3 years agoFix mention about insert_lock in 4.3 release note.
Tatsuo Ishii [Tue, 23 Nov 2021 23:40:20 +0000 (08:40 +0900)]
Fix mention about insert_lock in 4.3 release note.

The compile default was changed twice while preparing 4.3: first from
on to off, then finally from off to on. To avoid confusion, remove the
"on to off" part.

3 years agoFix redundant code.
Tatsuo Ishii [Mon, 22 Nov 2021 07:31:39 +0000 (16:31 +0900)]
Fix redundant code.

Patch contributed by Lu Chenyang.

3 years agoAdd release notes for Pgpool-II 4.2.6, 4.3beta2.
Masaya Kawamoto [Thu, 18 Nov 2021 09:04:20 +0000 (09:04 +0000)]
Add release notes for Pgpool-II 4.2.6, 4.3beta2.

3 years agoReject extraneous data after SSL encryption handshake.
Tatsuo Ishii [Wed, 17 Nov 2021 10:26:11 +0000 (19:26 +0900)]
Reject extraneous data after SSL encryption handshake.

In the server side implementation of SSL negotiation
(pool_ssl_negotiate_serverclient()), it was possible for a
man-in-the-middle attacker to inject arbitrary SQL commands. This is
possible if Pgpool-II is configured to use cert authentication or
hostssl + trust. This resembles PostgreSQL's CVE-2021-23214.

Similarly, in the client side implementation of SSL negotiation
(pool_ssl_negotiate_clientserver()), it was possible for a
man-in-the-middle attacker to inject arbitrary responses. This is
possible if PostgreSQL is using trust authentication with a clientcert
requirement. It is not possible with cert authentication because
Pgpool-II does not implement the cert authentication between Pgpool-II
and PostgreSQL. This resembles PostgreSQL's CVE-2021-23222.

To fix these, reject extraneous data in the read buffer after the SSL
encryption handshake.
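
The idea of the fix can be modeled with a toy read buffer. This is a
sketch under assumptions, not the Pgpool-II code: the ReadBuffer class,
the ProtocolViolation exception and accept_ssl_request() are
hypothetical names. Only the SSLRequest layout (length 8, request code
80877103) comes from the PostgreSQL protocol.

```python
import struct

SSL_REQUEST_LEN = 8          # 4-byte length + 4-byte request code
SSL_REQUEST_CODE = 80877103  # PostgreSQL protocol SSLRequest code


class ProtocolViolation(Exception):
    """Raised when the peer sends something the handshake does not allow."""


class ReadBuffer:
    """Toy stand-in for a connection's userspace read buffer (hypothetical)."""

    def __init__(self, data: bytes = b"") -> None:
        self._data = bytearray(data)

    def read(self, n: int) -> bytes:
        chunk = bytes(self._data[:n])
        del self._data[:n]
        return chunk

    def pending(self) -> int:
        return len(self._data)


def accept_ssl_request(buf: ReadBuffer) -> None:
    """Consume the SSLRequest packet and refuse any bytes buffered after it."""
    packet = buf.read(SSL_REQUEST_LEN)
    if len(packet) != SSL_REQUEST_LEN:
        raise ProtocolViolation("short SSLRequest packet")
    length, code = struct.unpack("!II", packet)
    if length != SSL_REQUEST_LEN or code != SSL_REQUEST_CODE:
        raise ProtocolViolation("not an SSLRequest packet")
    # Bytes still in the buffer were sent in cleartext before encryption
    # started, so they could have been injected by a man in the middle.
    if buf.pending() > 0:
        raise ProtocolViolation("extraneous data after SSL request")
```

The point of the check is that nothing read before the encrypted channel
is established may be carried over and processed as if it had arrived
over that channel.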

3 years agoDeal with PostgreSQL 14 while processing pg_terminate_backend().
Tatsuo Ishii [Tue, 16 Nov 2021 00:45:31 +0000 (09:45 +0900)]
Deal with PostgreSQL 14 while processing pg_terminate_backend().

Do not reject the two-argument form of pg_terminate_backend(), as
PostgreSQL 14 and later accept two arguments.
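
A minimal sketch of the relaxed check, assuming the call's arguments
have already been extracted as strings. The function name and its
behavior of returning the pid are hypothetical and are not taken from
the Pgpool-II source.

```python
from typing import Optional, Sequence


def terminate_backend_pid(args: Sequence[str]) -> Optional[int]:
    """Return the target pid of a pg_terminate_backend() call, or None.

    One argument (pid) and two arguments (pid, timeout) are both accepted;
    per the commit message, the latter form exists in PostgreSQL 14 or later.
    """
    if len(args) not in (1, 2):
        return None
    first = args[0].strip()
    # Only a plain integer literal identifies a backend in this sketch.
    if not (first.isascii() and first.isdigit()):
        return None
    return int(first)
```

Here the second argument is not inspected at all; the sketch only shows
that its presence no longer causes the call to be rejected.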

3 years agoFix occasional 073.pg_terminate_backend regression test failure.
Tatsuo Ishii [Tue, 16 Nov 2021 00:02:47 +0000 (09:02 +0900)]
Fix occasional 073.pg_terminate_backend regression test failure.

The test used the "ps -ef" command to find the process that is running
the SELECT command. However, in some cases "ps -ef" omits part of
"SELECT" in its output, and this made the test fail.
So use "ps -efw" instead of "ps -ef" to prevent it.

3 years agoDoc: update 4.3 release note.
Tatsuo Ishii [Tue, 9 Nov 2021 01:27:36 +0000 (10:27 +0900)]
Doc: update 4.3 release note.

Suggestion from Muhammad Usama.

3 years agoDoc: add Japanese translation for commit a683288045332fdd8ea7cd7038510353cb3dbfc
Tatsuo Ishii [Mon, 8 Nov 2021 08:14:01 +0000 (17:14 +0900)]
Doc: add Japanese translation for commit a683288045332fdd8ea7cd7038510353cb3dbfc