Fix watchdog leader sync process to start health check process.
author Tatsuo Ishii <ishii@sraoss.co.jp>
Mon, 8 Feb 2021 11:30:18 +0000 (20:30 +0900)
committer Tatsuo Ishii <ishii@sraoss.co.jp>
Mon, 8 Feb 2021 11:36:30 +0000 (20:36 +0900)
When a watchdog receives a status change request from another watchdog
node and calls sync_backend_from_watchdog() to sync with the status of
the leader node, it did not start the health check process. For example,

1) initial pgpool_status file indicates DB node 1 is down.
2) pgpool starts up but only starts health check process for DB node 0
   because node 1 is in down status.
3) pcp_attach_node is issued to a pgpool node other than the leader.
4) The leader node updates the node status for DB node 1 and the other
   nodes sync the status. Since sync_backend_from_watchdog() does not
   start a health check process, only the leader pgpool node starts a
   health check process for DB node 1; the other nodes do not.

To fix this, start the health check process if necessary in
sync_backend_from_watchdog().
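
The "start if necessary" rule can be sketched in isolation as follows. This
is a minimal illustration, not the pgpool code: every sketch_* name is
hypothetical, the pid array stands in for health_check_pids[], and the fork
helper returns a fake pid instead of calling worker_fork_a_child(), so the
behavior stays deterministic.

```c
#include <stdio.h>
#include <sys/types.h>

#define SKETCH_NUM_BACKENDS 2	/* hypothetical: DB node 0 and DB node 1 */

/* Hypothetical stand-in for health_check_pids[]; 0 means "no process yet". */
static pid_t sketch_health_check_pids[SKETCH_NUM_BACKENDS];

/*
 * Hypothetical stand-in for worker_fork_a_child(). It does not fork; it
 * returns a fake, non-zero pid derived from the node id.
 */
static pid_t
sketch_fork_health_check(int node_id)
{
	return (pid_t) (1000 + node_id);
}

/*
 * Start a health check process for every node that does not have one.
 * Nodes with a non-zero pid are left alone, so the call is safe to repeat.
 * Returns the number of processes started.
 */
static int
sketch_start_missing_health_checks(void)
{
	int			i;
	int			started = 0;

	for (i = 0; i < SKETCH_NUM_BACKENDS; i++)
	{
		if (sketch_health_check_pids[i] == 0)
		{
			printf("start health check process for node %d\n", i);
			sketch_health_check_pids[i] = sketch_fork_health_check(i);
			started++;
		}
	}
	return started;
}
```

In the scenario above, a non-leader node would have a pid recorded for DB
node 0 and 0 for DB node 1, so one call after the status sync starts exactly
one process, and a second call starts none.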

src/main/pgpool_main.c

index ed1bfef9be5dd5798baeeefd28ecbce95bfd237d..c91db22d79a83001394ddbd5d41d037acde19243 100644
@@ -2000,8 +2000,8 @@ failover(void)
                                {
                                        ereport(LOG,
                                                        (errmsg("start health check process for host %s(%d)",
-                                                                       BACKEND_INFO(node_id).backend_hostname,
-                                                                       BACKEND_INFO(node_id).backend_port)));
+                                                                       BACKEND_INFO(i).backend_hostname,
+                                                                       BACKEND_INFO(i).backend_port)));
 
                                        health_check_pids[i] = worker_fork_a_child(PT_HEALTH_CHECK, do_health_check_child, &i);
                                }
@@ -4123,6 +4123,20 @@ sync_backend_from_watchdog(void)
         * Send restart request to worker child.
         */
        kill(worker_pid, SIGUSR1);
+
+       /* Fork health check process if needed */
+       for (i = 0; i < NUM_BACKENDS; i++)
+       {
+               if (health_check_pids[i] == 0)
+               {
+                       ereport(LOG,
+                                       (errmsg("start health check process for host %s(%d)",
+                                                       BACKEND_INFO(i).backend_hostname,
+                                                       BACKEND_INFO(i).backend_port)));
+
+                       health_check_pids[i] = worker_fork_a_child(PT_HEALTH_CHECK, do_health_check_child, &i);
+               }
+       }
 }
 
 /*