Fixing the design of failover command propagation on watchdog cluster
Overhauling the design of how failover, failback and promote node commands are
propagated to the watchdog nodes. Previously the watchdog on pgpool-II node that
needs to perform the node command (failover, failback or promote node) used to
broadcast the failover command to all attached pgpool-II nodes. And this
sometimes makes the synchronization issues, especially when the watchdog cluster
contains a large number of nodes and consequently the failover command sometimes
gets executed by more than one pgpool-II.
Now with this commit all the node commands are forwarded to the
master/coordinator watchdog, which in turn propagates to all standby nodes.
Apart from above the commit also changes the failover command interlocking
mechanism and now only the master/coordinator node can become the lock holder
so the failover commands will only get executed on the master/coordinator node.
15 files changed: