<programlisting>
# pcp_watchdog_info -h 192.168.137.150 -p 9898 -U pgpool
Password:
-3 YES server1:9999 Linux server1 server1
+3 3 YES server1:9999 Linux server1 server1
-server1:9999 Linux server1 server1 9999 9000 4 LEADER #The Pgpool-II server started first became "LEADER".
-server2:9999 Linux server2 server2 9999 9000 7 STANDBY #run as standby
-server3:9999 Linux server3 server3 9999 9000 7 STANDBY #run as standby
+server1:9999 Linux server1 server1 9999 9000 4 LEADER 0 MEMBER #The Pgpool-II server started first became "LEADER".
+server2:9999 Linux server2 server2 9999 9000 7 STANDBY 0 MEMBER #run as standby
+server3:9999 Linux server3 server3 9999 9000 7 STANDBY 0 MEMBER #run as standby
</programlisting>
<para>
Stop active server <literal>server1</literal>, then <literal>server2</literal> or
+<literal>server3</literal> will be promoted to the new leader.
+</para>
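+<para>
+For example, if <productname>Pgpool-II</productname> is managed as a systemd service
+named <literal>pgpool</literal> on these servers (an assumption of this sketch; adjust
+to how the service is actually started), the current leader can be stopped with:
+<programlisting>
+[server1]# systemctl stop pgpool.service
+</programlisting>
+Afterwards, check the watchdog status again from one of the remaining nodes:
+</para>
+<programlisting>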
# pcp_watchdog_info -p 9898 -h 192.168.137.150 -U pgpool
Password:
-3 YES server2:9999 Linux server2 server2
+3 3 YES server2:9999 Linux server2 server2
-server2:9999 Linux server2 server2 9999 9000 4 LEADER #server2 is promoted to LEADER
-server1:9999 Linux server1 server1 9999 9000 10 SHUTDOWN #server1 is stopped
-server3:9999 Linux server3 server3 9999 9000 7 STANDBY #server3 runs as STANDBY
+server2:9999 Linux server2 server2 9999 9000 4 LEADER 0 MEMBER #server2 is promoted to LEADER
+server1:9999 Linux server1 server1 9999 9000 10 SHUTDOWN 0 MEMBER #server1 is stopped
+server3:9999 Linux server3 server3 9999 9000 7 STANDBY 0 MEMBER #server3 runs as STANDBY
</programlisting>
<para>
Start <productname>Pgpool-II</productname> on <literal>server1</literal> again (the instance we stopped above),
+and verify that it rejoins the watchdog cluster as a standby.
+</para>
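+<para>
+Again assuming a systemd service named <literal>pgpool</literal>:
+<programlisting>
+[server1]# systemctl start pgpool.service
+</programlisting>
+<literal>server2</literal> keeps the leader role and <literal>server1</literal> comes back as a standby:
+</para>
+<programlisting>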
[server1]# pcp_watchdog_info -p 9898 -h 192.168.137.150 -U pgpool
Password:
-3 YES server2:9999 Linux server2 server2
+3 3 YES server2:9999 Linux server2 server2
-server2:9999 Linux server2 server2 9999 9000 4 LEADER
-server1:9999 Linux server1 server1 9999 9000 7 STANDBY
-server3:9999 Linux server3 server3 9999 9000 7 STANDBY
+server2:9999 Linux server2 server2 9999 9000 4 LEADER 0 MEMBER
+server1:9999 Linux server1 server1 9999 9000 7 STANDBY 0 MEMBER
+server3:9999 Linux server3 server3 9999 9000 7 STANDBY 0 MEMBER
</programlisting>
</sect3>
<programlisting>
$ pcp_watchdog_info -h localhost -U postgres
- 3 NO Linux_host1.localdomain_9991 host1
+ 3 3 NO Linux_host1.localdomain_9991 host1
- Linux_host1.localdomain_9991 host1 9991 9001 7 STANDBY
- Linux_host2.localdomain_9992 host2 9992 9002 4 LEADER
- Linux_host3.localdomain_9993 host3 9993 9003 7 STANDBY
+ Linux_host1.localdomain_9991 host1 9991 9001 7 STANDBY 0 MEMBER
+ Linux_host2.localdomain_9992 host2 9992 9002 4 LEADER 0 MEMBER
+ Linux_host3.localdomain_9993 host3 9993 9003 7 STANDBY 0 MEMBER
</programlisting>
</para>
<para>
The first output line describes the watchdog cluster information:
<literallayout class="monospaced">
1. Total watchdog nodes in the cluster
- 2. Is VIP is up on current node?
- 3. Leader node name
- 4. Leader node host
+ 2. Total watchdog nodes in the cluster with active membership
+ 3. Is VIP up on the current node?
+ 4. Leader node name
+ 5. Leader node host
</literallayout>
<literallayout class="monospaced">
Next is the list of watchdog nodes:
1. node name
2. hostname
3. pgpool port
4. watchdog port
5. current node state
6. current node state name
+ 7. current cluster membership status
+ 8. current cluster membership status name
</literallayout>
</para>
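+<para>
+As a rough illustration of this field layout (not part of the
+<literal>pcp_watchdog_info</literal> output itself), the leader node's name could be
+extracted with a one-liner such as the following, assuming a
+<filename>.pcppass</filename> file is set up so that <literal>-w</literal> can skip
+the password prompt:
+<programlisting>
+$ pcp_watchdog_info -h localhost -U postgres -w | awk 'NR > 1 &amp;&amp; $6 == "LEADER" { print $1 }'
+Linux_host2.localdomain_9992
+</programlisting>
+</para>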
<para>
<programlisting>
$ pcp_watchdog_info -h localhost -v -U postgres
Watchdog Cluster Information
- Total Nodes : 3
- Remote Nodes : 2
- Quorum state : QUORUM EXIST
- Alive Remote Nodes : 2
- VIP up on local node : NO
- Leader Node Name : Linux_host2.localdomain_9992
- Leader Host Name : localhost
+ Total Nodes : 3
+ Remote Nodes : 2
+ Member Remote Nodes : 2
+ Nodes required for quorum: 2
+ Quorum state : QUORUM EXIST
+ Alive Remote Nodes : 2
+ VIP up on local node : NO
+ Leader Node Name : Linux_host2.localdomain_9992
+ Leader Host Name : localhost
Watchdog Node Information
- Node Name : Linux_host1.localdomain_9991
- Host Name : host1
- Delegate IP : 192.168.1.10
- Pgpool port : 9991
- Watchdog port : 9001
- Node priority : 1
- Status : 7
- Status Name : STANDBY
-
- Node Name : Linux_host2.localdomain_9992
- Host Name : host2
- Delegate IP : 192.168.1.10
- Pgpool port : 9992
- Watchdog port : 9002
- Node priority : 1
- Status : 4
- Status Name : LEADER
-
- Node Name : Linux_host3.localdomain_9993
- Host Name : host3
- Delegate IP : 192.168.1.10
- Pgpool port : 9993
- Watchdog port : 9003
- Node priority : 1
- Status : 7
- Status Name : STANDBY
+ Node Name : Linux_host1.localdomain_9991
+ Host Name : host1
+ Delegate IP : 192.168.1.10
+ Pgpool port : 9991
+ Watchdog port : 9001
+ Node priority : 1
+ Status : 7
+ Status Name : STANDBY
+ Membership Status : MEMBER
+
+ Node Name : Linux_host2.localdomain_9992
+ Host Name : host2
+ Delegate IP : 192.168.1.10
+ Pgpool port : 9992
+ Watchdog port : 9002
+ Node priority : 1
+ Status : 4
+ Status Name : LEADER
+ Membership Status : MEMBER
+
+ Node Name : Linux_host3.localdomain_9993
+ Host Name : host3
+ Delegate IP : 192.168.1.10
+ Pgpool port : 9993
+ Watchdog port : 9003
+ Node priority : 1
+ Status : 7
+ Status Name : STANDBY
+ Membership Status : MEMBER
</programlisting>
</refsect1>
<para>
<screen>
$ pcp_watchdog_info -p 50001 -v
Password:
Watchdog Cluster Information
- Total Nodes : 3
- Remote Nodes : 2
- Quorum state : QUORUM EXIST
- Alive Remote Nodes : 2
- VIP up on local node : NO
- Leader Node Name : Linux_tishii-CF-SX3HE4BP_50004
- Leader Host Name : localhost
+ Total Nodes : 3
+ Remote Nodes : 2
+ Member Remote Nodes : 2
+ Nodes required for quorum: 2
+ Quorum state : QUORUM EXIST
+ Alive Remote Nodes : 2
+ VIP up on local node : NO
+ Leader Node Name : Linux_tishii-CF-SX3HE4BP_50004
+ Leader Host Name : localhost
Watchdog Node Information
- Node Name : Linux_tishii-CF-SX3HE4BP_50000
- Host Name : localhost
- Delegate IP : Not_Set
- Pgpool port : 50000
- Watchdog port : 50002
- Node priority : 1
- Status : 7
- Status Name : STANDBY
-
- Node Name : Linux_tishii-CF-SX3HE4BP_50004
- Host Name : localhost
- Delegate IP : Not_Set
- Pgpool port : 50004
- Watchdog port : 50006
- Node priority : 1
- Status : 4
- Status Name : LEADER
-
- Node Name : Linux_tishii-CF-SX3HE4BP_50008
- Host Name : localhost
- Delegate IP : Not_Set
- Pgpool port : 50008
- Watchdog port : 50010
- Node priority : 1
- Status : 7
- Status Name : STANDBY
+ Node Name : Linux_tishii-CF-SX3HE4BP_50000
+ Host Name : localhost
+ Delegate IP : Not_Set
+ Pgpool port : 50000
+ Watchdog port : 50002
+ Node priority : 1
+ Status : 7
+ Status Name : STANDBY
+ Membership Status : MEMBER
+
+ Node Name : Linux_tishii-CF-SX3HE4BP_50004
+ Host Name : localhost
+ Delegate IP : Not_Set
+ Pgpool port : 50004
+ Watchdog port : 50006
+ Node priority : 1
+ Status : 4
+ Status Name : LEADER
+ Membership Status : MEMBER
+
+ Node Name : Linux_tishii-CF-SX3HE4BP_50008
+ Host Name : localhost
+ Delegate IP : Not_Set
+ Pgpool port : 50008
+ Watchdog port : 50010
+ Node priority : 1
+ Status : 7
+ Status Name : STANDBY
+ Membership Status : MEMBER
</screen>
</para>
</variablelist>
</sect2>
+ <sect2 id="config-watchdog-cluster-membership">
+ <title>Controlling the watchdog cluster membership </title>
+
+ <para>
+ By default the watchdog cluster consists of all watchdog nodes
+ defined in the <filename>pgpool.conf</filename> file,
+ irrespective of the current state of each node. Whether a node
+ is <literal>LOST</literal>, <literal>SHUTDOWN</literal>
+ or has never been started, it is considered a member of the
+ watchdog cluster as long as it is present in the
+ configuration file.
+ All majority rule computations for determining the existence
+ of the quorum and resolving consensus are based on the
+ number of watchdog nodes that make up the watchdog cluster.
+ </para>
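+
+ <para>
+ For illustration (a sketch only; the host names follow the earlier example),
+ a three-node cluster defined along these lines always counts three nodes in the
+ quorum computation, so at least two of them must be alive, regardless of how many
+ are actually running:
+ <programlisting>
+ # pgpool.conf excerpt (the same on every node)
+ hostname0 = 'server1'
+ wd_port0 = 9000
+ pgpool_port0 = 9999
+
+ hostname1 = 'server2'
+ wd_port1 = 9000
+ pgpool_port1 = 9999
+
+ hostname2 = 'server3'
+ wd_port2 = 9000
+ pgpool_port2 = 9999
+ </programlisting>
+ </para>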
+
+ <para>
+ <productname>Pgpool-II</productname> <emphasis>V4.3</emphasis> enables
+ a dynamic cluster definition by introducing the concept of
+ <emphasis>Member</emphasis> and <emphasis>Nonmember</emphasis> nodes.
+ If a node's membership is revoked from the watchdog cluster,
+ the cluster re-calibrates itself dynamically and adjusts all
+ subsequent majority rule computations.
+ </para>
+ <para>
+ All majority rule computations are then based on the number of
+ member watchdog nodes instead of the total number of configured nodes.
+ </para>
+
+ <para>
+ For example, in a five-node cluster (<filename>pgpool.conf</filename> has five
+ watchdog nodes defined), at least three nodes need to be alive to form the quorum.
+ With the dynamic cluster membership mechanism the cluster can re-adjust
+ itself to count only the <emphasis>MEMBER</emphasis> nodes
+ (a member node does not necessarily need to be alive).
+ That means that effectively even a single alive node
+ can fulfill the quorum requirement (depending on the membership criteria settings)
+ if at some point in time the cluster has only one or two member nodes.
+ </para>
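+
+ <para>
+ A sketch of the arithmetic for that five-node cluster, assuming the missing
+ nodes are removed from the cluster by the settings described below:
+ <literallayout class="monospaced">
+ configured nodes : 5  -> quorum needs 3 alive nodes
+ two nodes removed from the cluster as Nonmember
+ member nodes     : 3  -> quorum needs 2 alive nodes
+ </literallayout>
+ </para>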
+
+ <para>
+ The following settings configure when a node is marked as a Nonmember;
+ a combined configuration sketch is shown after the parameter descriptions below.
+ Leaving all of these settings at their default values retains the
+ pre-<emphasis>V4.3</emphasis> behaviour.
+ </para>
+
+ <variablelist>
+
+ <varlistentry id="guc-wd-remove-shutdown-nodes" xreflabel="wd_remove_shutdown_nodes">
+ <term><varname>wd_remove_shutdown_nodes</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>wd_remove_shutdown_nodes</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ When enabled, <literal>SHUTDOWN</literal> nodes are immediately marked as Nonmember
+ and removed from the cluster. If a previously shut-down node starts again,
+ it is added back to the cluster automatically.
+ </para>
+ <para> Default is off. </para>
+ </listitem>
+ </varlistentry>
+
+
+ <varlistentry id="guc-wd-lost-node-removal-timeout" xreflabel="wd_lost_node_removal_timeout">
+ <term><varname>wd_lost_node_removal_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>wd_lost_node_removal_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Timeout in seconds before a <literal>LOST</literal> watchdog node is marked as
+ Nonmember and removed from the cluster.
+ When a LOST node re-connects to the cluster, its cluster membership is restored.
+ </para>
+ <para>
+ Setting the timeout to <literal>0</literal> disables the removal
+ of LOST nodes from the cluster.
+ </para>
+ <para> Default is 0. </para>
+
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="guc-wd-no-show-node-removal-timeout" xreflabel="wd_no_show_node_removal_timeout">
+ <term><varname>wd_no_show_node_removal_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>wd_no_show_node_removal_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Timeout in seconds before a node is marked as Nonmember if it does not show up
+ at cluster initialisation. A Nonmember node becomes a cluster Member as
+ soon as it starts up and connects to the cluster.
+ </para>
+ <para>
+ Setting the timeout to <literal>0</literal> disables the removal
+ of <literal>NO-SHOW</literal> nodes from the cluster.
+ </para>
+ <para> Default is 0. </para>
+
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
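+
+ <para>
+ A combined configuration sketch (the timeout values are only examples): remove
+ shut-down nodes immediately, remove <literal>LOST</literal> nodes after 30 seconds,
+ and remove nodes that never show up within five minutes of cluster initialisation:
+ <programlisting>
+ wd_remove_shutdown_nodes = on
+ wd_lost_node_removal_timeout = 30
+ wd_no_show_node_removal_timeout = 300
+ </programlisting>
+ </para>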
+ </sect2>
+
<sect2 id="config-watchdog-lifecheck">
<title>Life checking <productname>Pgpool-II</productname> </title>