This section describes how to use AdminAPI to monitor an InnoDB Cluster.
To get information about the structure of the InnoDB Cluster
itself, use the Cluster.describe()
function:
mysql-js> cluster.describe();
{
    "clusterName": "testCluster",
    "defaultReplicaSet": {
        "name": "default",
        "topology": [
            {
                "address": "ic-1:3306",
                "label": "ic-1:3306",
                "role": "HA"
            },
            {
                "address": "ic-2:3306",
                "label": "ic-2:3306",
                "role": "HA"
            },
            {
                "address": "ic-3:3306",
                "label": "ic-3:3306",
                "role": "HA"
            }
        ]
    }
}
The output from this function shows the structure of the
InnoDB Cluster, including its name and the address, label, and
role of each instance in its topology. The address, label and
role values match those described at Checking a cluster's Status
with Cluster.status().
Cluster objects provide the status() method
that enables you to check how a cluster is running. Before you
can check the status of the InnoDB Cluster, you need to get a
reference to the InnoDB Cluster object by connecting to any of
its instances. However, if you want to make changes to the
configuration of the cluster, you must connect to a "R/W"
instance. Issuing status() retrieves the
status of the cluster based on the view of the cluster which the
server instance you are connected to is aware of and outputs a
status report.
The instance's state in the cluster directly influences the
information provided in the status report. Therefore, ensure
that the instance you are connected to has a status of
ONLINE.
For information about how the InnoDB Cluster is running, use
the cluster's status() method:
mysql-js> var cluster = dba.getCluster()
mysql-js> cluster.status()
{
    "clusterName": "testcluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "ic-1:3306",
        "ssl": "REQUIRED",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "ic-1:3306": {
                "address": "ic-1:3306",
                "memberRole": "PRIMARY",
                "mode": "R/W",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.30"
            },
            "ic-2:3306": {
                "address": "ic-2:3306",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.30"
            },
            "ic-3:3306": {
                "address": "ic-3:3306",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.30"
            }
        },
        "topologyMode": "Single-Primary"
    },
    "groupInformationSourceMember": "mysql://icadmin@ic-1:3306"
}
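Because the report is structured data, it can also be inspected programmatically. The following is a minimal sketch, not part of AdminAPI: the helper name findMembersNotOnline is hypothetical, and report is assumed to be a plain JavaScript object with the same shape as the output shown above (for example, obtained by parsing the report's JSON text).

```javascript
// Sketch only: `report` is assumed to be a plain object shaped like the
// Cluster.status() output. findMembersNotOnline is a hypothetical helper.
function findMembersNotOnline(report) {
  var topology = report.defaultReplicaSet.topology;
  var problems = [];
  Object.keys(topology).forEach(function (name) {
    var member = topology[name];
    // Any member status other than ONLINE is reported back to the caller.
    if (member.status !== "ONLINE") {
      problems.push({ address: member.address, status: member.status });
    }
  });
  return problems;
}
```

For the example report, where all three instances are ONLINE, the helper returns an empty array.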
The output of Cluster.status()
provides the following information:
clusterName: name assigned to this cluster during dba.createCluster().
defaultReplicaSet: the server instances which belong to an InnoDB Cluster and contain the data set.
primary: displayed when the cluster is operating in single-primary mode only. Shows the address of the current primary instance. If this field is not displayed, the cluster is operating in multi-primary mode.
ssl: whether secure connections are used by the cluster or not. Shows values of REQUIRED or DISABLED, depending on how the memberSslMode option was configured during either createCluster() or addInstance(). The value returned by this parameter corresponds to the value of the group_replication_ssl_mode server variable on the instance. See Section 8.6, “Securing InnoDB Cluster”.
status: The status of the InnoDB Cluster. The status describes the high availability provided by this cluster. The status is one of the following:
    OK: The cluster is online and can tolerate up to n failures. There are three or more members in the cluster, and they are functioning.
    OK_PARTIAL: The cluster is online and can tolerate up to n failures. At least three of the member servers in the cluster are in Group Replication's online state. However, one or more member servers are not currently participating as active members of the cluster.
    OK_NO_TOLERANCE: The cluster is not tolerant to any failures.
    OK_NO_TOLERANCE_PARTIAL: The cluster is not tolerant to any failures. One or two member servers in the cluster are online, but one or more servers are in an offline, recovering, error, or unreachable state. The cluster does not have sufficient tolerance for failures because of the unavailability of some members.
    NO_QUORUM: The cluster does not have quorum, meaning that a majority of the replication group's member servers are unavailable for agreeing on a decision, and cannot process write transactions.
    OFFLINE: All members of the group are offline.
    ERROR: There are no online members in the cluster.
    UNREACHABLE: There is no connectivity to any online members.
    UNKNOWN: There is no connectivity to any online members.
    FENCED_WRITES: The cluster is fenced from write traffic.
topology: The instances which belong to the cluster. Each instance is described by the following:
    Host name of instance: The host name of an instance, for example "localhost:3310".
    memberRole: the Member Role as reported by the Group Replication plugin, see the MEMBER_ROLE column of the replication_group_members table.
    mode: whether the server is read-write ("R/W") or read-only ("R/O"). This is derived from the current state of the super_read_only variable on the instance, and whether the cluster has quorum. In previous versions the value of mode was derived from whether the instance was serving as a primary or secondary instance. Usually if the instance is a primary, then the mode is "R/W", and if the instance is a secondary the mode is "R/O". Any instances in a cluster that have no visible quorum are marked as "R/O", regardless of the state of the super_read_only variable.
        Note: If the member status is anything other than ONLINE, mode is reported as n/a.
    replicationLag: returns one of the following values:
        The time difference between the last transaction commit timestamp and the last transaction applied timestamp, in HH:MM:SS format. If multiple workers are used, the value is retrieved from the worker executing the oldest transaction.
        null: The replication connection or SQL thread is not running.
        applier_queue_applied: The applier queue has applied everything. That is, the last queued transaction and the last applied transaction are the same, or the applying transaction is 0.
    role: what function this instance provides in the cluster. Currently only HA, for high availability.
    status: The status of this element of the cluster. The status is one of the following:
        ONLINE: The instance is online and participating in the cluster.
        OFFLINE: The instance has lost connection to the other instances.
        RECOVERING: The instance is attempting to synchronize with the cluster by retrieving transactions it needs before it can become an online member.
        UNREACHABLE: The instance has lost communication with the cluster.
        ERROR: The instance has encountered an error during the recovery phase or while applying a transaction.
            Important: Once an instance enters ERROR state, the super_read_only option is set to ON. To leave the ERROR state you must manually configure the instance with super_read_only=OFF.
        (MISSING): The state of an instance which is part of the configured cluster, but is currently unavailable.
            Note: The MISSING state is specific to InnoDB Cluster, it is not a state generated by Group Replication. MySQL Shell uses this state to indicate instances that are registered in the metadata, but cannot be found in the live cluster view.
groupInformationSourceMember: the internal connection used to get information about the cluster, shown as a URI-like connection string. Usually the connection initially used to create the cluster.
version: the MySQL Server version running on the instance. See Checking the MySQL Version on Instances for more information.
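The cluster-level status and the per-instance replicationLag values described above can be interpreted with a few lines of script. The following is a minimal sketch under stated assumptions: both helper names are hypothetical and not part of AdminAPI, report is assumed to be a plain object shaped like the status report, and accepting a fractional-seconds suffix on the lag value is an assumption (the description above only specifies HH:MM:SS).

```javascript
// Sketch only; both helpers are hypothetical, not AdminAPI functions.

// True when the cluster-level status is one of the two values that are
// described as tolerating failures (OK and OK_PARTIAL).
function canTolerateFailure(report) {
  var status = report.defaultReplicaSet.status;
  return status === "OK" || status === "OK_PARTIAL";
}

// Convert a replicationLag value to seconds:
//   null                    -> null (replication is not running)
//   "applier_queue_applied" -> 0 (nothing is waiting to be applied)
//   "HH:MM:SS"              -> number of seconds
function replicationLagSeconds(lag) {
  if (lag === null || lag === undefined) {
    return null;
  }
  if (lag === "applier_queue_applied") {
    return 0;
  }
  // The optional fractional part is an assumption, accepted defensively.
  var m = /^(\d+):(\d{2}):(\d{2})(\.\d+)?$/.exec(lag);
  if (!m) {
    throw new Error("Unrecognized replicationLag value: " + lag);
  }
  return Number(m[1]) * 3600 + Number(m[2]) * 60 + Number(m[3]) +
    (m[4] ? Number(m[4]) : 0);
}
```

Returning null rather than 0 for a stopped replication channel keeps "no lag" distinguishable from "lag unknown".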
To display more information about the cluster, use the
extended option. The
extended option supports integer or Boolean
values. To configure the additional information that
Cluster.status({'extended':value})
provides, use the following values:
0: disables the additional information, the default
1: includes information about the Group Replication Protocol Version, Group name, communication stack, cluster member UUIDs, cluster member roles and states as reported by Group Replication, and the list of fenced system variables
2: includes information about transactions processed by connection and applier
3: includes more detailed statistics about the replication performed by each cluster member.
Setting extended using Boolean values is the
equivalent of setting the integer values 0 and 1.
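The mapping between Boolean and integer values can be written out as a small validation helper. This is a sketch only: normalizeExtended is a hypothetical function, not part of AdminAPI, and it simply encodes the rule stated above (false is 0, true is 1, integers 0 through 3 are accepted as they are).

```javascript
// Sketch only: normalizeExtended is a hypothetical helper that maps the
// accepted values of the extended option to an integer level.
function normalizeExtended(value) {
  if (value === false) {
    return 0; // Boolean false is equivalent to 0
  }
  if (value === true) {
    return 1; // Boolean true is equivalent to 1
  }
  if (Number.isInteger(value) && value >= 0 && value <= 3) {
    return value;
  }
  throw new Error("extended must be a Boolean or an integer from 0 to 3");
}
```

A script could then call, for example, cluster.status({'extended': normalizeExtended(true)}) and rely on the level being an integer.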
When you issue
Cluster.status({'extended':1}),
or the extended option is set to
true, the output includes:
The following additional attributes for the defaultReplicaSet object:
    GRProtocolVersion: the Group Replication Protocol Version being used in the cluster.
        Tip: InnoDB Cluster manages the Group Replication Protocol version being used automatically, see InnoDB Cluster and Group Replication Protocol for more information.
    communicationStack: the communication stack in use by the cluster. Possible values are XCOM or MYSQL. See Section 8.5.9, “Configuring the Group Replication Communication Stack” for more information.
    groupName: the group's name, a UUID.
    groupViewChangeUuid: the value of group_replication_view_change_uuid.
    groupViewId: the current view identifier for this group. This value is taken from the VIEW_ID column of the replication_group_member_stats table.
    paxosSingleLeader: displays the value of group_replication_paxos_single_leader.
        Note: This is only available on MySQL Server 8.0.31, or higher, because MySQL Shell requires the information provided by WRITE_CONSENSUS_SINGLE_LEADER_CAPABLE in the replication_group_communication_information table, which was introduced in MySQL 8.0.31.
The following additional attributes for each object of the topology object:
    fenceSysVars: a list containing the name of the fenced system variables which are configured by AdminAPI. Currently the fenced system variables considered are read_only, super_read_only and offline_mode. The system variables are listed regardless of their value.
    instanceErrors: for each instance, displays any diagnostic information that can be detected for the instance. For example, if the instance is a secondary and the super_read_only variable is not set to ON, then a warning is shown. This information can be used to troubleshoot errors.
    memberId: each cluster member UUID.
    memberState: the Member State as reported by the Group Replication plugin, see the MEMBER_STATE column of the replication_group_members table.
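When troubleshooting, it can help to gather the instanceErrors entries of all instances in one place. The following is a minimal sketch: collectInstanceErrors is a hypothetical helper, report is assumed to be a plain object shaped like the extended status report, and instanceErrors is assumed to be an array of message strings.

```javascript
// Sketch only: collectInstanceErrors is a hypothetical helper. It assumes
// instanceErrors, when present, is an array of diagnostic messages.
function collectInstanceErrors(report) {
  var topology = report.defaultReplicaSet.topology;
  var found = {};
  Object.keys(topology).forEach(function (name) {
    var errors = topology[name].instanceErrors;
    if (Array.isArray(errors) && errors.length > 0) {
      found[name] = errors.slice(); // copy so callers cannot alter the report
    }
  });
  return found;
}
```

Instances without diagnostics are omitted, so an empty result means that no instance reported an error.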
To see information about recovery and regular transaction I/O,
applier worker thread statistics and any lags; applier
coordinator statistics, if the parallel replication applier is
enabled; error, and other information from the receiver and
applier threads, use a value of 2 or 3 for
extended. When you use these values, a
connection to each instance in the cluster is opened so that
additional instance specific statistics can be queried. The
exact statistics that are included in the output depend on the
state and configuration of the instance and the server version.
This information matches that shown in the
replication_group_member_stats
table, see the descriptions of the matching columns for more
information. Instances which are ONLINE have
a transactions section included in the
output. Instances which are RECOVERING have a
recovery section included in the output. When
you set extended to 2, in either case, these
sections can contain the following:
appliedCount: see COUNT_TRANSACTIONS_REMOTE_APPLIED
checkedCount: see COUNT_TRANSACTIONS_CHECKED
committedAllMembers: see TRANSACTIONS_COMMITTED_ALL_MEMBERS
conflictsDetectedCount: see COUNT_CONFLICTS_DETECTED
inApplierQueueCount: see COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE
inQueueCount: see COUNT_TRANSACTIONS_IN_QUEUE
lastConflictFree: see LAST_CONFLICT_FREE_TRANSACTION
proposedCount: see COUNT_TRANSACTIONS_LOCAL_PROPOSED
rollbackCount: see COUNT_TRANSACTIONS_LOCAL_ROLLBACK
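These counters can be combined into simple indicators. The following is an illustrative sketch only: summarizeTransactions is a hypothetical helper, the derived values are not produced by AdminAPI, and section is assumed to be the transactions or recovery object of one instance, with missing counters treated as 0.

```javascript
// Sketch only: derives two illustrative indicators from the counters of a
// transactions (or recovery) section. Not part of AdminAPI.
function summarizeTransactions(section) {
  var checked = section.checkedCount || 0;
  var conflicts = section.conflictsDetectedCount || 0;
  return {
    // Transactions waiting for conflict detection checks.
    pendingChecks: section.inQueueCount || 0,
    // Remote transactions received but not yet applied.
    pendingApply: section.inApplierQueueCount || 0,
    // Share of checked transactions in which a conflict was detected.
    conflictRatio: checked > 0 ? conflicts / checked : 0
  };
}
```

A steadily growing pendingApply value on one instance suggests that its applier is falling behind the rest of the group.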
When you set extended to 3, the
connection section shows information from the
replication_connection_status
table.
The currentlyQueueing section has information
about the transactions currently queued:
immediateCommitTimestamp: see QUEUEING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP
immediateCommitToNowTime: see QUEUEING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP minus NOW()
originalCommitTimestamp: see QUEUEING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP
originalCommitToNowTime: see QUEUEING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP minus NOW()
startTimestamp: see QUEUEING_TRANSACTION_START_QUEUE_TIMESTAMP
transaction: see QUEUEING_TRANSACTION
lastHeartbeatTimestamp: see LAST_HEARTBEAT_TIMESTAMP
The lastQueued section has information about
the most recently queued transaction:
endTimestamp: see LAST_QUEUED_TRANSACTION_END_QUEUE_TIMESTAMP
immediateCommitTimestamp: see LAST_QUEUED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP
immediateCommitToEndTime: LAST_QUEUED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP minus NOW()
originalCommitTimestamp: see LAST_QUEUED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP
originalCommitToEndTime: LAST_QUEUED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP minus NOW()
queueTime: LAST_QUEUED_TRANSACTION_END_QUEUE_TIMESTAMP minus LAST_QUEUED_TRANSACTION_START_QUEUE_TIMESTAMP
startTimestamp: see LAST_QUEUED_TRANSACTION_START_QUEUE_TIMESTAMP
transaction: see LAST_QUEUED_TRANSACTION
receivedHeartbeats: see COUNT_RECEIVED_HEARTBEATS
receivedTransactionSet: see RECEIVED_TRANSACTION_SET
threadId: see THREAD_ID
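Derived values such as queueTime are differences between two of the reported timestamps. The following sketch recomputes such a difference from startTimestamp and endTimestamp. It rests on assumptions: the helper names are hypothetical, and the timestamps are assumed to be strings of the form YYYY-MM-DD HH:MM:SS with an optional fractional part, which may not match the format returned by your MySQL Shell version.

```javascript
// Sketch only. Assumes timestamps look like "2019-07-15 12:50:22.730123".
// Whole seconds and the fractional part are kept apart so that the
// microsecond digits are not lost to floating-point rounding.
function parseTimestamp(text) {
  var m = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})(\.\d+)?$/.exec(text);
  if (!m) {
    throw new Error("Unrecognized timestamp: " + text);
  }
  return {
    wholeSeconds: Date.UTC(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6]) / 1000,
    fraction: m[7] ? Number(m[7]) : 0
  };
}

// Number of seconds from startText to endText.
function secondsBetween(startText, endText) {
  var start = parseTimestamp(startText);
  var end = parseTimestamp(endText);
  return (end.wholeSeconds - start.wholeSeconds) + (end.fraction - start.fraction);
}

// End of queueing minus start of queueing for a lastQueued section.
function queueTimeSeconds(lastQueued) {
  return secondsBetween(lastQueued.startTimestamp, lastQueued.endTimestamp);
}
```

Both timestamps are interpreted in the same time zone, so the choice of UTC inside parseTimestamp does not affect the difference.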
Instances which are using a multithreaded replica have a
workers section which contains information
about the worker threads, and matches the information shown by
the
replication_applier_status_by_worker
table.
The lastApplied section shows the following
information about the last transaction applied by the worker:
applyTime: see LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP minus LAST_APPLIED_TRANSACTION_START_APPLY_TIMESTAMP
endTimestamp: see LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP
immediateCommitTimestamp: see LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP
immediateCommitToEndTime: see LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP minus NOW()
originalCommitTimestamp: see LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP
originalCommitToEndTime: see LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP minus NOW()
startTimestamp: see LAST_APPLIED_TRANSACTION_START_APPLY_TIMESTAMP
transaction: see LAST_APPLIED_TRANSACTION
The currentlyApplying section shows the
following information about the transaction currently being
applied by the worker:
immediateCommitTimestamp: see APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP
immediateCommitToNowTime: see APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP minus NOW()
originalCommitTimestamp: see APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP
originalCommitToNowTime: see APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP minus NOW()
startTimestamp: see APPLYING_TRANSACTION_START_APPLY_TIMESTAMP
transaction: see APPLYING_TRANSACTION
The lastProcessed section has the following
information about the last transaction processed by the worker:
bufferTime: LAST_PROCESSED_TRANSACTION_END_BUFFER_TIMESTAMP minus LAST_PROCESSED_TRANSACTION_START_BUFFER_TIMESTAMP
endTimestamp: see LAST_PROCESSED_TRANSACTION_END_BUFFER_TIMESTAMP
immediateCommitTimestamp: see LAST_PROCESSED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP
immediateCommitToEndTime: LAST_PROCESSED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP minus LAST_PROCESSED_TRANSACTION_END_BUFFER_TIMESTAMP
originalCommitTimestamp: see LAST_PROCESSED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP
originalCommitToEndTime: LAST_PROCESSED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP minus LAST_PROCESSED_TRANSACTION_END_BUFFER_TIMESTAMP
startTimestamp: see LAST_PROCESSED_TRANSACTION_START_BUFFER_TIMESTAMP
transaction: see LAST_PROCESSED_TRANSACTION
If the parallel replication applier is enabled, then the number
of objects in the workers array in
transactions or recovery
matches the number of configured workers and an additional
coordinator object is included. The information shown matches
the information in the
replication_applier_status_by_coordinator
table. The object can contain:
The currentlyProcessing section has the
following information about the transaction being processed by
the worker:
immediateCommitTimestamp: see PROCESSING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP
immediateCommitToNowTime: PROCESSING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP minus NOW()
originalCommitTimestamp: see PROCESSING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP
originalCommitToNowTime: PROCESSING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP minus NOW()
startTimestamp: see PROCESSING_TRANSACTION_START_BUFFER_TIMESTAMP
transaction: see PROCESSING_TRANSACTION
worker objects have the following information
if an error was detected in the
replication_applier_status_by_worker
table:
lastErrno: see LAST_ERROR_NUMBER
lastError: see LAST_ERROR_MESSAGE
lastErrorTimestamp: see LAST_ERROR_TIMESTAMP
connection objects have the following
information if an error was detected in the
replication_connection_status
table:
lastErrno: see LAST_ERROR_NUMBER
lastError: see LAST_ERROR_MESSAGE
lastErrorTimestamp: see LAST_ERROR_TIMESTAMP
coordinator objects have the following
information if an error was detected in the
replication_applier_status_by_coordinator
table:
lastErrno: see LAST_ERROR_NUMBER
lastError: see LAST_ERROR_MESSAGE
lastErrorTimestamp: see LAST_ERROR_TIMESTAMP
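Because the worker, connection, and coordinator objects share the same three error fields, the errors can be gathered with one routine. The following is a minimal sketch: collectReplicationErrors is a hypothetical helper, and the nesting it assumes (a connection object, a coordinator object, and a workers array directly inside one transactions or recovery section) is an assumption that should be checked against the actual output.

```javascript
// Sketch only. Returns an error record for `obj`, or null when the object
// is absent or carries no lastErrno field.
function errorOf(source, obj) {
  if (!obj || obj.lastErrno === undefined) {
    return null;
  }
  return {
    source: source,
    errno: obj.lastErrno,
    message: obj.lastError,
    timestamp: obj.lastErrorTimestamp
  };
}

// Gather errors from the connection, coordinator, and worker objects of one
// transactions or recovery section (assumed nesting, see the text above).
function collectReplicationErrors(section) {
  var errors = [];
  function keep(record) {
    if (record) {
      errors.push(record);
    }
  }
  keep(errorOf("connection", section.connection));
  keep(errorOf("coordinator", section.coordinator));
  (section.workers || []).forEach(function (worker, index) {
    keep(errorOf("worker " + index, worker));
  });
  return errors;
}
```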
The output of Cluster.status()
shows information about the progress of recovery operations for
instances in RECOVERING state. Information is
shown for instances recovering using either MySQL Clone, or
incremental recovery. Monitor these fields:
The recoveryStatusText field includes information about the type of recovery being used. When MySQL Clone is working the field shows “Cloning in progress”. When incremental recovery is working the field shows “Distributed recovery in progress”.
When MySQL Clone is being used, the recovery field includes a dictionary with the following fields:
    cloneStartTime: The timestamp of the start of the clone process
    cloneState: The state of the clone progress
    currentStage: The current stage which the clone process has reached
    currentStageProgress: The current stage progress as a percentage of completion
    currentStageState: The current stage state
    Example Cluster.status() output, trimmed for brevity:
    ...
    "recovery": {
        "cloneStartTime": "2019-07-15 12:50:22.730",
        "cloneState": "In Progress",
        "currentStage": "FILE COPY",
        "currentStageProgress": 61.726837675213865,
        "currentStageState": "In Progress"
    },
    "recoveryStatusText": "Cloning in progress",
    ...
When incremental recovery is being used and the extended option is set to 1 or greater, the recovery field includes a dictionary with the following fields:
    state: The state of the group_replication_recovery channel
    recoveryChannel: Displayed for instances performing incremental recovery or in which the recovery channel status is not off. Incremental recovery utilizes the receiver thread to receive transactions from the source, and the applier thread applies the received transactions on the instance. Provides the following information:
        applierQueuedTransactionSetSize: The number of transactions currently queued, which are waiting to be applied.
        applierState: The current state of the replication applier, either ON or OFF.
        applierStatus: The current status of the applier threads. An aggregation of the states shown in the applierThreadState field. Can be one of:
            APPLIED_ALL: there are no queued transactions waiting to be applied
            APPLYING: there are transactions being applied
            ON: thread is connected and there are no queued transactions
            ERROR: there was an error while applying transactions
            OFF: the applier thread is disabled
        applierThreadState: The current state of any applier threads. Provides detailed information about exactly what the applier thread is doing. For more information, see Replication SQL Thread States.
        receiverStatus: The current status of the receiver thread. An aggregation of the states shown in the receiverThreadState field. Can be one of:
            ON: the receiver thread has successfully connected and is ready to receive
            CONNECTING: the receiver thread is connecting to the source
            ERROR: there was an error while receiving transactions
            OFF: the receiver thread has gracefully disconnected
        receiverThreadState: The current state of the receiver thread. Provides detailed information about exactly what the receiver thread is doing. For more information, see Replication I/O (Receiver) Thread States.
        source: The source of the transactions which are being applied.
    Example Cluster.status() output, trimmed for brevity:
    ...
    "recovery": {
        "recoveryChannel": {
            "applierQueuedTransactionSetSize": 2284,
            "applierStatus": "APPLYING",
            "applierThreadState": "Opening tables",
            "receiverStatus": "ON",
            "receiverThreadState": "Queueing master event to the relay log",
            "source": "ic-2:3306"
        },
        "state": "ON"
    },
    ...
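The recovery fields described above can be condensed into a one-line progress message. The following is a minimal sketch: describeRecovery is a hypothetical helper, and member is assumed to be a plain object holding the recovery and recoveryStatusText fields of one instance, as in the trimmed examples.

```javascript
// Sketch only: builds a short progress message for one RECOVERING instance.
function describeRecovery(member) {
  var recovery = member.recovery || {};
  // MySQL Clone: the dictionary carries the clone* and currentStage* fields.
  if (recovery.cloneState !== undefined) {
    return "clone: stage " + recovery.currentStage + ", " +
      Number(recovery.currentStageProgress).toFixed(1) + "% (" +
      recovery.currentStageState + ")";
  }
  // Incremental recovery: the dictionary carries a recoveryChannel object.
  if (recovery.recoveryChannel) {
    var channel = recovery.recoveryChannel;
    return "incremental: " + channel.applierQueuedTransactionSetSize +
      " transactions queued from " + channel.source +
      " (applier " + channel.applierStatus +
      ", receiver " + channel.receiverStatus + ")";
  }
  // Fall back to the status text, which is present without extended output.
  return member.recoveryStatusText || "no recovery information";
}
```

For the clone example above, the helper returns "clone: stage FILE COPY, 61.7% (In Progress)".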
Group Replication has the concept of a communication protocol for the group, see Setting a Group's Communication Protocol Version for more information. The Group Replication communication protocol version usually has to be managed explicitly, and set to accommodate the oldest MySQL Server version that you want the group to support. However, InnoDB Cluster automatically and transparently manages the communication protocol versions of its members, whenever the cluster topology is changed using AdminAPI operations. A cluster always uses the most recent communication protocol version that is supported by all the instances that are currently part of the cluster or joining it.
When an instance is added to, removed from, or rejoins the cluster, or a rescan or reboot operation is carried out on the cluster, the communication protocol version is automatically set to a version supported by the instance that is now at the earliest MySQL Server version.
When you carry out a rolling upgrade by removing instances from the cluster, upgrading them, and adding them back into the cluster, the communication protocol version is automatically upgraded when the last remaining instance at the old MySQL Server version is removed from the cluster prior to its upgrade.
To see the communication protocol version being used in a
cluster, use the Cluster.status()
function with the extended option enabled.
The communication protocol version is returned in the
GRProtocolVersion field, provided that the
cluster has quorum and no cluster members are unreachable.
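Since the protocol version is limited by the instance at the earliest MySQL Server version, it can be useful to find that version from the version fields of a status report. The following is a minimal sketch: both helper names are hypothetical, report is assumed to be a plain object shaped like the status report, and the sketch does not compute the protocol version itself.

```javascript
// Sketch only. Compare two "major.minor.patch" version strings numerically;
// returns a negative number, zero, or a positive number.
function compareVersions(a, b) {
  var partsA = a.split(".").map(Number);
  var partsB = b.split(".").map(Number);
  for (var i = 0; i < 3; i++) {
    var diff = (partsA[i] || 0) - (partsB[i] || 0);
    if (diff !== 0) {
      return diff;
    }
  }
  return 0;
}

// Return the earliest server version reported in the topology, or null
// when no instance reports a version.
function oldestServerVersion(report) {
  var topology = report.defaultReplicaSet.topology;
  var oldest = null;
  Object.keys(topology).forEach(function (name) {
    var version = topology[name].version;
    if (version && (oldest === null || compareVersions(version, oldest) < 0)) {
      oldest = version;
    }
  });
  return oldest;
}
```

The comparison is numeric per component, so 8.0.9 is correctly treated as earlier than 8.0.16, which a plain string comparison would get wrong.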
The following operations can report information about the MySQL Server version running on the instance:
Cluster.status()
Cluster.describe()
Cluster.rescan()
The behavior varies depending on the MySQL Server version of the
Cluster object session.
Cluster.status()
    If either of the following requirements is met, a version string attribute is returned for each instance JSON object of the topology object:
        The Cluster object's current session is version 8.0.11 or later.
        The Cluster object's current session is running a version earlier than version 8.0.11 but the extended option is set to 3.
    For example on an instance running version 8.0.16:
    "topology": {
        "ic-1:3306": {
            "address": "ic-1:3306",
            "mode": "R/W",
            "readReplicas": {},
            "role": "HA",
            "status": "ONLINE",
            "version": "8.0.16"
        }
    }
Cluster.describe()
    If the Cluster object's current session is version 8.0.11 or later, a version string attribute is returned for each instance JSON object of the topology object.
    For example on an instance running version 8.0.16:
    "topology": [
        {
            "address": "ic-1:3306",
            "label": "ic-1:3306",
            "role": "HA",
            "version": "8.0.16"
        }
    ]
Cluster.rescan()
    If the Cluster object's current session is version 8.0.11 or later, and the Cluster.rescan() operation detects instances which do not belong to the cluster, a version string attribute is returned for each instance JSON object of the newlyDiscoveredInstances object.
    For example on an instance running version 8.0.16:
    "newlyDiscoveredInstances": [
        {
            "host": "ic-4:3306",
            "member_id": "82a67a06-2ba3-11e9-8cfc-3c6aa7197deb",
            "name": null,
            "version": "8.0.16"
        }
    ]