The ML.CONFUSION_MATRIX function
This document describes the ML.CONFUSION_MATRIX function, which you can use
to return a confusion matrix for the input classification model and input data.
Syntax
ML.CONFUSION_MATRIX(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`
  [, { TABLE `PROJECT_ID.DATASET.TABLE` | (QUERY_STATEMENT) }]
  [, STRUCT(
    [THRESHOLD AS threshold]
    [, TRIAL_ID AS trial_id])])
Arguments
ML.CONFUSION_MATRIX takes the following arguments:
PROJECT_ID: the project that contains the
resource.
DATASET: the dataset that contains the
resource.
MODEL_NAME: the name of the model.
TABLE: the name of the input table that contains
the evaluation data.
If TABLE is specified, the input column names in the table must match the
column names in the model, and their types should be compatible according to
BigQuery implicit coercion rules.
The input must include a column whose name matches the label column used
during training. You set that name with the input_label_cols option of the
CREATE MODEL statement. If you didn't specify input_label_cols, the column
named label in the training data is used.
If you don't specify either TABLE or QUERY_STATEMENT,
ML.CONFUSION_MATRIX computes the confusion matrix results as follows:
If the data is split during training, the split evaluation data is used to
compute the confusion matrix results.
If the data is not split during training, the entire training input is
used to compute the confusion matrix results.
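As a rough illustration of the two cases above, the following sketch evaluates a model against an explicit table and then with no input data at all. The names `mydataset.mymodel` and `mydataset.mytable` are placeholders, and the second statement assumes a model type that doesn't require input data.

```sql
-- Case 1: evaluate against an explicit table. The table must contain the
-- model's input columns and the label column.
SELECT
  *
FROM
  ML.CONFUSION_MATRIX(
    MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`);

-- Case 2: omit the input data. The function falls back to the evaluation
-- split if the data was split during training, or to the entire training
-- input otherwise.
SELECT
  *
FROM
  ML.CONFUSION_MATRIX(MODEL `mydataset.mymodel`);
```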
QUERY_STATEMENT: a GoogleSQL query that is
used to generate the evaluation data. For the supported SQL syntax of the
QUERY_STATEMENT clause in GoogleSQL, see
Query syntax.
If QUERY_STATEMENT is specified, the input column names from the query
must match the column names in the model, and their types should be
compatible according to BigQuery
implicit coercion rules.
The input must include a column whose name matches the label column used
during training. You set that name with the input_label_cols option of the
CREATE MODEL statement. If you didn't specify input_label_cols, the column
named label in the training data is used. Extra columns in the query result
are ignored.
If you used the
TRANSFORM clause
in the CREATE MODEL statement that created the model, then only the input
columns present in the TRANSFORM clause must appear in QUERY_STATEMENT.
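A query statement is useful when you want to evaluate on a subset of rows or select only the columns the model needs. The following is a minimal sketch; the column names `feature_a`, `feature_b`, `label`, and `event_date` are hypothetical and stand in for whatever columns your model was trained on.

```sql
SELECT
  *
FROM
  ML.CONFUSION_MATRIX(
    MODEL `mydataset.mymodel`,
    (
      -- Hypothetical columns: select the model's input columns plus the
      -- label column. Any extra columns would be ignored.
      SELECT
        feature_a,
        feature_b,
        label
      FROM
        `mydataset.mytable`
      -- Hypothetical filter: evaluate only on recent rows.
      WHERE
        event_date >= DATE '2024-01-01'
    ));
```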
THRESHOLD: a FLOAT64 value that specifies a custom
threshold for the binary-class classification model to use for evaluation. The
default value is 0.5.
A 0 value for precision or recall means that the selected threshold
produced no true positive labels. A NaN value for precision means that the
selected threshold produced no positive labels, neither true positives nor
false positives.
If both TABLE and QUERY_STATEMENT are unspecified, you can't use a
threshold.
You can't use THRESHOLD with multiclass classification models.
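To see how the threshold shifts predictions between classes, you can compare the matrices produced at two thresholds. The sketch below assumes a binary classification model named `mydataset.mymodel` and evaluation data in `mydataset.mytable`; both names and the threshold values are placeholders.

```sql
-- Stack the confusion matrices for two thresholds so they can be compared
-- row by row. Input data is required whenever a threshold is supplied.
SELECT
  0.5 AS threshold,
  *
FROM
  ML.CONFUSION_MATRIX(
    MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`,
    STRUCT(0.5 AS threshold))
UNION ALL
SELECT
  0.8 AS threshold,
  *
FROM
  ML.CONFUSION_MATRIX(
    MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`,
    STRUCT(0.8 AS threshold))
ORDER BY
  threshold,
  expected_label;
```

Raising the threshold generally moves rows from the positive predicted column to the negative one, so the comparison shows the trade-off directly.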
TRIAL_ID: an INT64 value that identifies the
hyperparameter tuning trial that you want the function to evaluate. The
function uses the optimal trial by default. Only specify this argument if you
ran hyperparameter tuning when creating the model.
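As a sketch of evaluating a specific trial, the following call assumes that `mydataset.mymodel` was created with hyperparameter tuning and that a trial with ID 2 exists. The model name, table name, and trial number are all placeholders.

```sql
-- Evaluate trial 2 instead of the optimal trial that is used by default.
SELECT
  *
FROM
  ML.CONFUSION_MATRIX(
    MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`,
    STRUCT(2 AS trial_id));
```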
Output
The output columns of the ML.CONFUSION_MATRIX function depend on the model.
The first output column is always expected_label. There are N additional
columns, one for each class in the trained model. The names of the additional
columns depend on the class labels used to train the model.
If the training class labels all conform to BigQuery
column naming rules, the labels are used
as the column names. Labels that don't conform to the naming rules are altered
so that they conform and are unique. For example, if the labels
are 0 and 1, the output column names are _0 and _1.
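Because the output column names are derived from the class labels, you can reference them directly in a query. The following sketch assumes a binary model trained on the labels 0 and 1, so that the output columns are `_0` and `_1`; the aliases `predicted_0` and `predicted_1` are chosen here only for readability.

```sql
-- Assumes the model was trained on labels 0 and 1, which yields the
-- output columns _0 and _1.
SELECT
  expected_label,
  _0 AS predicted_0,
  _1 AS predicted_1
FROM
  ML.CONFUSION_MATRIX(
    MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`)
ORDER BY
  expected_label;
```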
The columns are ordered by class label in ascending order. If the
labels in the evaluation data match those in the training data, the
true positives appear on the diagonal from top left to bottom right.
Each row corresponds to an expected (actual) label, and each column
corresponds to a predicted label.
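Since correct predictions sit on the diagonal, overall accuracy can be derived from the matrix by dividing the diagonal sum by the total count. The sketch below assumes a binary model with integer labels 0 and 1 (output columns `_0` and `_1`); for other label sets the column names and the comparison would need to change.

```sql
-- For each row, pick the cell on the diagonal: _0 when the expected label
-- is 0, and _1 when it is 1. Dividing the diagonal sum by the sum of all
-- cells gives the overall accuracy.
SELECT
  SAFE_DIVIDE(
    SUM(IF(expected_label = 0, _0, _1)),
    SUM(_0 + _1)) AS accuracy
FROM
  ML.CONFUSION_MATRIX(
    MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`);
```

`SAFE_DIVIDE` returns NULL instead of raising an error if the evaluation data is empty.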
The values in the expected_label column are the exact values and type passed
into ML.CONFUSION_MATRIX in the label column of the evaluation data. This is
true even if they don't exactly match the values or type used during training.
Limitations
ML.CONFUSION_MATRIX doesn't support imported TensorFlow models.
Note: With some models, ML.CONFUSION_MATRIX requires input data and returns an
error if it is absent. If this occurs, provide input data when using
ML.CONFUSION_MATRIX with these models.
Examples
The following examples demonstrate the use of the ML.CONFUSION_MATRIX function.
ML.CONFUSION_MATRIX with a query statement
The following example returns the confusion matrix for a logistic
regression model named mydataset.mymodel in your default project:

```sql
SELECT
  *
FROM
  ML.CONFUSION_MATRIX(MODEL `mydataset.mymodel`,
    (
    SELECT
      *
    FROM
      `mydataset.mytable`))
```

ML.CONFUSION_MATRIX with a custom threshold
The following example returns the confusion matrix for a logistic
regression model named mydataset.mymodel in your default project, using a
custom threshold of 0.6:

```sql
SELECT
  *
FROM
  ML.CONFUSION_MATRIX(MODEL `mydataset.mymodel`,
    (
    SELECT
      *
    FROM
      `mydataset.mytable`),
    STRUCT(0.6 AS threshold))
```

What's next
For information about model evaluation, see BigQuery ML model evaluation overview.
For information about the supported SQL statements and functions for each
model type, see End-to-end user journey for each model.
Last updated 2025-08-25 UTC.