The ML.RECOMMEND function
This document describes the ML.RECOMMEND function, which lets you generate
a predicted rating for every user-item combination for a
matrix factorization model.
Because the input data for a
matrix factorization model tends to be a sparse matrix with missing values,
ML.RECOMMEND can return the predictions for those missing values without
requiring specification of each entry.
Note: ML.RECOMMEND can generate large outputs. Consider saving the output to a table for analysis.
Syntax
ML.RECOMMEND(
MODEL `PROJECT_ID.DATASET.MODEL_NAME`
[, { TABLE `PROJECT_ID.DATASET.TABLE` | (QUERY_STATEMENT) }]
[, STRUCT(TRIAL_ID AS trial_id)])
Arguments
ML.RECOMMEND takes the following arguments:
PROJECT_ID: the project that contains the
resource.
DATASET: the dataset that contains the
resource.
MODEL_NAME: the name of the model.
TABLE: The name of the input table that contains the
user and item data.
If you specify input data by using either the TABLE or QUERY_STATEMENT
argument, the user and item columns must match the user and item columns in
the model, and their types must be compatible according to
BigQuery
implicit coercion rules.
If the input table does not contain both the user and item columns, it can
contain only one column: either the user column or the item column. If the
table contains both the user and item columns, then any other columns are
passed through and are available for use in the query.
QUERY_STATEMENT: the GoogleSQL query that is
used to generate the input data. For the supported SQL syntax for the
QUERY_STATEMENT clause in GoogleSQL, see
Query syntax.
If you specify input data by using either the TABLE or QUERY_STATEMENT
argument, the user and item columns must match the user and item columns in
the model, and their types must be compatible according to
BigQuery
implicit coercion rules.
TRIAL_ID: an INT64 value that identifies the
hyperparameter tuning trial that you want the function to evaluate. The
function uses the optimal trial by default. Only specify this argument if you
ran hyperparameter tuning when creating the model.
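As a minimal sketch of how these arguments combine, the queries below pass input data first as a table and then as a query together with a trial ID. The names mydataset.mymodel and mydataset.mytable are the placeholder names used in the examples in this document. The trial ID 2 is a hypothetical value and assumes the model was created with hyperparameter tuning.

```sql
-- Input supplied with the TABLE argument. Assumes mydataset.mytable
-- contains the model's user and item columns; any other columns are
-- passed through to the output.
SELECT
  *
FROM
  ML.RECOMMEND(MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`);

-- Input supplied as a query and scored with a specific tuning trial.
-- The trial ID 2 is hypothetical; omit the STRUCT to use the optimal trial.
SELECT
  *
FROM
  ML.RECOMMEND(MODEL `mydataset.mymodel`,
    (SELECT user, item FROM `mydataset.mytable`),
    STRUCT(2 AS trial_id));
```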
Output
ML.RECOMMEND outputs at least three columns in all cases: the user column, the
item column, and a column for the predicted recommendations.
The output of ML.RECOMMEND is computed as follows:
If both the user and item columns are in the input data,
then ML.RECOMMEND returns a rating for each user-item pair.
If only the user column or only the item column is provided (for example, if
the table identified by the TABLE argument contains only the user column),
then ML.RECOMMEND returns the ratings of all items for every user in the
table.
If either the user or item feature was not in the training dataset, the rating
that is returned is the intercept of the feature that was provided, either
item or user, added to the global__intercept__ offset. For example,
global__intercept__ + __intercept__['user_a'].
If input data is specified but does not provide
either the user or item column, ML.RECOMMEND returns an error.
If no input data is specified, ML.RECOMMEND
outputs the ratings for every user and item combination seen during training.
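The user-only case above is a common way to build a top-N recommendation list. The following sketch assumes the model was trained with a user column named user, an item column named item, and a rating column named rating, so that the prediction column is predicted_rating. The limit of five items per user is an arbitrary illustration.

```sql
-- Passing only the user column returns a rating for every item per user.
-- ROW_NUMBER ranks each user's items by predicted rating, and the outer
-- filter keeps the five highest-rated items for each user.
SELECT
  user,
  item,
  predicted_rating
FROM (
  SELECT
    user,
    item,
    predicted_rating,
    ROW_NUMBER() OVER (PARTITION BY user ORDER BY predicted_rating DESC) AS rn
  FROM
    ML.RECOMMEND(MODEL `mydataset.mymodel`,
      (SELECT DISTINCT user FROM `mydataset.mytable`))
)
WHERE
  rn <= 5;
```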
If the model was trained with feedback_type=EXPLICIT, a user column called
user, and an item column called item, then ML.RECOMMEND returns the
following columns:
user: a STRING value containing the user data.
item: a STRING value containing the item data.
predicted_<rating_col_name>: a FLOAT64 value that contains the rating for
each user-item pair. Because the input ratings from training are assumed to be
explicit feedback, the predicted ratings are approximately in the range of the
original input, although ratings outside the range are also normal.
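Because predicted ratings can fall slightly outside the original range, an application that displays them might clamp the values. The following sketch assumes a rating column named rating and a 1 to 5 rating scale. Both are illustrative assumptions, not properties of every model.

```sql
-- Clamp each predicted rating into the assumed 1-5 range while keeping
-- the raw prediction alongside it for comparison.
SELECT
  user,
  item,
  predicted_rating,
  LEAST(GREATEST(predicted_rating, 1.0), 5.0) AS clamped_rating
FROM
  ML.RECOMMEND(MODEL `mydataset.mymodel`,
    (SELECT user, item FROM `mydataset.mytable`));
```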
If the model was trained with feedback_type=IMPLICIT, a user column called
user, and an item column called item, then ML.RECOMMEND returns the
following columns:
user: a STRING value containing the user data.
item: a STRING value containing the item data.
predicted_<rating_col_name>_confidence: a FLOAT64 value that contains the
relative confidence for each user-item pair. The input ratings from
training are assumed to be a proxy for user confidence. Therefore, if the model
has converged, the predicted confidences lie between approximately 0 and 1 (but
can lie just outside that range). If the model hasn't converged, the predicted
confidences can be any value. If your model isn't converging and your ratings
are very large, try decreasing the WALS_ALPHA value that's specified in the
CREATE MODEL statement for the model. If your model isn't converging and
your ratings are very small, try increasing the WALS_ALPHA value.
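One way to act on this guidance is to inspect the range of the predicted confidences and, if they fall far outside 0 to 1, retrain with a different WALS_ALPHA value. The sketch below assumes a rating column named rating, so that the output column is predicted_rating_confidence. The WALS_ALPHA value of 20 is an arbitrary illustration, not a recommended setting.

```sql
-- Check whether the predicted confidences lie roughly between 0 and 1.
SELECT
  MIN(predicted_rating_confidence) AS min_confidence,
  MAX(predicted_rating_confidence) AS max_confidence
FROM
  ML.RECOMMEND(MODEL `mydataset.mymodel`);

-- If the range suggests the model has not converged, retrain with an
-- adjusted WALS_ALPHA (lower for very large ratings, higher for very
-- small ones). The value 20 is hypothetical.
CREATE OR REPLACE MODEL `mydataset.mymodel`
OPTIONS (
  MODEL_TYPE = 'MATRIX_FACTORIZATION',
  FEEDBACK_TYPE = 'IMPLICIT',
  USER_COL = 'user',
  ITEM_COL = 'item',
  RATING_COL = 'rating',
  WALS_ALPHA = 20
) AS
SELECT
  user,
  item,
  rating
FROM
  `mydataset.mytable`;
```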
Examples
The following examples show how to use the ML.RECOMMEND function.
No input data
The following example generates predicted ratings for every
user-item pair in the inputs of mymodel because there is no input data
specified.
SELECT
  *
FROM
  ML.RECOMMEND(MODEL `mydataset.mymodel`)
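A query with no input data returns one row per user-item pair seen during training, so the result can be large. A minimal sketch of saving it to a table for later analysis follows. The destination table name mydataset.recommendations is hypothetical.

```sql
-- Materialize all predictions once so that later analysis queries read
-- from the saved table instead of recomputing every user-item pair.
CREATE OR REPLACE TABLE `mydataset.recommendations` AS
SELECT
  *
FROM
  ML.RECOMMEND(MODEL `mydataset.mymodel`);
```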
With input data
The following example generates predicted ratings for each user-item row in
mydataset.mytable assuming that mydataset.mymodel was trained using the user
column user and item column item.
SELECT
  *
FROM
  ML.RECOMMEND(MODEL `mydataset.mymodel`,
    (
    SELECT
      user,
      item
    FROM
      `mydataset.mytable`))
What's next
For information about model inference, see Model inference overview.
For information about the supported SQL statements and functions for each model type, see End-to-end user journey for each model.
Last updated 2025-08-25 UTC.