LinearRegression(
    *,
    optimize_strategy: typing.Literal[
        "auto_strategy", "batch_gradient_descent", "normal_equation"
    ] = "auto_strategy",
    fit_intercept: bool = True,
    l1_reg: typing.Optional[float] = None,
    l2_reg: float = 0.0,
    max_iterations: int = 20,
    warm_start: bool = False,
    learning_rate: typing.Optional[float] = None,
    learning_rate_strategy: typing.Literal["line_search", "constant"] = "line_search",
    tol: float = 0.01,
    ls_init_learning_rate: typing.Optional[float] = None,
    calculate_p_values: bool = False,
    enable_global_explain: bool = False
)
Ordinary least squares Linear Regression.
LinearRegression fits a linear model with coefficients w = (w1, ..., wp) to minimize the residual sum of squares between the observed targets in the dataset, and the targets predicted by the linear approximation.
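To make the residual-sum-of-squares objective concrete, here is a minimal pure-Python sketch for the single-feature case. It does not use bigframes or BigQuery; the helper names `fit_simple_ols` and `rss` are hypothetical and only illustrate the idea of choosing coefficients that minimize the squared residuals.

```python
from statistics import mean


def fit_simple_ols(xs, ys):
    """Closed-form least-squares fit of y ~ w * x + b for one feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    w = sxy / sxx          # slope that minimizes the squared residuals
    b = y_bar - w * x_bar  # intercept (the fit_intercept=True case)
    return w, b


def rss(xs, ys, w, b):
    """Residual sum of squares between observed and predicted targets."""
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys))


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = fit_simple_ols(xs, ys)
print(w, b, rss(xs, ys, w, b))
```

Any other choice of `w` or `b` on this data gives a larger residual sum of squares than the fitted pair.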
Examples:
>>> from bigframes.ml.linear_model import LinearRegression
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> X = bpd.DataFrame({
...     "feature0": [20, 21, 19, 18],
...     "feature1": [0, 1, 1, 0],
...     "feature2": [0.2, 0.3, 0.4, 0.5]})
>>> y = bpd.DataFrame({"outcome": [0, 0, 1, 1]})
>>> # Create the linear model
>>> model = LinearRegression()
>>> model.fit(X, y)
LinearRegression()
>>> # Score the model
>>> score = model.score(X, y)
>>> print(score) # doctest:+SKIP
   mean_absolute_error  mean_squared_error  mean_squared_log_error  \
0             0.022812            0.000602                 0.00035

   median_absolute_error  r2_score  explained_variance
0               0.015077  0.997591            0.997591
| Parameters | |
|---|---|
| Name | Description | 
| optimize_strategy | str, default "auto_strategy". The strategy used to train linear regression models. Possible values are "auto_strategy", "batch_gradient_descent", and "normal_equation". | 
| fit_intercept | bool, default True. Whether to fit an intercept term for the model. | 
| l1_reg | float or None, default None. The amount of L1 regularization applied. Cannot be set when optimize_strategy is "normal_equation". If unset, 0 is used. | 
| l2_reg | float, default 0.0. The amount of L2 regularization applied. | 
| max_iterations | int, default 20. The maximum number of training iterations or steps. | 
| warm_start | bool, default False. Determines whether to train a model with new training data, new model options, or both. Unless you explicitly override them, the initial options used to train the model are used for the warm start run. | 
| learning_rate | float or None, default None. The learning rate for gradient descent when learning_rate_strategy='constant'. If unset, 0.1 is used. If learning_rate_strategy='line_search', an error is returned. | 
| learning_rate_strategy | str, default "line_search". The strategy for specifying the learning rate during training. Possible values are "line_search" and "constant". | 
| tol | float, default 0.01. The minimum relative loss improvement that is necessary to continue training when EARLY_STOP is set to true. For example, a value of 0.01 specifies that each iteration must reduce the loss by 1% for training to continue. | 
| ls_init_learning_rate | float or None, default None. The initial learning rate that learning_rate_strategy='line_search' uses. This option can only be used if line_search is specified. If unset, 0.1 is used. | 
| calculate_p_values | bool, default False. Whether to compute p-values and standard errors during training. | 
| enable_global_explain | bool, default False. Whether to compute global explanations using explainable AI to evaluate global feature importance to the model. | 
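The interplay between `learning_rate`, `l2_reg`, `max_iterations`, and `tol` can be sketched in plain Python. This is an illustrative batch gradient descent loop with a constant learning rate, not the BigQuery ML implementation; the function name `train`, the mean-squared-error loss, and the way the L2 penalty is scaled are all assumptions made for the sketch.

```python
def train(xs, ys, learning_rate=0.1, l2_reg=0.0, max_iterations=20, tol=0.01):
    """Batch gradient descent for y ~ w * x + b with relative-loss early stopping."""
    n = len(xs)
    w, b = 0.0, 0.0

    def loss(w, b):
        mse = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
        return mse + l2_reg * w * w  # assumed penalty form: l2_reg * w^2

    prev = loss(w, b)
    iterations = 0
    for _ in range(max_iterations):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(r * x for r, x in zip(residuals, xs)) / n + 2 * l2_reg * w
        grad_b = 2 * sum(residuals) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        iterations += 1
        current = loss(w, b)
        # Stop when the relative improvement falls below tol
        # (for example, tol=0.01 requires at least a 1% reduction per step).
        if prev > 0 and (prev - current) / prev < tol:
            break
        prev = current
    return w, b, iterations


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1

w_full, b_full, it_full = train(xs, ys, tol=0.0)     # never stops early here
w_early, b_early, it_early = train(xs, ys, tol=0.5)  # demands 50% improvement per step
w_reg, b_reg, it_reg = train(xs, ys, l2_reg=1.0, tol=0.0)
print(it_full, it_early, w_full, w_reg)
```

A stricter `tol` ends training in fewer iterations, and a positive `l2_reg` shrinks the learned weight relative to the unregularized run.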
Methods
__repr__
__repr__()
Print the estimator's constructor with all non-default parameter values.
fit
fit(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
    y: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
    X_eval: typing.Optional[
        typing.Union[
            bigframes.dataframe.DataFrame,
            bigframes.series.Series,
            pandas.core.frame.DataFrame,
            pandas.core.series.Series,
        ]
    ] = None,
    y_eval: typing.Optional[
        typing.Union[
            bigframes.dataframe.DataFrame,
            bigframes.series.Series,
            pandas.core.frame.DataFrame,
            pandas.core.series.Series,
        ]
    ] = None,
) -> bigframes.ml.base._T
Fit linear model.
| Parameters | |
|---|---|
| Name | Description | 
| X | bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series. Series or DataFrame of shape (n_samples, n_features). Training data. | 
| y | bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series. Series or DataFrame of shape (n_samples,) or (n_samples, n_targets). Target values. Will be cast to X's dtype if necessary. | 
| X_eval | bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series. Series or DataFrame of shape (n_samples, n_features). Evaluation data. | 
| y_eval | bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series. Series or DataFrame of shape (n_samples,) or (n_samples, n_targets). Evaluation target values. Will be cast to X_eval's dtype if necessary. | 
| Returns | |
|---|---|
| Type | Description | 
| LinearRegression | Fitted estimator. | 
get_params
get_params(deep: bool = True) -> typing.Dict[str, typing.Any]
Get parameters for this estimator.
| Parameter | |
|---|---|
| Name | Description | 
| deep | bool, default True. If True, also return the parameters of contained sub-objects that are estimators. | 
| Returns | |
|---|---|
| Type | Description | 
| Dictionary | A dictionary of parameter names mapped to their values. | 
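The behavior described for `__repr__` and `get_params` follows a common estimator convention, which can be sketched with a small stand-in class. `TinyEstimator` below is hypothetical and is not part of bigframes; it only shows one way a parameter dictionary and a "non-default parameters only" repr can be derived from the constructor signature.

```python
import inspect


class TinyEstimator:
    """Hypothetical estimator illustrating get_params() and __repr__()."""

    def __init__(self, *, fit_intercept=True, l2_reg=0.0, max_iterations=20):
        self.fit_intercept = fit_intercept
        self.l2_reg = l2_reg
        self.max_iterations = max_iterations

    def get_params(self, deep=True):
        # `deep` is accepted for API compatibility; this sketch has no nested estimators.
        names = inspect.signature(type(self).__init__).parameters
        return {name: getattr(self, name) for name in names if name != "self"}

    def __repr__(self):
        signature = inspect.signature(type(self).__init__).parameters
        changed = [
            f"{name}={value!r}"
            for name, value in self.get_params().items()
            if value != signature[name].default
        ]
        return f"{type(self).__name__}({', '.join(changed)})"


print(TinyEstimator())            # TinyEstimator()
print(TinyEstimator(l2_reg=0.5))  # TinyEstimator(l2_reg=0.5)
print(TinyEstimator(l2_reg=0.5).get_params())
```

This mirrors the `LinearRegression()` output shown in the example above, where a model built with all defaults prints with an empty argument list.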
global_explain
global_explain() -> bigframes.dataframe.DataFrame
Provide explanations for an entire linear regression model.
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame | DataFrame containing feature importance values and corresponding attributions, designed to provide a global explanation of feature influence. | 
predict
predict(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
) -> bigframes.dataframe.DataFrame
Predict using the linear model.
| Parameter | |
|---|---|
| Name | Description | 
| X | bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series. Series or DataFrame of shape (n_samples, n_features). Samples. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.dataframe.DataFrame | DataFrame of shape (n_samples, n_input_columns + n_prediction_columns). Returns predicted values. | 
predict_explain
predict_explain(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
    *,
    top_k_features: int = 5
) -> bigframes.dataframe.DataFrame
Explain predictions for a linear regression model.
| Parameter | |
|---|---|
| Name | Description | 
| top_k_features | int, default 5. An INT64 value that specifies how many top feature attribution pairs are generated for each row of input data. The features are ranked by the absolute values of their attributions. If the value is greater than the number of features in the training data, the attributions of all features are returned. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame | DataFrame of predictions with explanation columns. | 
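The ranking rule for `top_k_features` (sort by absolute attribution, keep the first k, return everything if k exceeds the feature count) can be sketched in plain Python. The helper `top_k_attributions` and the attribution values are made up for illustration; the real method computes attributions in BigQuery.

```python
def top_k_attributions(attributions, top_k_features=5):
    """Return (feature, attribution) pairs ranked by absolute attribution."""
    ranked = sorted(
        attributions.items(), key=lambda item: abs(item[1]), reverse=True
    )
    # Slicing past the end is safe, so asking for more features than exist
    # simply returns all of them.
    return ranked[:top_k_features]


# Hypothetical attributions for one input row.
row = {"feature0": -0.40, "feature1": 0.05, "feature2": 0.25}

print(top_k_attributions(row, top_k_features=2))
print(top_k_attributions(row))  # default of 5 exceeds the 3 available features
```

Note that a large negative attribution outranks a smaller positive one, because the ordering uses the absolute value.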
register
register(vertex_ai_model_id: typing.Optional[str] = None) -> bigframes.ml.base._T
Register the model to Vertex AI.
After registering, go to the Google Cloud console (https://console.cloud.google.com/vertex-ai/models) to manage the model registry. Refer to https://cloud.google.com/vertex-ai/docs/model-registry/introduction for more options.
| Parameter | |
|---|---|
| Name | Description | 
| vertex_ai_model_id | Optional[str], default None. Optional string to use as the model ID in Vertex AI. If not set, defaults to 'bigframes_{bq_model_id}'. The Vertex AI model ID is truncated to 63 characters because of a Vertex AI length limit. | 
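The defaulting and truncation rule for the model ID can be sketched as follows. The helper `resolve_vertex_model_id` is hypothetical, and the sketch assumes truncation keeps the first 63 characters; the source states only the limit, not which end is cut.

```python
from typing import Optional


def resolve_vertex_model_id(
    bq_model_id: str, vertex_ai_model_id: Optional[str] = None, max_length: int = 63
) -> str:
    """Pick the Vertex AI model ID and enforce the 63-character limit."""
    if vertex_ai_model_id is None:
        vertex_ai_model_id = f"bigframes_{bq_model_id}"
    # Assumption: the ID is cut from the right, keeping the leading characters.
    return vertex_ai_model_id[:max_length]


print(resolve_vertex_model_id("my_model"))
print(len(resolve_vertex_model_id("m" * 100)))
```

Because long IDs are shortened, two BigQuery models with long names that share a prefix could map to the same default ID under this assumption, so passing an explicit short `vertex_ai_model_id` may be the safer choice.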
score
score(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
    y: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
) -> bigframes.dataframe.DataFrame
Calculate evaluation metrics of the model.
| Parameters | |
|---|---|
| Name | Description | 
| X | bigframes.dataframe.DataFrame or bigframes.series.Series. Series or DataFrame of shape (n_samples, n_features). Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead. | 
| y | bigframes.dataframe.DataFrame or bigframes.series.Series. Series or DataFrame of shape (n_samples,) or (n_samples, n_outputs). True values for X. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.dataframe.DataFrame | A DataFrame of the evaluation result. | 
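The column names in the `score` output shown in the example correspond to standard regression metrics. The sketch below computes them in plain Python using the usual textbook definitions; it assumes the BigQuery evaluation follows the same definitions, and `regression_metrics` is a hypothetical helper rather than a bigframes function.

```python
import math
from statistics import mean, median


def regression_metrics(y_true, y_pred):
    """Compute common regression metrics from true and predicted values."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]
    log_errors = [math.log1p(t) - math.log1p(p) for t, p in zip(y_true, y_pred)]

    y_bar = mean(y_true)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    err_bar = mean(errors)
    var_err = mean([(e - err_bar) ** 2 for e in errors])
    var_true = ss_tot / len(y_true)

    return {
        "mean_absolute_error": mean(abs_errors),
        "mean_squared_error": ss_res / len(errors),
        "mean_squared_log_error": mean([d * d for d in log_errors]),
        "median_absolute_error": median(abs_errors),
        "r2_score": 1 - ss_res / ss_tot,
        "explained_variance": 1 - var_err / var_true,
    }


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.6f}")
```

When the residuals average to zero, as they do here, `r2_score` and `explained_variance` coincide, which is consistent with the matching values in the example output above.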
to_gbq
to_gbq(
    model_name: str, replace: bool = False
) -> bigframes.ml.linear_model.LinearRegression
Save the model to BigQuery.
| Parameters | |
|---|---|
| Name | Description | 
| model_name | str. The name of the model. | 
| replace | bool, default False. Whether to replace the model if it already exists. | 
| Returns | |
|---|---|
| Type | Description | 
| LinearRegression | Saved model. |