tf.raw_ops.BoostedTreesSparseCalculateBestFeatureSplit
Calculates gains for each feature and returns the best possible split information for the feature.
tf.raw_ops.BoostedTreesSparseCalculateBestFeatureSplit(
    node_id_range,
    stats_summary_indices,
    stats_summary_values,
    stats_summary_shape,
    l1,
    l2,
    tree_complexity,
    min_node_weight,
    logits_dimension,
    split_type='inequality',
    name=None
)
The split information is the best threshold (bucket id), gains and left/right node contributions per node for each feature.
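To make "best threshold, gain and left/right node contributions" concrete, here is a minimal pure-Python sketch for a single node and a single feature dimension. It assumes a logits dimension of 1, a diagonal hessian, the common second-order gain formula (squared gradient sum over hessian sum plus l2, with l1 applied as soft-thresholding), and that bucket ids at or below the threshold go left. None of this is confirmed by this page. The helper names are hypothetical, and min_node_weight and default directions are not modeled.

```python
def soft_threshold(g, l1):
    """Shrink the gradient sum toward zero by l1 (assumed L1 handling)."""
    if g > l1:
        return g - l1
    if g < -l1:
        return g + l1
    return 0.0


def leaf_weight_and_gain(g, h, l1, l2):
    """Leaf contribution and gain for gradient sum g and hessian sum h."""
    g = soft_threshold(g, l1)
    denom = h + l2
    if denom <= 0.0:
        return 0.0, 0.0
    weight = -g / denom
    return weight, -g * weight  # gain equals g**2 / (h + l2)


def best_inequality_split(buckets, l1=0.0, l2=0.0, tree_complexity=0.0):
    """buckets: list of (gradient_sum, hessian_sum), one pair per bucket id.

    Returns (threshold, gain, left_contrib, right_contrib), or None when
    there are fewer than two buckets.
    """
    total_g = sum(g for g, _ in buckets)
    total_h = sum(h for _, h in buckets)
    _, parent_gain = leaf_weight_and_gain(total_g, total_h, l1, l2)

    best = None
    left_g = left_h = 0.0
    # Assumption: bucket ids <= threshold go left. The last bucket is skipped
    # because choosing it would leave the right child empty.
    for threshold, (g, h) in enumerate(buckets[:-1]):
        left_g += g
        left_h += h
        left_w, left_gain = leaf_weight_and_gain(left_g, left_h, l1, l2)
        right_w, right_gain = leaf_weight_and_gain(
            total_g - left_g, total_h - left_h, l1, l2)
        gain = left_gain + right_gain - parent_gain - tree_complexity
        if best is None or gain > best[1]:
            best = (threshold, gain, left_w, right_w)
    return best


# Three buckets of (gradient sum, hessian sum) for one node and one feature.
result = best_inequality_split([(-4.0, 2.0), (-2.0, 2.0), (6.0, 4.0)], l2=1.0)
threshold, gain, left_contrib, right_contrib = result
```

In this toy input the scan settles on bucket id 1 as the threshold, because the first two buckets carry negative gradients and the last one a positive gradient.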
Not every node can be split on every feature, so the list of splittable nodes can differ between features. The op therefore returns node_ids for each feature, listing the nodes that the feature can be used to split.
The output is thus the best split per feature and per node; it must be combined later to produce the best split for each node among all features.
The output shapes are compatible: the first dimension of every tensor is the same and equals the number of possible split nodes for each feature.
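The combination step mentioned above is not performed by this op. As a rough illustration only, the sketch below picks the highest-gain feature for each node from per-feature results held as plain Python lists. The function name, the dictionary layout, and the feature ids are hypothetical; in practice the inputs would be the node_ids, gains and thresholds tensors returned by one call per feature.

```python
def combine_best_splits(per_feature_results):
    """Pick, for each node, the feature with the highest gain.

    per_feature_results maps a (hypothetical) feature id to a dict with the
    parallel lists "node_ids", "gains" and "thresholds". The lists share
    their first dimension, mirroring the op's outputs.
    """
    best = {}
    for feature_id, result in per_feature_results.items():
        rows = zip(result["node_ids"], result["gains"], result["thresholds"])
        for node_id, gain, threshold in rows:
            # Keep the candidate only if it beats the current best for the node.
            if node_id not in best or gain > best[node_id]["gain"]:
                best[node_id] = {
                    "feature_id": feature_id,
                    "gain": gain,
                    "threshold": threshold,
                }
    return best


# Feature 0 can split nodes 1 and 2; feature 1 can only split node 2.
per_feature = {
    0: {"node_ids": [1, 2], "gains": [0.5, 0.1], "thresholds": [3, 7]},
    1: {"node_ids": [2], "gains": [0.4], "thresholds": [2]},
}
best_per_node = combine_best_splits(per_feature)
```

Here node 1 keeps its split on feature 0 (the only candidate), while node 2 switches to feature 1 because its gain of 0.4 beats 0.1.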
Args |
node_id_range
|
A Tensor of type int32.
A rank 1 tensor (shape=[2]) specifying the range [first, last) of node ids to process within the stats summary. The nodes are iterated as in for node_id in range(node_id_range[0], node_id_range[1]) (note that the last index, node_id_range[1], is exclusive).
|
stats_summary_indices
|
A Tensor of type int32.
A rank 2 tensor of dense shape [N, 4] (N is the number of non-zero values) holding the indices of the accumulated stats summary (gradient/hessian) per node per bucket for each feature. The second dimension contains node id, feature dimension, bucket id, and stats dim.
Stats dim is the sum of the logits dimension and the hessian dimension. The hessian dimension is either the logits dimension, if a diagonal hessian is used, or the logits dimension squared, if a full hessian is used.
|
stats_summary_values
|
A Tensor of type float32.
A rank 1 float tensor of dense shape [N] (N is the number of non-zero values), which supplies the value for each element in stats_summary_indices.
|
stats_summary_shape
|
A Tensor of type int32.
A rank 1 tensor of dense shape [4], which specifies the dense shape of the sparse tensor: [num tree nodes, feature dimensions, num buckets, stats dim].
|
l1
|
A Tensor of type float32.
L1 regularization factor on leaf weights, applied per instance.
|
l2
|
A Tensor of type float32.
L2 regularization factor on leaf weights, applied per instance.
|
tree_complexity
|
A Tensor of type float32.
Adjustment to the gain, applied per leaf.
|
min_node_weight
|
A Tensor of type float32.
Minimum average of hessians in a node required for the node to be considered for splitting.
|
logits_dimension
|
An int that is >= 1.
The dimension of the logits, i.e., the number of classes.
|
split_type
|
An optional string from: "inequality". Defaults to "inequality".
A string indicating whether this op should perform an inequality split or an equality split. Only "inequality" is listed as an accepted value.
|
name
|
A name for the operation (optional).
|
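The three stats_summary_* arguments together describe one sparse tensor in coordinate form. As a hedged illustration, the pure-Python sketch below computes stats dim from the logits dimension and converts a small dense summary, laid out as [node][feature dimension][bucket][stat], into the indices, values and shape lists described above. The helper names and the toy numbers are hypothetical.

```python
def stats_dim(logits_dimension, full_hessian=False):
    """Stats dim = logits dimension + hessian dimension."""
    if full_hessian:
        hessian_dim = logits_dimension ** 2
    else:
        hessian_dim = logits_dimension  # diagonal hessian
    return logits_dimension + hessian_dim


def dense_to_sparse(dense):
    """Convert dense[node][feature_dim][bucket][stat] to (indices, values, shape).

    Only non-zero entries are kept, so len(indices) == len(values) == N.
    """
    shape = [len(dense), len(dense[0]), len(dense[0][0]), len(dense[0][0][0])]
    indices, values = [], []
    for node_id, node in enumerate(dense):
        for feature_dim, feature in enumerate(node):
            for bucket_id, bucket in enumerate(feature):
                for stat, value in enumerate(bucket):
                    if value != 0.0:
                        indices.append([node_id, feature_dim, bucket_id, stat])
                        values.append(float(value))
    return indices, values, shape


# Two nodes, one feature dimension, three buckets, stats dim 2
# (gradient and hessian for a logits dimension of 1, diagonal hessian).
dense = [
    [[[0.5, 1.0], [0.0, 0.0], [-0.25, 2.0]]],  # node 0
    [[[0.0, 0.0], [1.5, 1.0], [0.0, 0.0]]],    # node 1
]
indices, values, shape = dense_to_sparse(dense)
```

Following the types listed in the argument table, the resulting lists would then be wrapped as tensors, for example int32 for indices and shape and float32 for values, before being passed to the op.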
Returns |
A tuple of Tensor objects (node_ids, gains, feature_dimensions, thresholds, left_node_contribs, right_node_contribs, split_with_default_directions).
|
node_ids
|
A Tensor of type int32.
|
gains
|
A Tensor of type float32.
|
feature_dimensions
|
A Tensor of type int32.
|
thresholds
|
A Tensor of type int32.
|
left_node_contribs
|
A Tensor of type float32.
|
right_node_contribs
|
A Tensor of type float32.
|
split_with_default_directions
|
A Tensor of type string.
|
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates. Some content is licensed under the numpy license.
Last updated 2024-04-26 UTC.