[Data] Move extension types to ray.data
#59420
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -217,10 +217,10 @@ def _convert_batch_type_to_numpy( | |
| ) | ||
| return data | ||
| elif pyarrow is not None and isinstance(data, pyarrow.Table): | ||
| - from ray.air.util.tensor_extensions.arrow import ( | ||
| + from ray.data._internal.arrow_ops import transform_pyarrow | ||
|
Contributor
Why is this still importing from data? Should this util also be moved to data?
Contributor
Yeah, I think you already created a duplicate in python/ray/data. Should we just delete this file?
Contributor
Oh, the reason this is here is that ray.train.predictors still depends on this utility, so this is a duplicate. How about we address this in a later PR, when we remove predictors?
||
| + from ray.data._internal.tensor_extensions.arrow import ( | ||
| get_arrow_extension_fixed_shape_tensor_types, | ||
| ) | ||
| - from ray.data._internal.arrow_ops import transform_pyarrow | ||
|
|
||
| column_values_ndarrays = [] | ||
|
|
||
|
|
@@ -292,7 +292,7 @@ def _cast_ndarray_columns_to_tensor_extension(df: "pd.DataFrame") -> "pd.DataFra | |
| # SettingWithCopyWarning was moved to pd.errors in Pandas 1.5.0. | ||
| SettingWithCopyWarning = pd.errors.SettingWithCopyWarning | ||
|
|
||
| - from ray.air.util.tensor_extensions.pandas import ( | ||
| + from ray.data._internal.tensor_extensions.pandas import ( | ||
| TensorArray, | ||
| column_needs_tensor_extension, | ||
| ) | ||
|
|
@@ -334,7 +334,7 @@ def _cast_tensor_columns_to_ndarrays(df: "pd.DataFrame") -> "pd.DataFrame": | |
| except AttributeError: | ||
| # SettingWithCopyWarning was moved to pd.errors in Pandas 1.5.0. | ||
| SettingWithCopyWarning = pd.errors.SettingWithCopyWarning | ||
| - from ray.air.util.tensor_extensions.pandas import TensorDtype | ||
| + from ray.data._internal.tensor_extensions.pandas import TensorDtype | ||
|
|
||
| # Try to convert any tensor extension columns to ndarray columns. | ||
| # TODO(Clark): Optimize this with propagated DataFrame metadata containing a list of | ||
|
|
||
This function is now duplicated across ray.air and ray.data, but it's not dead code. In a follow-up, I think we should remove this and update the reference in tensorflow_predictor to use the one defined in ray.data.
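
For illustration only, here is a rough sketch of an interim alternative to keeping two copies: the legacy location re-exports the canonical function, so both import paths resolve to the same object until the remaining caller is updated. This is not what the PR does. The module names `canonical_utils` and `legacy_utils` and the function `cast_columns` are hypothetical stand-ins, not Ray's actual layout.

```python
import sys
import types

# Hypothetical stand-in for the canonical module (the role ray.data plays here).
canonical = types.ModuleType("canonical_utils")


def _cast_columns(values):
    """Placeholder utility; the real function's body is not reproduced here."""
    return [list(v) for v in values]


canonical.cast_columns = _cast_columns
sys.modules["canonical_utils"] = canonical

# Hypothetical stand-in for the legacy module (the role ray.air plays here).
# Instead of holding its own copy, it re-exports the canonical function.
legacy = types.ModuleType("legacy_utils")
legacy.cast_columns = canonical.cast_columns
sys.modules["legacy_utils"] = legacy

# Both import paths now resolve to the same function object.
from canonical_utils import cast_columns as canonical_cast
from legacy_utils import cast_columns as legacy_cast

same_object = legacy_cast is canonical_cast
```

With a re-export like this, a fix made in the canonical module reaches callers of the legacy path as well, which is the drift risk a duplicated function carries.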