Skip to content

Conversation

@emmanuelmathot
Copy link

@emmanuelmathot emmanuelmathot commented Dec 12, 2025

Draft for Zarr conventions support according to plan in #882

- Introduced a new module `rioxarray/_convention/zarr.py` for handling Zarr conventions.
- Implemented functions to read and write CRS and spatial metadata according to Zarr specifications.
- Added an enumeration `Convention` in `rioxarray/enum.py` to manage geospatial metadata conventions (CF and Zarr).
- Updated `rioxarray/_options.py` to include convention options and validators.
- Modified `rioxarray/rioxarray.py` to utilize the new convention architecture for reading and writing CRS and transforms.
- Created integration tests for the new convention architecture in `test/integration/test_convention_architecture.py`.
- Added unit tests for convention functionality in `test/test_convention_architecture.py`.
@snowman2
Copy link
Member

If convention is None it should attempt to read both with preference to CF.

- Added entry in history.rst for Zarr spatial and proj conventions support with Convention enum and Zarr-specific methods (corteva#883).
- Included a new section for conventions in index.rst.
- Documented the rioxarray.Convention class in rioxarray.rst.
…hod and improve dimension detection fallback
@snowman2
Copy link
Member

In general, I would like to see only the minimum code needed for rio.write_crs and rio.write_tranform added to this MR. All other code can be discussed later. The reason is to simplify code maintenance and the diff in this PR.

This process may even benefit from breaking down the pieces into separate MRs:

  • MR 1: Add convention option & move CF to _conventions/cf.py
  • MR 2: Reading Zarr CRS, Transform, spatial
  • MR 3: Writing Zarr CRS
  • MR 4: Writing Zarr Transform
  • MR ...: Other Zarr convention components

Dividing it up this way will help each component get the review it needs. But, we can stick to a big MR if you prefer,

return json.loads(projjson_str)


def calculate_spatial_bbox(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use rasterio.transform.array_bounds to simplify this.


When reading geospatial metadata, rioxarray follows this priority order based on the global convention setting:

- **None (default)**: CF conventions first, with Zarr conventions as fallback if explicitly declared
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like both conventions attempted when reading files regardless of the convention setting. The convention setting changing the search priority is fine. This way you can read in files from either format at the same time with the convention setting set.

@snowman2
Copy link
Member

For testing:

  • Please don't use the class format. Instead, use the functional format.

Imports:

  • Please import at the top of the file.


# Handle PROJJSON dict (Zarr proj:projjson convention)
if isinstance(crs_input, dict):
crs_input = CRS.from_json_dict(crs_input)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be handled with CRS.from_user_input below.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return attrs_out


def write_conventions(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an extra function I would prefer not included in this PR.

"- [rio.write_transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.write_transform)\n",
"- [rio.transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.transform)"
"- [rio.transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.transform)\n",
"- [rio.write_zarr_crs()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.write_zarr_crs) - New Zarr method\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The zarr methods are removed.

"- [rio.write_transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.write_transform)\n",
"- [rio.transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.transform)"
"- [rio.transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.transform)\n",
"- [rio.write_zarr_transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.write_zarr_transform) - Zarr-specific method"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zarr method removed.

"id": "5e08da24",
"metadata": {},
"source": [
"## Zarr-Specific Methods\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These have been removed.

from rioxarray._io import open_rasterio
from rioxarray._options import set_options
from rioxarray._show_versions import show_versions
from rioxarray.enum import Convention
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer this not be added here and stay in the enum namespace for now.

.. autoclass:: rioxarray.set_options


rioxarray.Convention
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to docs/enum.rst

requires-python = ">=3.12"
dependencies = [
"packaging",
"rasterio>=1.4.3",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't remove the dependency pin.

@@ -1,5 +1,5 @@
default_language_version:
python: python3.12
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't modify this in this PR.

all = [
"scipy"
]
test = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't modify dependencies in this PR.

Copy link

@alfredoahds alfredoahds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seconding Alan's comment about minimizing the MR surface making this easier to review.

Overall this seems good, one nitpick is the recurring series of

if convention is ...:
   ...
elif convention is ...:
   ...

Maybe we can define a common interface for the convention modules and simplify this down? There will be some unused args in the zarr methods but having something like

convention = _get_convention(convention_input or get_option(CONVENTION))
parsed_crs = convention.read_crs(...)

would reduce that duplication/series of ifs.

Comment on lines +283 to +311
# Use CF convention logic for dimension detection
# Also use this as fallback when convention is None
if "x" in self._obj.dims and "y" in self._obj.dims:
self._x_dim = "x"
self._y_dim = "y"
elif "longitude" in self._obj.dims and "latitude" in self._obj.dims:
self._x_dim = "longitude"
self._y_dim = "latitude"
else:
# look for coordinates with CF attributes
for coord in self._obj.coords:
# make sure to only look in 1D coordinates
# that has the same dimension name as the coordinate
if self._obj.coords[coord].dims != (coord,):
continue
if (
self._obj.coords[coord].attrs.get("axis", "").upper() == "X"
) or (
self._obj.coords[coord].attrs.get("standard_name", "").lower()
in ("longitude", "projection_x_coordinate")
):
self._x_dim = coord
elif (
self._obj.coords[coord].attrs.get("axis", "").upper() == "Y"
) or (
self._obj.coords[coord].attrs.get("standard_name", "").lower()
in ("latitude", "projection_y_coordinate")
):
self._y_dim = coord

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on moving this bit into rioxarray._convention.cf.read_spatial_dimensions so there's a consistent interface for the convention options?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Proposal: Support Zarr Spatial and Geo-Projection Conventions in rioxarray

3 participants