Add polygon rasterization from GeoDataFrame (scanline fill, no GDAL)

## Problem

xarray-spatial can't rasterize vector data. If you have polygons in a GeoDataFrame or GeoParquet file and want to run slope, aspect, or zonal stats on them, you have to rasterize outside the library first.

rasterio would be the obvious choice, but it depends on GDAL, which is a heavy native dependency. This issue proposes a standalone rasterize function using scanline fill, with numpy/numba and cupy backends matching the rest of the library.

## Approach: scanline fill

Scanline fill converts polygons to rasters one row at a time: compute x-intersections with polygon edges, sort them, fill between pairs. Same algorithm GDAL uses internally.

The pipeline:

1. Parse GeoDataFrame into flat vertex/edge arrays (numpy, on host)
2. Build a sorted edge table (numpy)
3. For GPU: transfer the edge table to device (small relative to the output)
4. Scanline fill: `@ngjit` row loop on CPU, or a CUDA kernel with one thread per row on GPU
5. Output an `xr.DataArray` -- numpy on CPU, cupy on GPU (stays on device for downstream ops)

### GPU suitability

Without GDAL doing the work in C, the CPU path is numba -- so the bar for GPU to beat is lower than usual:

- Each raster row is independent: one thread per row, no synchronization
- Control flow is uniform (walk edges, compute intersections, fill). Only divergence is how many edges hit a given row
- Adjacent threads write adjacent memory (coalesced)
- The edge table is read-only and fits in L2 cache
- Output stays on device, so downstream xarray-spatial ops skip the host round-trip

`preview()` gets 100-240x GPU speedup on block reductions for 20K+ rasters on an A6000. Scanline fill has a similar work profile, so large output grids should see comparable gains.

### Why not point-in-polygon?

Testing every pixel against every polygon is O(pixels * polygons) and branch-heavy. Scanline fill is O(rows * edges), and edges are typically orders of magnitude fewer than pixels. Control flow is also more uniform across GPU threads.

## Function signature

```python
def rasterize(
    geometries,          # GeoDataFrame or list of (geometry, value) pairs
    width: int,
    height: int,
    bounds: tuple = None,        # (xmin, ymin, xmax, ymax); inferred from geometries if omitted
    fill: float = np.nan,
    dtype: np.dtype = np.float64,
    all_touched: bool = False,
    name: str = 'rasterize',
) -> xr.DataArray
```

Returns a 2D `xr.DataArray` with y/x coordinates derived from bounds and resolution.

## Scope

This issue covers polygon and MultiPolygon rasterization with numpy + cupy backends, burning a single attribute column.

Line/point support, dask (spatial partitioning per chunk), and multi-attribute 3D output can follow separately.

## Dependencies

- `geopandas` (geometry parsing)
- `shapely` (vertex extraction via `get_coordinates`, `get_rings`)
- No GDAL, no rasterio

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add polygon rasterization from GeoDataFrame (scanline fill, no GDAL) #989

Problem

Approach: scanline fill

GPU suitability

Why not point-in-polygon?

Function signature

Scope

Dependencies

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Add polygon rasterization from GeoDataFrame (scanline fill, no GDAL) #989

Description

Problem

Approach: scanline fill

GPU suitability

Why not point-in-polygon?

Function signature

Scope

Dependencies

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions