Skip to content

Feature: multi-stop least-cost pathfinding #935

@brendancol

Description

@brendancol

Summary

a_star_search finds an optimal path between two points on a cost surface. This proposal adds a multi_stop_search function that finds the least-cost route visiting N ordered waypoints, and optionally reorders the waypoints to minimize total travel cost (approximate TSP on a raster).

Motivation

Routing problems with more than two endpoints come up often:

  • Field surveys where you need to hit multiple sample sites across rough terrain
  • Infrastructure corridors connecting several facilities (wells, substations, sensors)
  • Search-and-rescue reaching multiple reported locations
  • Delivery across a friction surface derived from slope or land cover

Right now you'd call a_star_search in a loop, stitch segments by hand, and sort out waypoint ordering yourself. A single function call would save the boilerplate and let us add optimizations later (shared cost caches, early termination).

Proposed API

def multi_stop_search(
    surface: xr.DataArray,
    waypoints: list[tuple],
    barriers: list = [],
    x: str = 'x',
    y: str = 'y',
    connectivity: int = 8,
    snap: bool = False,
    friction: xr.DataArray = None,
    search_radius: int = None,
    optimize_order: bool = False,
) -> xr.DataArray:
    """Find the least-cost path visiting all waypoints in sequence.

    Parameters
    ----------
    surface : xr.DataArray
        2D raster whose values define passability (NaN = impassable).
    waypoints : list of (y, x) or (lat, lon) tuples
        Ordered stops to visit. Minimum 2 (equivalent to a_star_search).
    barriers : list
        Cell values treated as impassable.
    x, y : str
        Coordinate dimension names.
    connectivity : int
        4 or 8 (default 8).
    snap : bool
        Snap each waypoint to the nearest valid cell if it falls on a
        barrier or NaN.
    friction : xr.DataArray, optional
        Cost surface (positive values). Same shape as surface.
    search_radius : int, optional
        Pixel radius bounding each pairwise search.
    optimize_order : bool
        If True, solve for the cheapest visiting order (approximate TSP)
        before routing. The first and last waypoints are kept fixed as
        start and end; interior waypoints are reordered.

    Returns
    -------
    xr.DataArray
        Same shape as surface. Path pixels contain accumulated cost from
        the first waypoint; non-path pixels are NaN.
    """

Return value

Same convention as a_star_search: a DataArray where path cells hold accumulated cost and everything else is NaN. Cost accumulates continuously across segments, so the final waypoint cell holds the total route cost.

Routing metadata goes in result.attrs:

result.attrs['waypoint_order']   # final visit order (indices into input list)
result.attrs['segment_costs']    # list of per-segment costs
result.attrs['total_cost']       # scalar total cost

Implementation approach

MVP: ordered multi-stop

  1. Loop over consecutive waypoint pairs, calling a_star_search for each segment.
  2. Stitch segments into a single output array with continuous cost accumulation (segment K picks up where K-1 left off).
  3. Handle overlap at waypoints, where one segment ends and the next begins.
  4. All four backends (numpy, cupy, dask+numpy, dask+cupy) come for free since each segment delegates to a_star_search.

Then: optimize_order (approximate TSP)

When optimize_order=True:

  1. Build a pairwise cost matrix between all waypoints. N*(N-1)/2 searches (symmetric costs). Two strategies worth benchmarking:
    • Pairwise A* -- simple, works well for small N with search_radius.
    • Run cost_distance once per waypoint to get full cost maps, then read off pairwise costs. Fewer calls but each one is heavier.
  2. Solve ordering: for small N (say <= 12), exact DP via Held-Karp (O(N^2 * 2^N)). For larger N, nearest-neighbor heuristic + 2-opt improvement.
  3. Route the result using the ordered multi-stop logic above.

The TSP solver itself runs on CPU regardless of backend since it operates on a small N x N matrix.

Later

  • Cache intermediate cost maps across segments.
  • Parallelize independent pairwise A* calls.
  • Support round-trip (return to start) and fully open ordering (no fixed endpoints).

Testing plan

  • Verify multi-stop output matches manually stitched a_star_search segments with identical accumulated costs.
  • With optimize_order=True on a small grid (3-5 waypoints), verify the returned order matches a known-optimal solution.
  • Edge cases: 2 waypoints (should match plain a_star_search), duplicate waypoints, waypoints on barriers with snap=True, unreachable waypoints.
  • Cross-backend comparison using general_checks (numpy, cupy, dask, dask+cupy).
  • Confirm accumulated cost increases monotonically and is continuous across segment boundaries.

Open questions

  1. The current proposal fixes the first and last waypoints as start/end for TSP. Should we also support round-trip (return to origin) or fully open ordering?
  2. If one segment is unreachable, should the function raise, skip that segment, or return partial results with NaN gaps?
  3. Should we expose the pairwise cost matrix as a separate return value or utility? It is useful on its own for network analysis.
  4. What N cutoff for exact vs. heuristic TSP? Held-Karp is exact but exponential. 12-15 is the usual threshold.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions