-
Notifications
You must be signed in to change notification settings - Fork 86
Description
Author of Proposal:
Reason or Problem
Gray-Level Co-occurrence Matrix (GLCM) texture features (Haralick et al. 1973) are standard in remote sensing classification. They capture spatial patterns that spectral bands miss: forest vs. cropland, urban vs. bare soil, water vs. shadow, all based on texture rather than color.
The Python option is scikit-image's graycomatrix / graycoprops, which operates on single NumPy arrays with no Dask chunking or GPU support. A sliding GLCM window over a 10000x10000 satellite scene on CPU is painfully slow.
Proposal
Design:
Add a glcm_texture function that computes GLCM-derived features over a sliding window. The function quantizes the input raster to levels gray levels (default 64), builds the co-occurrence matrix for each window position, and returns one or more texture metrics.
Supported metrics (following Haralick 1973):
contrast-- intensity contrast between neighboring pixelsdissimilarity-- mean absolute differencehomogeneity-- inverse difference momentenergy-- sum of squared GLCM entries (angular second moment)correlation-- linear dependency of gray levelsentropy-- randomness of the co-occurrence distribution
The implementation uses the established focal pattern:
- NumPy:
@ngjitdouble loop over the window, accumulate into alevels x levelsGLCM matrix per cell - CuPy:
@cuda.jitkernel with shared memory for the per-thread GLCM matrix (fits in shared mem for levels <= 64) - Dask:
map_overlapwithdepth=window_size//2 - Dask+CuPy:
map_overlapdispatching to the CuPy kernel
New file: xrspatial/glcm.py
Usage:
from xrspatial import glcm_texture
# single metric
contrast = glcm_texture(band, metric='contrast', window_size=7, levels=64)
# multiple metrics at once (avoids rebuilding the GLCM)
textures = glcm_texture(band, metric=['contrast', 'homogeneity', 'entropy'],
window_size=7, levels=64, distance=1, angle=0)Value: GLCM texture comes up constantly in land cover classification and geological mapping. scikit-image can compute it but not in a tiled or GPU-accelerated way. A Numba/CUDA version with map_overlap tiling would make multi-gigapixel imagery practical without leaving xarray.
Stakeholders and Impacts
Remote sensing analysts, land cover mapping workflows. Adds a new public function and source file. No changes to existing code. Should be registered in the .xrs accessor.
Drawbacks
- GLCM computation is O(window_size^2 * levels^2) per cell. The
levelsparameter needs to stay reasonable (32-64) or it blows up. - The
angleparameter (0, 45, 90, 135 degrees) affects results. Users unfamiliar with GLCM may not know which angle to pick. Defaulting to the mean over all 4 angles would help. - GPU shared memory limits the maximum
levelsvalue. At levels=64, the GLCM matrix is 64644 = 16KB per thread block, which fits fine. Levels=256 would not.
Alternatives
- Wrap scikit-image's
graycomatrix. No GPU or Dask support, and it returns the raw GLCM rather than per-pixel texture maps. - Gabor filters as a texture proxy. Faster but captures different (frequency-domain) information.
- LBP (Local Binary Pattern) as an alternative texture descriptor. Simpler but less informative for many remote sensing applications.
Unresolved Questions
- Should the function support multiple
distancevalues (e.g., 1, 2, 3 pixels) in a single call? - Should the default be a single angle or the mean over all four standard angles?
- Should quantization live inside
glcm_textureor be a separate utility?