xrspatial.classify.percentiles

xrspatial.classify.percentiles(agg: DataArray, pct: List | None = None, num_sample: int | None = 20000, name: str | None = 'percentiles') → DataArray

Classify data based on percentile breakpoints.

Parameters:
  • agg (xr.DataArray or xr.Dataset) – 2D NumPy, CuPy, NumPy-backed Dask, or CuPy-backed Dask array of values to be classified.

  • pct (list of float, default=[1, 10, 50, 90, 99]) – Percentile values to use as breakpoints. Passing None selects the default list.

  • num_sample (int or None, default=20000) – Number of sample data points used to compute the percentile breakpoints. For Dask-backed arrays, the sample is drawn lazily to avoid materialising the entire array in RAM. None means use all data (safe for NumPy and CuPy arrays; automatically capped for Dask arrays).

  • name (str, default='percentiles') – Name of output aggregate array.
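
The effect of num_sample can be illustrated with plain NumPy. The following is only a sketch under stated assumptions, not xrspatial's implementation: `sampled_breakpoints` is a hypothetical helper, and the uniform sampling without replacement, the fixed seed, and the filtering of non-finite values are all choices made for illustration.

```python
import numpy as np


def sampled_breakpoints(values, pct, num_sample=20000, seed=0):
    """Hypothetical helper: percentile breakpoints from a random subsample."""
    flat = np.asarray(values, dtype=float).ravel()
    # Assumption: NaN and +/-inf are ignored when computing breakpoints.
    flat = flat[np.isfinite(flat)]
    # Subsample only when a limit is given and the data exceeds it.
    if num_sample is not None and flat.size > num_sample:
        rng = np.random.default_rng(seed)
        flat = rng.choice(flat, size=num_sample, replace=False)
    return np.percentile(flat, pct)


data = np.random.default_rng(42).normal(size=(1000, 1000))
pct = [1, 10, 50, 90, 99]
exact = sampled_breakpoints(data, pct, num_sample=None)    # all one million values
approx = sampled_breakpoints(data, pct, num_sample=20000)  # 20,000 sampled values
max_gap = float(np.max(np.abs(exact - approx)))
print(max_gap)  # largest difference between exact and sampled breakpoints
```

In this sketch the sampled breakpoints stay close to the exact ones, which is the trade-off a sample size limit makes: a small loss of precision in exchange for bounded memory and compute cost.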

Returns:

percentiles_agg – 2D aggregate array of percentile classifications. All other input attributes are preserved. If agg is a Dataset, returns a Dataset with each variable classified independently.

Return type:

xr.DataArray or xr.Dataset

Examples

>>> import numpy as np
>>> import xarray as xr
>>> from xrspatial.classify import percentiles
>>> elevation = np.array([
...     [np.nan,  1.,  2.,  3.,  4.],
...     [ 5.,  6.,  7.,  8.,  9.],
...     [10., 11., 12., 13., 14.],
...     [15., 16., 17., 18., 19.],
...     [20., 21., 22., 23., np.inf]
... ])
>>> agg_numpy = xr.DataArray(elevation, attrs={'res': (10.0, 10.0)})
>>> numpy_percentiles = percentiles(agg_numpy)
>>> print(numpy_percentiles)
<xarray.DataArray 'percentiles' (dim_0: 5, dim_1: 5)>
array([[nan,  0.,  1.,  1.,  2.],
       [ 2.,  2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  3.,  3.],
       [ 3.,  3.,  3.,  3.,  3.],
       [ 3.,  4.,  4.,  5., nan]], dtype=float32)
Dimensions without coordinates: dim_0, dim_1
Attributes:
    res:      (10.0, 10.0)
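
To make the class labels in the output above easier to interpret, the sketch below reimplements the classification with plain NumPy. It is an illustration under assumptions, not the library's code: `classify_by_percentiles` is a hypothetical helper, breakpoints are taken from the finite values only, a value equal to a breakpoint goes to the lower class, and non-finite cells become NaN. Under those assumptions it reproduces the array printed above for this particular input.

```python
import numpy as np


def classify_by_percentiles(values, pct=(1, 10, 50, 90, 99)):
    """Hypothetical helper: label each finite cell with its percentile class."""
    values = np.asarray(values, dtype=float)
    finite = np.isfinite(values)
    # Assumption: breakpoints come from finite values only.
    bins = np.percentile(values[finite], pct)
    out = np.full(values.shape, np.nan, dtype=np.float32)
    # right=True: class i holds values with bins[i-1] < x <= bins[i].
    out[finite] = np.digitize(values[finite], bins, right=True)
    return out, bins


elevation = np.array([
    [np.nan,  1.,  2.,  3.,  4.],
    [ 5.,  6.,  7.,  8.,  9.],
    [10., 11., 12., 13., 14.],
    [15., 16., 17., 18., 19.],
    [20., 21., 22., 23., np.inf],
])
classes, bins = classify_by_percentiles(elevation)
print(bins)     # breakpoints at the 1st, 10th, 50th, 90th and 99th percentiles
print(classes)  # class labels 0..5, NaN where the input is not finite
```

For this input the breakpoints are approximately 1.22, 3.2, 12, 20.8 and 22.78. Five breakpoints yield six classes (0 through 5), which is why the value 23 lands in class 5 while 21 and 22 share class 4.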