Dask backend behavior
When you pass a dask-backed DataArray to an xarray-spatial function, the
result should also be dask-backed so your pipeline stays lazy until you call
.compute(). Most functions do this, but some algorithms need random access
to the full array and have to materialize intermediate results.
This page lists every public function and its laziness level so you can plan dask pipelines without reading source code.
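To see whether a pipeline is still lazy at a given step, you can inspect the array that backs the result. The sketch below is a minimal illustration, not part of xarray-spatial: `is_dask_backed` is a hypothetical helper, the raster is synthetic, and a plain arithmetic expression stands in for an xarray-spatial call.

```python
import dask.array as da
import xarray as xr


def is_dask_backed(arr: xr.DataArray) -> bool:
    """Return True if the DataArray wraps a dask array (values not loaded yet)."""
    return isinstance(arr.data, da.Array)


# Synthetic input (assumption): a 1000 x 1000 raster in 250 x 250 chunks.
raster = xr.DataArray(
    da.random.random((1000, 1000), chunks=(250, 250)),
    dims=("y", "x"),
    name="elevation",
)

# Stand-in for a lazy raster operation; an xarray-spatial call would go here.
result = raster * 2.0

lazy_before = is_dask_backed(result)  # only a task graph exists at this point
loaded = result.compute()             # triggers the actual computation
lazy_after = is_dask_backed(loaded)   # the computed copy is numpy-backed
```

Running the same check on the output of each step is a quick way to find where a pipeline stops being lazy.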
Laziness levels
Fully lazy – the function returns a dask array without triggering any computation. Safe for arbitrarily large out-of-core datasets.
Partially lazy – the function computes small bounded statistics (scalars, quartiles, a ~20K sample) during setup, then returns a dask array for the main result. The statistics are cheap; the heavy work stays lazy.
Fully materialized – the algorithm needs the entire array in memory
(connected-component labeling, A* search, viewshed sweepline, etc.). The
result may be re-wrapped as dask, but the function calls .compute()
internally. Watch your memory on large inputs.
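The three levels correspond to three common dask patterns. The sketch below illustrates each with plain `dask.array` code. It is not taken from xarray-spatial: the function names, the 5-cell smoothing kernel, and the cumulative-sum stand-in for a "global" algorithm are all assumptions chosen to keep the example short.

```python
import dask
import dask.array as da
import numpy as np


def _smooth(block: np.ndarray) -> np.ndarray:
    """Average each interior cell with its four direct neighbours."""
    out = block.copy()
    out[1:-1, 1:-1] = (
        block[1:-1, 1:-1]
        + block[:-2, 1:-1] + block[2:, 1:-1]
        + block[1:-1, :-2] + block[1:-1, 2:]
    ) / 5.0
    return out


def fully_lazy(arr: da.Array) -> da.Array:
    """Neighbourhood operation per chunk with a 1-cell halo; nothing runs here."""
    return arr.map_overlap(_smooth, depth=1, boundary="nearest", dtype=arr.dtype)


def partially_lazy(arr: da.Array) -> da.Array:
    """Z-score: two scalars are computed eagerly, the per-cell result stays lazy."""
    mean, std = dask.compute(arr.mean(), arr.std())
    return (arr - mean) / std


def fully_materialized(arr: da.Array) -> da.Array:
    """Load the whole array, run an in-memory algorithm, re-wrap as dask."""
    values = arr.compute()  # the entire array is in memory from here on
    # Stand-in for an algorithm that needs random access to every cell.
    out = np.cumsum(values.ravel()).reshape(values.shape)
    return da.from_array(out, chunks=arr.chunks)


arr = da.random.random((400, 400), chunks=(100, 100))
smoothed = fully_lazy(arr)
zscores = partially_lazy(arr)
rewrapped = fully_materialized(arr)
```

All three functions return a dask array, so the return type alone does not tell you which level a function belongs to. What differs is how much work has already happened by the time the function returns.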
Terrain metrics
| Function | Laziness | Notes |
|---|---|---|
|  | Fully lazy |  |
|  | Fully lazy |  |
|  | Fully lazy |  |
|  | Fully lazy |  |
|  | Fully lazy | Uses |
|  | Fully lazy | Uses |
Focal operations
| Function | Laziness | Notes |
|---|---|---|
|  | Fully lazy | Iterative |
|  | Fully lazy |  |
|  | Fully lazy | Multiple stats via |
|  | Partially lazy | Computes global mean and std, result is dask |
Classification
| Function | Laziness | Notes |
|---|---|---|
|  | Fully lazy |  |
|  | Fully lazy |  |
|  | Partially lazy | Computes percentiles from ~20K sample |
|  | Partially lazy | Computes Jenks breaks from ~20K sample + scalar max |
|  | Partially lazy | Computes scalar min/max |
|  | Partially lazy | Computes scalar mean/std/max |
|  | Partially lazy | Computes O(log N) scalar means |
|  | Partially lazy | Computes percentiles from ~20K sample |
|  | Partially lazy | Computes breaks from ~20K sample |
|  | Partially lazy | Computes scalar quartiles and max |
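The sample-based rows follow a common pattern: pull a small, bounded sample into memory, derive class breaks from it, then apply the breaks lazily to every chunk. The sketch below illustrates that pattern with plain `dask.array`. It is an assumption-laden example, not xarray-spatial's implementation: `quantile_classify` is a hypothetical name, and the strided (systematic) sampling is chosen for simplicity and may differ from how the library draws its sample.

```python
import dask.array as da
import numpy as np

# Mirrors the "~20K sample" in the notes; the sampling scheme itself is assumed.
SAMPLE_SIZE = 20_000


def quantile_classify(arr: da.Array, k: int = 5) -> da.Array:
    """Assign each cell to one of k quantile classes using a bounded sample."""
    flat = arr.ravel()
    step = max(1, flat.size // SAMPLE_SIZE)
    # Eager part: only about SAMPLE_SIZE values are ever loaded.
    sample = flat[::step].compute()
    sample = sample[np.isfinite(sample)]
    # Interior percentiles, e.g. 20/40/60/80 for k=5.
    percentiles = np.linspace(0, 100, k + 1)[1:-1]
    breaks = np.percentile(sample, percentiles)
    # Lazy part: binning is applied chunk by chunk when the result is computed.
    return da.digitize(arr, breaks)


arr = da.random.random((1000, 1000), chunks=(250, 250))
classes = quantile_classify(arr, k=5)
```

Because the breaks come from a sample, class boundaries are approximate. The memory cost of the eager step is bounded by the sample size rather than by the size of the input.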
Normalization
| Function | Laziness | Notes |
|---|---|---|
|  | Fully lazy |  |
|  | Fully lazy |  |
Visibility
| Function | Laziness | Notes |
|---|---|---|
|  | Fully materialized | Sweepline algorithm needs random access |
|  | Fully materialized | Extracts 1D transect via |
|  | Fully materialized | Runs multiple viewshed calls |
|  | Fully materialized | Wraps |
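Before calling a fully materialized function on a large raster, it can help to estimate how much memory the materialized array will need. A dask array reports its total size without computing anything. The sketch below is illustrative only: `check_fits` is a hypothetical helper, and the `copies=3` factor (input, working buffers, output) is a rough assumption rather than a measured figure for any xarray-spatial function.

```python
import dask.array as da


def check_fits(arr: da.Array, budget_bytes: int, copies: int = 3) -> int:
    """Raise MemoryError if `copies` full-size copies of arr exceed the budget.

    Returns the estimated requirement in bytes. Nothing is computed:
    nbytes is derived from the array's shape and dtype alone.
    """
    needed = arr.nbytes * copies
    if needed > budget_bytes:
        raise MemoryError(
            f"estimated {needed / 1e9:.1f} GB needed, "
            f"budget is {budget_bytes / 1e9:.1f} GB"
        )
    return needed


# Synthetic 20000 x 20000 float32 raster (assumption): 1.6 GB when materialized.
raster = da.zeros((20_000, 20_000), chunks=(2_000, 2_000), dtype="float32")

needed = check_fits(raster, budget_bytes=8 * 1024**3)  # fits in an 8 GiB budget
```

If the estimate exceeds the budget, options include cropping to the area of interest or coarsening the raster before the call.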
Morphology
| Function | Laziness | Notes |
|---|---|---|
|  | Fully materialized | Connected-component labeling needs the full array; result re-wrapped as dask |
Proximity
| Function | Laziness | Notes |
|---|---|---|
|  | Fully materialized | Distance computation needs full array |
|  | Fully materialized | Nearest-source allocation |
|  | Fully materialized | Direction to nearest source |
Zonal
| Function | Laziness | Notes |
|---|---|---|
|  | Partially lazy | Groupby aggregation via dask dataframe |
|  | Partially lazy | Groupby cross-tabulation |
|  | Fully lazy |  |
|  | Fully materialized | Connected-component labeling |
|  | Fully lazy | Lazy slicing |
|  | Fully lazy | Lazy slicing |
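Slicing a dask-backed DataArray only rewrites the task graph, which is why slicing-based operations can stay fully lazy. The sketch below shows the general mechanism with xarray's `isel` on a synthetic raster. It illustrates lazy slicing in general and does not reproduce the logic of the functions listed above.

```python
import dask.array as da
import xarray as xr

# Synthetic input (assumption): a 4000 x 4000 raster in 1000 x 1000 chunks.
raster = xr.DataArray(
    da.random.random((4000, 4000), chunks=(1000, 1000)),
    dims=("y", "x"),
)

# Select a 1000 x 1000 window; no chunk is loaded or computed here.
window = raster.isel(y=slice(500, 1500), x=slice(2000, 3000))

still_lazy = isinstance(window.data, da.Array)
```

When the window is eventually computed, dask only needs the chunks that overlap it, not the whole raster.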
Pathfinding
| Function | Laziness | Notes |
|---|---|---|
|  | Fully materialized | A* needs random access and visited-set tracking |
|  | Fully materialized | Iterative A* |