xrspatial.rasterize.rasterize#

xrspatial.rasterize.rasterize(geometries: Any, width: int | None = None, height: int | None = None, bounds: Tuple[float, float, float, float] | None = None, column: str | None = None, columns: Sequence[str] | None = None, fill: float = nan, dtype: str | dtype | type | None = None, all_touched: bool = False, gpu: bool = False, name: str = 'rasterize', resolution: float | Tuple[float, float] | None = None, like: DataArray | None = None, merge: str | Callable = 'last', chunks: int | Tuple[int, int] | None = None, max_pixels: int = 1000000000, check_crs: bool = True, use_cuda: bool | None = None) DataArray[source]#

Rasterize vector geometries into a 2D DataArray.

Converts geometries from a GeoDataFrame or a list of (geometry, value) pairs into a regularly gridded raster. No GDAL dependency.

Supported geometry types:

  • Polygon / MultiPolygon – scanline fill

  • LineString / LinearRing / MultiLineString – Bresenham line rasterization (a LinearRing burns its closed boundary)

  • Point / MultiPoint – direct pixel burn

  • GeometryCollection – recursively unpacked

A non-empty geometry of any other type is dropped with a warning rather than silently discarded. Likewise, a geometry containing non-finite coordinates (NaN or infinity) has no defined location on the raster grid and is dropped with a warning.

Parameters:
  • geometries (GeoDataFrame or iterable of (geometry, value)) – Input vector data. If a GeoDataFrame, the column parameter selects which attribute to burn into the raster (defaults to the first numeric column). If an iterable, each element must be a (shapely.geometry, numeric_value) pair.

  • width (int, optional) – Number of columns in the output raster. Required unless resolution or like is given. Must be passed together with height: a partial override (only one of the two) is rejected with ValueError rather than silently filling the missing dimension from like or resolution.

  • height (int, optional) – Number of rows in the output raster. Required unless resolution or like is given. Must be passed together with width (see above).

  • bounds (tuple of (xmin, ymin, xmax, ymax), optional) – Geographic extent of the output raster. Inferred from the geometries (or like) if omitted.

  • column (str, optional) –

    Name of the GeoDataFrame column whose values are burned into the raster. Ignored when geometries is a list of pairs. Mutually exclusive with columns.

    A string or categorical column is label-encoded: each distinct label gets an integer code 0..N-1 and the result is an int32 band with a -1 nodata sentinel (unless dtype / fill are passed explicitly). Plain string/object columns are ordered lexically; an existing pandas Categorical keeps its declared order. The label map is stored on the result as attrs['category_names'] (index == pixel code) plus an auto-generated attrs['category_colors'] (one RGBA per class). to_geotiff writes these to a PAM <file>.aux.xml sidecar so GDAL/QGIS display the class names, and open_geotiff reads them back.

  • columns (list of str, optional) – Names of multiple GeoDataFrame columns to pass as a properties array to the merge function. Mutually exclusive with column. When given, the merge function receives a 1D float64 array of length len(columns) as its props argument.

  • fill (float, default np.nan) – Value for pixels not covered by any geometry. When dtype resolves to an integer or boolean type (either via the dtype argument or via like), a fill the dtype cannot represent exactly is rejected with ValueError. This covers NaN into an integer dtype (which would land on a platform sentinel), an out-of-range integer (fill=-9999 into uint8 wraps to 241), and any non-False value into bool (which collapses to True). Without the guard the burned array and its emitted nodata / _FillValue attrs would disagree. Pass a fill the dtype can hold (e.g. fill=0 or fill=-9999 for integers, fill=False for bool) or use a floating dtype.

  • dtype (numpy dtype, dtype name, or scalar type, optional) – Data type of the output array. Accepts anything np.dtype() can parse: an np.dtype instance, a dtype name string ('int32'), or a numpy scalar type (np.int32). Defaults to np.float64, or to the dtype of like if provided. When this resolves to an integer type, burn values are validated against the float64 safe integer range: the rasterizer computes in float64, so a value with magnitude above 2**53 - 1 cannot be cast back to an exact integer (e.g. 2**53 + 1 would land on 2**53). Such a value is rejected with ValueError rather than silently rounded; use a floating dtype to burn identifiers larger than that. Non-finite burn values (NaN, inf) are likewise rejected for integer and bool dtypes – the final cast would otherwise land them on a platform sentinel (NaN becomes -2147483648 for int32, True for bool). merge='count' is exempt because it burns overlap counts, never the property values.

  • all_touched (bool, default False) – If True, every pixel a geometry passes through is burned, using a supercover (Amanatides & Woo) grid traversal: polygon boundaries in addition to the normal center-fill, and every cell a line crosses rather than only the Bresenham path. Output matches rasterio.features.rasterize(..., all_touched=True) pixel-for-pixel up to rasterization tie-breaking on shared edges and exact cell-corner crossings. If False, only pixels whose centers fall inside a polygon (or on the Bresenham line) are burned.

  • gpu (bool, default False) – If True, use the CuPy/CUDA backend. Same convention as open_geotiff(gpu=True); combine with chunks for the dask+cupy backend.

  • name (str, default 'rasterize') – Name for the output DataArray.

  • resolution (float or (x_res, y_res), optional) – Pixel size. Must be finite and > 0. When given with bounds, computes width and height automatically. A single float uses the same resolution for both axes. The requested cell size is honored exactly. When the bounds don’t divide evenly by the resolution, the output grid extends past the original xmax / ymin (anchored at xmin / ymax) rather than shrinking the cell to fit. Matches the rasterio.transform.from_origin convention.

  • like (xr.DataArray, optional) – Template raster. Width, height, bounds, and dtype are copied from this array. Bounds and dtype can be overridden one at a time; width and height must be overridden together (passing only one raises ValueError). Must have uniformly spaced x and y dim coords – the rasterizer only writes to a regular grid, so a non-uniform like is rejected with ValueError rather than silently producing pixel labels that don’t match the data. Both descending (top-down, ymax first) and ascending (bottom-up, ymin first) y coords are supported – the burned rows are flipped to match so result.sel(y=...) lines up with the geometry in world coordinates either way, and result.y always equals like.y exactly.

  • merge (str or callable, default 'last') –

    How to combine values when geometries overlap.

    Built-in modes (pass as string):

    • 'last' – last geometry in input order wins

    • 'first' – first geometry wins

    • 'max' / 'min' – keep the larger / smaller value

    • 'sum' – add values together

    • 'count' – count overlapping geometries

    For 'sum' and 'count', each geometry contributes at most once per pixel: the connecting vertex of consecutive line segments and the overlap between a polygon’s interior fill and its all_touched boundary are deduplicated, matching rasterio’s MergeAlg.add (which yields 1 per geometry per pixel even for a self-crossing line). Individual points are not deduplicated – a MultiPoint with repeated coordinates burns once per point, also matching rasterio.

    Custom merge function (pass a callable):

    For CPU backends, pass a @ngjit-decorated function. For GPU backends (gpu=True), pass a @numba.cuda.jit(device=True) function. Signature:

    merge_fn(pixel, props, is_first) -> float64
    
    • pixel: current pixel value (fill value on first write)

    • props: 1D float64 array of property values for the geometry

    • is_first: 1 on first write to this pixel, 0 otherwise

    Warning

    On the GPU backends (gpu=True, with or without chunks) a custom callable does not use CUDA atomics. Its per-pixel update is a non-atomic read-modify-write, so when geometries overlap, several threads may update the same pixel at once and the result for those pixels is nondeterministic – it can vary between runs and need not match the CPU backend. The built-in string merges ('sum', 'count', 'min', 'max', 'first', 'last') do use atomics and stay deterministic over overlap; pass one of those if you need a stable result where geometries overlap on the GPU. Calling rasterize with a callable merge and gpu=True emits a UserWarning to this effect.

  • chunks (int or (int, int), optional) – If given, use the dask backend and split the output raster into tiles of this size (row_chunk, col_chunk). Both axes must be > 0. A single int uses the same chunk size for both axes. Combined with gpu to select dask+numpy vs dask+cupy.

  • max_pixels (int, default 1_000_000_000) – Safety cap on the resolved output size (width * height). The function raises ValueError before any host or device allocation if the cap is exceeded. Raise this explicitly when rasterizing a legitimately large grid.

  • check_crs (bool, default True) – When geometries is a GeoDataFrame with a .crs and like is a template that carries a CRS (via attrs['crs'], attrs['crs_wkt'], a spatial_ref coord, or rio.crs), compare the two and raise ValueError if they disagree. This prevents silently burning, say, an EPSG:4326 GeoDataFrame onto an EPSG:3857 template and handing back a raster mislabeled with the template CRS. The geometries are never reprojected; reproject them yourself to match the template. Pass check_crs=False to skip the comparison (the output still inherits the template CRS). The check is a no-op when either side lacks a CRS.

  • use_cuda (bool, optional) – Deprecated alias for gpu; emits a DeprecationWarning when passed. Passing both gpu=True and use_cuda raises TypeError.

Returns:

2D raster with dims ('y', 'x'). When geometries is a GeoDataFrame with a .crs and neither like nor its spatial_ref coord supplies a CRS, the geometry CRS is emitted on the output (attrs['crs'] as an EPSG int when one resolves, else attrs['crs_wkt'] as WKT).

Return type:

xr.DataArray

Examples

>>> from shapely.geometry import box
>>> result = rasterize([(box(0, 0, 5, 5), 1.0)],
...                    width=10, height=10)

>>> # Using resolution instead of width/height:
>>> result = rasterize(gdf, resolution=0.5,
...                    bounds=(0, 0, 10, 10), column='value')

>>> # Match an existing raster grid:
>>> zones = rasterize(gdf, like=elevation, column='zone')

>>> # Sum overlapping values:
>>> density = rasterize(gdf, width=100, height=100,
...                     column='pop', merge='sum', fill=0)