xrspatial.rasterize.rasterize#
- xrspatial.rasterize.rasterize(geometries: Any, width: int | None = None, height: int | None = None, bounds: Tuple[float, float, float, float] | None = None, column: str | None = None, columns: Sequence[str] | None = None, fill: float = nan, dtype: str | dtype | type | None = None, all_touched: bool = False, gpu: bool = False, name: str = 'rasterize', resolution: float | Tuple[float, float] | None = None, like: DataArray | None = None, merge: str | Callable = 'last', chunks: int | Tuple[int, int] | None = None, max_pixels: int = 1000000000, check_crs: bool = True, use_cuda: bool | None = None) DataArray[source]#
Rasterize vector geometries into a 2D DataArray.
Converts geometries from a GeoDataFrame or a list of
(geometry, value)pairs into a regularly gridded raster. No GDAL dependency.Supported geometry types:
Polygon / MultiPolygon – scanline fill
LineString / LinearRing / MultiLineString – Bresenham line rasterization (a LinearRing burns its closed boundary)
Point / MultiPoint – direct pixel burn
GeometryCollection – recursively unpacked
A non-empty geometry of any other type is dropped with a warning rather than silently discarded. Likewise, a geometry containing non-finite coordinates (NaN or infinity) has no defined location on the raster grid and is dropped with a warning.
- Parameters:
geometries (GeoDataFrame or iterable of (geometry, value)) – Input vector data. If a GeoDataFrame, the
columnparameter selects which attribute to burn into the raster (defaults to the first numeric column). If an iterable, each element must be a(shapely.geometry, numeric_value)pair.width (int, optional) – Number of columns in the output raster. Required unless
resolutionorlikeis given. Must be passed together withheight: a partial override (only one of the two) is rejected withValueErrorrather than silently filling the missing dimension fromlikeorresolution.height (int, optional) – Number of rows in the output raster. Required unless
resolutionorlikeis given. Must be passed together withwidth(see above).bounds (tuple of (xmin, ymin, xmax, ymax), optional) – Geographic extent of the output raster. Inferred from the geometries (or
like) if omitted.column (str, optional) –
Name of the GeoDataFrame column whose values are burned into the raster. Ignored when
geometriesis a list of pairs. Mutually exclusive withcolumns.A string or categorical column is label-encoded: each distinct label gets an integer code
0..N-1and the result is anint32band with a-1nodata sentinel (unlessdtype/fillare passed explicitly). Plain string/object columns are ordered lexically; an existing pandasCategoricalkeeps its declared order. The label map is stored on the result asattrs['category_names'](index == pixel code) plus an auto-generatedattrs['category_colors'](one RGBA per class).to_geotiffwrites these to a PAM<file>.aux.xmlsidecar so GDAL/QGIS display the class names, andopen_geotiffreads them back.columns (list of str, optional) – Names of multiple GeoDataFrame columns to pass as a properties array to the merge function. Mutually exclusive with
column. When given, the merge function receives a 1D float64 array of lengthlen(columns)as itspropsargument.fill (float, default np.nan) – Value for pixels not covered by any geometry. When
dtyperesolves to an integer or boolean type (either via thedtypeargument or vialike), a fill the dtype cannot represent exactly is rejected withValueError. This covers NaN into an integer dtype (which would land on a platform sentinel), an out-of-range integer (fill=-9999intouint8wraps to 241), and any non-False value intobool(which collapses toTrue). Without the guard the burned array and its emittednodata/_FillValueattrs would disagree. Pass a fill the dtype can hold (e.g.fill=0orfill=-9999for integers,fill=Falsefor bool) or use a floating dtype.dtype (numpy dtype, dtype name, or scalar type, optional) – Data type of the output array. Accepts anything
np.dtype()can parse: annp.dtypeinstance, a dtype name string ('int32'), or a numpy scalar type (np.int32). Defaults to np.float64, or to the dtype oflikeif provided. When this resolves to an integer type, burn values are validated against the float64 safe integer range: the rasterizer computes in float64, so a value with magnitude above2**53 - 1cannot be cast back to an exact integer (e.g.2**53 + 1would land on2**53). Such a value is rejected withValueErrorrather than silently rounded; use a floating dtype to burn identifiers larger than that. Non-finite burn values (NaN, inf) are likewise rejected for integer and bool dtypes – the final cast would otherwise land them on a platform sentinel (NaN becomes-2147483648for int32,Truefor bool).merge='count'is exempt because it burns overlap counts, never the property values.all_touched (bool, default False) – If True, every pixel a geometry passes through is burned, using a supercover (Amanatides & Woo) grid traversal: polygon boundaries in addition to the normal center-fill, and every cell a line crosses rather than only the Bresenham path. Output matches
rasterio.features.rasterize(..., all_touched=True)pixel-for-pixel up to rasterization tie-breaking on shared edges and exact cell-corner crossings. If False, only pixels whose centers fall inside a polygon (or on the Bresenham line) are burned.gpu (bool, default False) – If True, use the CuPy/CUDA backend. Same convention as
open_geotiff(gpu=True); combine withchunksfor the dask+cupy backend.name (str, default 'rasterize') – Name for the output DataArray.
resolution (float or (x_res, y_res), optional) – Pixel size. Must be finite and
> 0. When given withbounds, computeswidthandheightautomatically. A single float uses the same resolution for both axes. The requested cell size is honored exactly. When the bounds don’t divide evenly by the resolution, the output grid extends past the originalxmax/ymin(anchored atxmin/ymax) rather than shrinking the cell to fit. Matches therasterio.transform.from_originconvention.like (xr.DataArray, optional) – Template raster. Width, height, bounds, and dtype are copied from this array. Bounds and dtype can be overridden one at a time; width and height must be overridden together (passing only one raises
ValueError). Must have uniformly spacedxandydim coords – the rasterizer only writes to a regular grid, so a non-uniformlikeis rejected withValueErrorrather than silently producing pixel labels that don’t match the data. Both descending (top-down, ymax first) and ascending (bottom-up, ymin first) y coords are supported – the burned rows are flipped to match soresult.sel(y=...)lines up with the geometry in world coordinates either way, andresult.yalways equalslike.yexactly.merge (str or callable, default 'last') –
How to combine values when geometries overlap.
Built-in modes (pass as string):
'last'– last geometry in input order wins'first'– first geometry wins'max'/'min'– keep the larger / smaller value'sum'– add values together'count'– count overlapping geometries
For
'sum'and'count', each geometry contributes at most once per pixel: the connecting vertex of consecutive line segments and the overlap between a polygon’s interior fill and itsall_touchedboundary are deduplicated, matching rasterio’sMergeAlg.add(which yields 1 per geometry per pixel even for a self-crossing line). Individual points are not deduplicated – a MultiPoint with repeated coordinates burns once per point, also matching rasterio.Custom merge function (pass a callable):
For CPU backends, pass a
@ngjit-decorated function. For GPU backends (gpu=True), pass a@numba.cuda.jit(device=True)function. Signature:merge_fn(pixel, props, is_first) -> float64
pixel: current pixel value (fill value on first write)
props: 1D float64 array of property values for the geometry
is_first: 1 on first write to this pixel, 0 otherwise
Warning
On the GPU backends (
gpu=True, with or withoutchunks) a custom callable does not use CUDA atomics. Its per-pixel update is a non-atomic read-modify-write, so when geometries overlap, several threads may update the same pixel at once and the result for those pixels is nondeterministic – it can vary between runs and need not match the CPU backend. The built-in string merges ('sum','count','min','max','first','last') do use atomics and stay deterministic over overlap; pass one of those if you need a stable result where geometries overlap on the GPU. Callingrasterizewith a callablemergeandgpu=Trueemits aUserWarningto this effect.chunks (int or (int, int), optional) – If given, use the dask backend and split the output raster into tiles of this size
(row_chunk, col_chunk). Both axes must be> 0. A single int uses the same chunk size for both axes. Combined withgputo select dask+numpy vs dask+cupy.max_pixels (int, default 1_000_000_000) – Safety cap on the resolved output size (
width * height). The function raisesValueErrorbefore any host or device allocation if the cap is exceeded. Raise this explicitly when rasterizing a legitimately large grid.check_crs (bool, default True) – When
geometriesis a GeoDataFrame with a.crsandlikeis a template that carries a CRS (viaattrs['crs'],attrs['crs_wkt'], aspatial_refcoord, orrio.crs), compare the two and raiseValueErrorif they disagree. This prevents silently burning, say, an EPSG:4326 GeoDataFrame onto an EPSG:3857 template and handing back a raster mislabeled with the template CRS. The geometries are never reprojected; reproject them yourself to match the template. Passcheck_crs=Falseto skip the comparison (the output still inherits the template CRS). The check is a no-op when either side lacks a CRS.use_cuda (bool, optional) – Deprecated alias for
gpu; emits aDeprecationWarningwhen passed. Passing bothgpu=Trueanduse_cudaraisesTypeError.
- Returns:
2D raster with dims
('y', 'x'). Whengeometriesis a GeoDataFrame with a.crsand neitherlikenor itsspatial_refcoord supplies a CRS, the geometry CRS is emitted on the output (attrs['crs']as an EPSG int when one resolves, elseattrs['crs_wkt']as WKT).- Return type:
xr.DataArray
Examples
>>> from shapely.geometry import box >>> result = rasterize([(box(0, 0, 5, 5), 1.0)], ... width=10, height=10) >>> # Using resolution instead of width/height: >>> result = rasterize(gdf, resolution=0.5, ... bounds=(0, 0, 10, 10), column='value') >>> # Match an existing raster grid: >>> zones = rasterize(gdf, like=elevation, column='zone') >>> # Sum overlapping values: >>> density = rasterize(gdf, width=100, height=100, ... column='pop', merge='sum', fill=0)