xrspatial.geotiff.to_geotiff#
- xrspatial.geotiff.to_geotiff(data: xr.DataArray | np.ndarray, path: str | BinaryIO, *, crs: int | str | None = None, nodata: float | int | None = None, compression: str = 'zstd', compression_level: int | None = None, tiled: bool = True, tile_size: int = 256, predictor: bool | int = False, cog: bool = False, overview_levels: list[int] | None = None, overview_resampling: str = 'mean', bigtiff: bool | None = None, gpu: bool | None = None, streaming_buffer_bytes: int = 268435456, max_z_error: float = 0.0, photometric: str | int = 'auto', allow_internal_only_jpeg: bool = False, allow_experimental_codecs: bool = False, allow_unparseable_crs: bool = False, drop_rotation: bool = False, pack: bool = False, color_ramp: str | bool | None = None, color_ramp_range: tuple[float, float] | None = None) str | BinaryIO[source]#
Write data as a GeoTIFF or Cloud Optimized GeoTIFF.
Release-contract tier (see
docs/source/reference/release_gate_geotiff.rstanddocs/source/reference/geotiff_release_contract.rst):[stable] Local-file output on an axis-aligned grid with
compressionin{'none', 'deflate', 'lzw', 'packbits', 'zstd'}; CRS / transform / nodata attrs round-trip;bigtiffauto-promotion;cog=True(the IFD-first tiled COG layout with a stable codec, covered bySUPPORTED_FEATURES['writer.cog']).[advanced] Internal overview pyramid generation (
SUPPORTED_FEATURES['writer.overviews']): theoverview_levelsandoverview_resamplingknobs and the pyramid bytes themselves. Also explicitbigtiff=True;photometric=overrides;extra_tagspass-through.[experimental] GPU dispatch via
gpu=True;compressionin{'lerc', 'jpeg2000', 'j2k', 'lz4'}behind the explicitallow_experimental_codecs=Trueopt-in;allow_unparseable_crs=True.[internal-only]
compression='jpeg'behindallow_internal_only_jpeg=True. The produced files do not round-trip through libtiff / GDAL / rasterio; the path exists for xrspatial’s own use and is not part of the externally interoperable surface.Out of scope for this release (allowed to raise): rotated / sheared write support (no
ModelTransformationTagemit path); silent mixed-metadata flattening.
See
xrspatial.geotiff.SUPPORTED_FEATURESfor the full tier map. Per-parameter tier markers below describe the tier the parameter itself carries; a parameter’s effective tier is bounded by the function-level surface above (e.g.[stable]nodatais still only stable when combined with a[stable]codec and options).Dask-backed DataArrays on the CPU path are written in streaming mode: one tile-row at a time, without materialising the full array into RAM. The per-compute budget is sized from the source chunk geometry, so a
map_overlapsource (e.g.slope/aspect) chunked taller than the tile stays withinstreaming_buffer_bytesinstead of pulling several source chunk-rows at once (#3007). COG output (cog=True) still materialises because overviews need the full array.Dask input routed to the GPU writer (auto-detected dask+cupy, or
gpu=Truewith any dask backing) also streams: each tile-row band is computed onto the device, compressed, and released before the next, with the per-compute budget capped bystreaming_buffer_bytes(issue #3166). The bound is on device memory only: the GPU writer still assembles the compressed file in host RAM before writing it out, unlike the CPU streaming path, which writes incrementally to disk. The exception iscog=True, which materialises the full array on device because overview generation needs it, and emits aGeoTIFFFallbackWarningwhen it does.Automatically dispatches to GPU compression when: -
gpu=Trueis passed, or - The input data is CuPy-backed (auto-detected)GPU write uses nvCOMP batch compression (deflate/ZSTD) and keeps the array on device. Falls back to CPU if nvCOMP is not available.
- Parameters:
data (xr.DataArray or np.ndarray) – [stable] 2D raster data.
path (str or binary file-like) – [stable for local file paths; advanced for
io.BytesIOand other in-memory file-likes] Output file path, or any object exposing awritemethod (e.g.io.BytesIO). When a file-like is passed, the encoded TIFF bytes are written to that object once assembly completes.cog=Trueand.vrtoutputs require a string path.crs (int, numpy.integer, str, or None) –
[stable for int EPSG codes; advanced for WKT/PROJ strings] EPSG code (int or numpy integer scalar), WKT string, or PROJ string. If None and data is a DataArray, tries to read from attrs (‘crs’ for EPSG, ‘crs_wkt’ for WKT).
EPSG codes are strongly preferred for interop. The WKT-only path writes
ProjectedCSType/GeographicType= 32767 with the WKT stored inGTCitationGeoKey– libgeotiff and GDAL can round-trip this but many other GeoTIFF readers treat the citation as a free-form name and lose the CRS. AUserWarningis emitted when the WKT-only path is taken.nodata (float, int, or None) – [stable] NoData value.
compression (str) –
[stable for
{'none', 'deflate', 'lzw', 'packbits', 'zstd'}; experimental for{'lerc', 'jpeg2000', 'j2k', 'lz4'}behindallow_experimental_codecs=True; internal-only for'jpeg'behindallow_internal_only_jpeg=True] Codec name. One of'none','deflate','lzw','jpeg','packbits','zstd','lz4','jpeg2000'(alias'j2k'), or'lerc'.Stable codecs (Tier 1, lossless, byte-for-byte round-trip):
'none','deflate','lzw','packbits','zstd'.Experimental codecs (Tier 3):
'lerc','jpeg2000'/'j2k','lz4'. Rejected by default; passallow_experimental_codecs=Trueto opt in. The opt-in emitsGeoTIFFFallbackWarningonce per call so the caller knows the chosen codec carries no cross-backend numerical parity claim and uneven reader support across GDAL versions.'lerc'acceptsmax_z_errorfor lossy compression with a bounded per-pixel error.Internal-only codec (Tier 4):
'jpeg'. Rejected on write by default because the encoder omits the JPEGTables tag and the produced files do not round-trip through libtiff / GDAL / rasterio. Passallow_internal_only_jpeg=Trueto opt in to the internal-reader-only path (see that parameter for details).allow_experimental_codecs=Truedoes NOT cover'jpeg': internal-only is a stricter tier than experimental, and the two flags do not collapse into one switch.compression_level (int or None) –
[stable] Compression effort level. None uses each codec’s default (6 for deflate/zstd). Valid ranges: deflate 1-9, zstd 1-22, lz4 0-16. Codecs without a level concept (lzw, packbits, jpeg) accept any value and ignore it.
Out-of-range levels raise
ValueErroron every backend, including GPU dispatch. On the GPU path the nvCOMP encoder (deflate/zstd tiles) does not expose level control: an explicit level is validated but then ignored, and aUserWarningis emitted. Tiles the GPU writer compresses through the CPU codecs honor the level. Passgpu=Falseif the exact level matters.tiled (bool) – [stable] Use tiled layout (default True). Incompatible with
cog=Truebecause the COG specification requires a tiled internal layout; passingcog=True, tiled=FalseraisesValueError.tile_size (int) – [stable] Tile size in pixels (default 256). Must be a positive multiple of 16 when
tiled=True; this is a TIFF 6 spec requirement on TileWidth and TileLength for broad reader compatibility. Ignored whentiled=False; a warning is emitted if a non-default value is passed alongside strip mode.predictor (bool or int) –
[stable] TIFF predictor. Accepted values:
False,0, or1-> no predictor.Trueor2-> horizontal differencing (good for integer data;Trueand2are exactly equivalent).3-> floating-point predictor (float dtypes only; typically gives better deflate/zstd ratios on float data than predictor 2).
cog (bool) – [stable] Write as Cloud Optimized GeoTIFF. The CPU writer emits the spec-conforming COG layout (IFD-first, tiled, internal overviews, lossless codec) covered by
SUPPORTED_FEATURES['writer.cog']. Requirestiled=True(the default): the COG specification mandates a tiled internal layout, socog=True, tiled=FalseraisesValueError. COG output also materialises the full array, because the overview pyramid needs random access to every pixel; thestreaming_buffer_byteskwarg is a no-op on this path. Customisation of the overview pyramid itself (overview_levels,overview_resampling) is tracked separately as advanced underSUPPORTED_FEATURES['writer.overviews'].overview_levels (list[int] or None) – [advanced] Overview pyramids are an optional COG feature; the decimation factors and resampling choice affect downstream analytics in ways that are not byte-for-byte reproducible across backends. Overview decimation factors relative to full resolution. Each entry must be a power-of-two integer >= 2, and the list must be strictly increasing (e.g.
[2, 4, 8]writes overviews at 1/2, 1/4 and 1/8 of the full resolution). Invalid values raiseValueError. Only used whencog=True. If None andcog=True, levels auto-generate as[2, 4, 8, ...]until the next halving would fall belowtile_size(capped at 8 levels).overview_resampling (str) – [advanced] Resampling method for overviews: ‘mean’ (default), ‘nearest’, ‘min’, ‘max’, ‘median’, ‘mode’, or ‘cubic’.
bigtiff (bool or None) – [advanced] BigTIFF uses 64-bit offsets; older readers that only speak classic TIFF cannot open the output. Force BigTIFF (64-bit offsets). None (default) auto-promotes when the estimated file size would exceed the classic-TIFF 4 GB limit. Matches the same kwarg on
_write_geotiff_gpu.gpu (bool or None) – [experimental] Requires cupy + numba CUDA, plus the optional nvCOMP / nvJPEG / nvJPEG2K libraries for codec-specific acceleration; backend parity with the CPU writer is tested for the Tier 1 codec set only. Force GPU compression. None (default) auto-detects CuPy data, including CuPy-backed dask arrays. Dask-backed input routed to the GPU writer streams one tile-row band at a time unless
cog=True(seestreaming_buffer_bytes).streaming_buffer_bytes (int) – [stable] Soft cap on bytes materialised per dask compute call when streaming a dask-backed DataArray. Defaults to 256 MB. Wide rasters whose tile-row exceeds this budget are split into horizontal segments on the CPU path. On the GPU path the cap bounds the device bytes computed per tile-row band, with a floor of one full-width tile-row (issue #3166). The kwarg is a no-op for in-memory (numpy / CuPy) input and for COG output, which materialises the full array because the overview pyramid needs it; a dask-backed
cog=TrueGPU write emits aGeoTIFFFallbackWarningwhen it materialises.max_z_error (float) – [experimental] Per-pixel error budget for LERC compression.
0.0(default) is lossless; larger values let the encoder approximate values within the bound, producing smaller files at the cost of accuracy bounded byabs(decoded - original) <= max_z_error. Only used whencompression='lerc'(which itself requiresallow_experimental_codecs=True); passing a non-zero value with any other codec raisesValueError.photometric (str or int) –
[advanced] Photometric interpretation for the TIFF Photometric tag (262).
'auto'(default) – MinIsBlack (1) for any band count. ExtraSamples for every band beyond the first is tagged0(unspecified). Multispectral rasters (e.g. R, G, B, NIR) round-trip through this default without being silently labelled as RGB+alpha. Prior versions treated any 3+ band array as RGB and the 4th band as unassociated alpha – the behaviour change is intentional.'rgb'– RGB (Photometric=2). Three colour bands; any additional bands are tagged0(unspecified).'rgba'– RGB with the 4th band tagged as unassociated alpha (TIFF ExtraSamples=2). Requires at least 4 bands.'minisblack'or'miniswhite'– grayscale; multi-band extras tagged0. Signed-integer pixel types with'miniswhite'are rejected withNotImplementedError– xrspatial has no semantically correct inversion for signed MinIsWhite and the silent passthrough that used to happen produced files that disagreed with the on-disk Photometric tag against every standards-compliant TIFF reader. Cast to an unsigned dtype or passphotometric='minisblack'.An
int– written verbatim into Photometric for advanced callers (e.g.3for Palette,5for CMYK).
A user-supplied
extra_tagsentry of(TAG_PHOTOMETRIC, ...)or(TAG_EXTRA_SAMPLES, ...)overrides the writer’s chosen value; only these two tag ids are overridable so other auto-emitted tags such asImageWidthorStripOffsetsremain protected.allow_experimental_codecs (bool) – [experimental] Opt in to the Tier 3 experimental codecs
'lerc','jpeg2000'/'j2k', and'lz4'(defaultFalse). Settingcompression=to one of those codecs without this flag raisesValueErrorwhose message names the flag. With the flag set, the write proceeds and aGeoTIFFFallbackWarningis emitted once per call so the caller knows the chosen codec carries no cross-backend numerical parity claim and uneven reader support across GDAL versions. Does NOT covercompression='jpeg': the internal-only JPEG path keeps its own dedicatedallow_internal_only_jpegflag because internal-only is a stricter tier than experimental. The kwarg is forwarded unchanged to_write_geotiff_gpuon the GPU dispatch path.allow_internal_only_jpeg (bool) – [internal-only] Opt in to the
compression='jpeg'encode path (defaultFalse). The encoder writes self-contained JFIF tiles without the TIFF JPEGTables tag (347); the file decodes through this library’s reader but not through libtiff, GDAL, or rasterio. This codec is internal-only for the release contract: it is not externally interoperable and the path exists so xrspatial can round-trip its own JPEG output. With the flag set, the write proceeds and aGeoTIFFFallbackWarningis emitted at call time. Without the flag,compression='jpeg'raisesValueError. The kwarg is forwarded unchanged to_write_geotiff_gpuon the GPU dispatch path so callers can reach the same experimental encode viato_geotiff(..., gpu=True).allow_unparseable_crs (bool) – [experimental] Opt in to writing an unvalidatable CRS string into
GTCitationGeoKey(defaultFalse). WhenFalse(the default), acrs=value that is neither an EPSG int nor a string that pyproj can resolve and is not structurally WKT (noPROJCS/GEOGCS/PROJCRS/GEOGCRSroot) raisesValueErrorinstead of landing verbatim in the citation field. Set toTrueto keep the permissive behaviour. The fail-closed default protects against files where a typo’d PROJ string or an"EPSG:4326"token on a host without pyproj produces a citation that most readers cannot interpret.drop_rotation (bool, default False) – [advanced] Opt in to writing a DataArray that carries
attrs['rotated_affine']. The reader sets that attr when called withallow_rotated=Trueon a file whoseModelTransformationTagcontains rotation, shear, or z-coupling terms. The writer does not emit aModelTransformationTag, so passing such a DataArray throughto_geotiffproduces an identity-affine output and loses the rotated mapping; a subsequentopen_geotiffround-trip cannot recover it. DefaultFalserefuses the write withValueErrorso the loss is impossible without an explicit signal.drop_rotation=Trueaccepts the loss and lets the write proceed; consumers reading the output will see an axis-aligned, non-rotated TIFF.pack (bool, default False) – [experimental] Inverse of
open_geotiff(unpack=True). Re-pack a decoded float array before writing: reverse the scale / offset recorded onattrs['scale_factor']/attrs['add_offset'], fill NaN back to the nodata sentinel, and cast to the source dtype recorded onattrs['mask_and_scale_dtype'](contract v5; the attr keeps its historical name). The recorded dtype is the on-disk one, so an integer source gets its integer dtype back and a float32 source stays float32 rather than widening to float64 (#3080). The output stores the raw packed values and keeps the SCALE / OFFSET GDAL_METADATA, so reopening it withunpack=Trueunpacks to the original values instead of scaling a second time. RaisesValueErrorfor a bare array (no attrs) or one that never went through anunpackread. When the attr is absent (arrays read before contract v5), the dtype falls back to the width of an integer-typedattrs['nodata'], or to the buffer’s own dtype when the sentinel is a float (a float sentinel is stored as a plain Python float and carries no width information). An explicitnodata=kwarg overrides the attrs sentinel as the NaN fill value, so the filled pixels always agree with the GDAL_NODATA tag the writer emits. When NaN pixels exist with no sentinel to fill them and an integer dtype must be restored, theValueErroris raised at call time for numpy-backed data; for dask-backed data the check runs per chunk inside the write’s compute (issue #3235), so the error surfaces during the write. The same timing applies to pixels whose packed value is non-finite or falls outside the packed integer dtype’s range: the cast would silently wrap them, so the write is refused withValueErrorinstead (issue #3260). The write itself stays atomic (temp file plus rename), so no partial output is left at the destination path.color_ramp (str, bool, or None, default None) – [advanced] Write best-practice symbology sidecars so a continuous single-band raster opens in QGIS with a color ramp instead of a flat grayscale stretch. Pass a ramp name (
'viridis'– the default –'plasma','magma','inferno','cividis','greys','spectral','terrain') orTruefor viridis; an unknown name raisesValueError.'greys'follows matplotlib’s light-to-dark orientation (low values render light). Two sidecars are written: a QGIS.qmlstyle (<base>.qml) with asinglebandpseudocolorrenderer, andSTATISTICS_MINIMUM/MAXIMUM/MEAN/STDDEVin the PAM<file>.aux.xml. No-op for a categorical raster (one withattrs['category_names']– those get the RAT sidecar instead), a multiband array, a file-like destination, or data with no finite values. Computing the statistics is a separate reduction pass over the data; for a dask source that means reading the graph once more (seecolor_ramp_rangeto skip it). Ignored whenpack=True, whose on-disk packed values would not match a ramp built from the logical values.color_ramp_range (tuple of (float, float) or None, default None) – [advanced] Explicit
(min, max)for thecolor_rampstretch. Skips the statistics reduction – useful for large dask graphs – so onlySTATISTICS_MINIMUM/STATISTICS_MAXIMUMare written (mean/stddev need the pass it avoids). Ignored whencolor_rampis not set.
- Returns:
The
pathargument (a string for filesystem paths, the file-like object for BytesIO destinations). Returning the path lines up with_build_vrtand lets callers chain a write into a read without round-tripping through a variable; existing callers that discarded the previousNonereturn are unaffected.- Return type:
str or binary file-like
- Raises:
ValueError – If
data.attrs['transform']is a rotated or skewed affine (b != 0ord != 0in rasterioAffineordering). The on-disk GeoTIFF is axis-aligned; reproject onto an axis-aligned grid first.ValueError – If
data.attrs['rotated_affine']is set anddrop_rotationis False. The reader sets that attr on theallow_rotated=Truepath; the writer cannot emit aModelTransformationTagso writing would silently lose the rotation. Passdrop_rotation=Trueto accept the loss explicitly.ValueError – If
datais a 3D DataArray whosedimsis not(band, y, x)or(y, x, band)(accepting the band-name aliasesbands/channeland spatial-name aliaseslat/lon/latitude/longitude/row/col). A leading non-band dim such astimeis rejected because the writer cannot infer the band axis from arbitrary names and used to silently treat the leading axis asy.
Examples
Write a DataArray to a plain GeoTIFF and read it back:
>>> from xrspatial.geotiff import open_geotiff, to_geotiff >>> to_geotiff(data, 'elevation.tif') >>> da = open_geotiff('elevation.tif')
Write a Cloud Optimized GeoTIFF (tiled, with internal overviews):
>>> to_geotiff(data, 'elevation_cog.tif', cog=True)
Write a VRT mosaic. A
.vrtoutput path tiles the array and emits the index that references the tiles:>>> to_geotiff(data, 'mosaic.vrt')