Updating golden-corpus baselines#

The geotiff golden corpus (issue #1930) pins every backend to a rasterio reference read for every fixture. The corpus lives at xrspatial/geotiff/tests/golden_corpus/ and is generated deterministically from manifest.yaml. The committed .tif files under fixtures/ are byte-stable across regenerations; this is what makes “pixel parity” mean something.

This page is for the case where a code change legitimately changes the pixels a backend produces. The corpus is intentionally rigid – any unintended change shows up as a test failure – so blessing a change is an explicit step.

Note

This page describes the system as of the full Phase 3 (backend wiring) and Phase 4.1 (fast / slow split) PR set. File and symbol references (_marks.py, _PARITY_GAPS, the per-backend test modules) assume those PRs have merged. If one is still in flight, check the matching open PR before chasing a missing file.

When a corpus test fails#

Failure modes you might see:

  1. compare_to_oracle raises AssertionError: pixel arrays differ. Either the reader changed how it decodes the fixture, or the fixture bytes changed. Compare the fixture’s md5sum against the value from main; if it shifted, the fixture moved.

  2. compare_to_oracle raises AssertionError: dtype mismatch or transform mismatch. The fixture is intact but the reader changed how it surfaces an attr. Check whether the attr is in the oracle’s canonical-attr set (today: crs / transform / nodata / dtype); if so, the canonical-attrs contract is what changed.

  3. An xfail(strict=True) test flips to XPASS. This is the “regression-of-regressions” surface: a parity gap that was previously documented in _PARITY_GAPS is now passing the oracle. Remove the entry from the _PARITY_GAPS dict and the matching docstring bullet, and merge.

Blessing a pixel-level change#

If a code change makes a backend produce different pixels and the new output is correct, the corpus baseline needs updating. The workflow:

  1. Confirm the change is intentional. Read the failing test output and the diff. If you cannot articulate why the new bytes are correct in one sentence, the change is probably a bug, not a baseline refresh.

  2. Regenerate any affected fixtures. From the repository root:

    python -m xrspatial.geotiff.tests.golden_corpus.generate
    

    Or, for one fixture by id (one-line invocation; portable across POSIX shells and cmd.exe):

    python -m xrspatial.geotiff.tests.golden_corpus.generate --only <fixture-id>
    

    The generator is deterministic. If two runs produce different bytes for the same fixture, that is the bug – file an issue rather than committing the unstable output.

  3. Inspect the changed bytes. git diff --stat on the fixtures/ tree shows which files moved. rasterio info <path> on the old vs new fixture is a low-effort sanity check that the change is what you expected (e.g. tile layout, compression).

  4. Run the full corpus matrix. All backend modules under xrspatial/geotiff/tests/golden_corpus/test_*.py should pass against the new baseline. pytest runs them all; pytest -m "not slow" runs the PR fast lane only.

  5. Commit the manifest change and the regenerated bytes together. The PR description must answer: what changed, why is the new value correct, and what code change made it correct. A baseline refresh PR without a one-paragraph justification will be sent back.

Adding a new fixture#

Adding a fixture is cheaper than blessing a baseline change because no existing test is asserting against the new bytes yet:

  1. Add the entry to fixtures: in xrspatial/geotiff/tests/golden_corpus/manifest.yaml. Re-use the defaults block; only override what is interesting for the new axis you are exercising. The schema is documented at the top of the manifest file and enforced by generate.validate().

  2. Run python -m xrspatial.geotiff.tests.golden_corpus.generate --only <new-id>. Confirm the file lands under fixtures/ and stays under the 12 KB per-fixture budget (the smoke tests enforce this).

  3. If the new fixture exercises a backend gap that is not in any of the per-module _PARITY_GAPS tables, expect the parametrised test to fail. Either fix the gap or add the fixture id to _PARITY_GAPS with a reason and a follow-up trail. Do not just merge a failing test.

Fast / slow split#

Each fixture’s tags: list controls which lane it runs in. A fixture is fast iff "fast" is in its tags; everything else picks up pytest.mark.slow via the helper in xrspatial/geotiff/tests/golden_corpus/_marks.py. See the corpus README.md next to that helper for a shorter version of this same table aimed at someone editing the manifest directly.

  • pytest: runs every cell.

  • pytest -m "not slow": PR fast lane.

  • pytest -m slow: only the slow cells; useful for a nightly job.

When adding a fixture that decodes in well under a second, include fast in its tags. Reserve the slow lane for fixtures that are genuinely heavy (large COGs, jpeg2000, multi-source VRTs).

Reference points#

  • Issue #1930 – the tracking issue, including the phase plan.

  • xrspatial/geotiff/tests/golden_corpus/_oracle.py – the oracle contract. compare_to_oracle(path, candidate, lossy=False) is the single function every backend test calls; it asserts pixels (bit- exact or shape-only when lossy), dtype, transform, CRS, and nodata.

  • xrspatial/geotiff/tests/golden_corpus/manifest.yaml – the schema-documented list of fixtures. Edit here, regenerate, commit.