From 2652d9e6da41f92f3dab1a6eb7eb6bed0f817cb3 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett
Date: Fri, 29 May 2026 10:23:50 +0200
Subject: [PATCH 01/25] docs: spec for closing the silently-unexecuted docs-block gap (#4016)

Design for fixing issue #4016 (invalid create_array(mode="w") in docs) and
preventing recurrence: per-case remediation of the 12 non-executed python
blocks (S3 via moto, config blocks via exec="true", GPU via the gpu marker,
explicit opt-out for non-Python blocks) plus a guard test asserting every
docs python block is either executed or explicitly opted out with a reason.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 ...2026-05-29-docs-block-validation-design.md | 182 ++++++++++++++++++
 1 file changed, 182 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-05-29-docs-block-validation-design.md

diff --git a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
new file mode 100644
index 0000000000..db627d7528
--- /dev/null
+++ b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
@@ -0,0 +1,182 @@
+# Design: Close the "silently-unexecuted docs block" gap
+
+**Date:** 2026-05-29
+**Issue:** [zarr-developers/zarr-python#4016](https://github.com/zarr-developers/zarr-python/issues/4016)
+
+## Problem & root cause
+
+Issue #4016 reports invalid code in the docs:
+
+```python
+z = zarr.create_array("s3://example-bucket/foo", mode="w", shape=(100, 100), chunks=(10, 10), dtype="f4")
+```
+
+`create_array` has no `mode` parameter, so this raises `TypeError: unexpected keyword
+argument 'mode'`. The code was wrong because **nothing validated it**: it is a bare
+` ```python ` block, and both the renderer (`markdown-exec`) and the test suite
+(`tests/test_docs.py`, which filters on `settings.get("exec") != "true"`) only act on
+blocks tagged `exec="true"`. Omitting that attribute is a *silent* opt-out from all
+validation.
+
+This is not a one-off.
+An audit of all docs found **12 of 180** python blocks
+unexecuted, including a second instance of the same failure mode:
+`docs/contributing.md:231` is tagged `exec="on"` (a typo for `"true"`), so a block
+meant to run silently does not.
+
+**Root cause:** validation is opt-in via an easily-mistyped, easily-omitted attribute,
+with no signal when a block opts out.
+
+### Audit of the 12 bare blocks
+
+| Block | Why bare | Disposition |
+|---|---|---|
+| `docs/quick-start.md:134` (S3) | hits real S3 | **Execute against moto** mock-S3 |
+| `docs/user-guide/gpu.md:19` | needs cupy + GPU | **Execute, gated on `gpu` marker** (runs in `gputest` env) |
+| `docs/user-guide/performance.md:207` | left bare | **`exec="true"`** (plain `zarr.config.set`) |
+| `docs/user-guide/performance.md:237` | left bare | **`exec="true"`** |
+| `docs/user-guide/performance.md:263` | left bare | **`exec="true"`** (uses dask + a local array path) |
+| `docs/user-guide/arrays.md:622` | left bare | **`exec="true"`** (`zarr.config.set`) |
+| `docs/user-guide/cli.md:48` | left bare | **`exec="true"`** (`zarr.open`; needs a runnable path/store) |
+| `docs/contributing.md:231` | **`exec="on"` typo** | **Fix typo** → `exec="true"` |
+| `docs/contributing.md:15` | pseudocode (`# etc.`) | **Explicit opt-out** + reason |
+| `docs/user-guide/data_types.md:363` | REPL transcript (``) | **Explicit opt-out** + reason |
+| `docs/user-guide/examples/custom_dtype.md:5` | `--8<--` file include | **Explicit opt-out** + reason |
+| `docs/user-guide/v3_migration.md:42` | intentionally-wrong import | **Explicit opt-out** + reason |
+
+(`performance.md:263` and `cli.md:48` need a small adjustment — a memory store or a
+real local path — to be runnable; confirm during implementation.)
+
+## Approach
+
+Two complementary parts.
+
+### Part A — Per-case remediation of the 12 bare blocks
+
+Not one mechanism — a triage.
+Each block gets the treatment that fits *why* it is not
+executing:
+
+- **Make executable against fakes** — the S3 example. Reuse the repo's existing `moto`
+  mock-S3 pattern from `tests/test_store/test_fsspec.py` so the block runs for real in
+  CI with no real-cloud contact. Execution validates the whole write path, not just the
+  signature; `mode="w"` dies by construction.
+- **Just turn on** — the config/open blocks (`performance.md` ×3, `arrays.md:622`,
+  `cli.md:48`) are plain runnable API calls; flip them to `exec="true"`.
+- **Fix the typo** — `contributing.md:231` `exec="on"` → `exec="true"`.
+- **Execute, env-gated** — the GPU block. It *can* run, but only in the `gputest` env
+  (cupy + GPU hardware), not the default `doctest` env. See "Env-gated execution".
+- **Explicit opt-out** — blocks that genuinely cannot run anywhere and are not
+  executable Python: REPL transcript, `--8<--` include, intentionally-wrong import,
+  pseudocode. These get a *documented, greppable* opt-out marker carrying a reason.
+
+### Part B — A guard test
+
+So the gap cannot silently reopen: every python block in `docs/` must either be
+`exec="true"` *or* carry the explicit opt-out marker with a reason. A bare or
+mistyped block fails the guard. This would have caught both `mode="w"` and the
+`exec="on"` typo.
+
+### Dropped from scope
+
+The type-checking / markdown-extractor machinery considered earlier. Execution-against-
+fakes strictly dominates type-checking for the cloud case (and the untyped `s3fs`/`cupy`
+imports make strict type-checking least clean exactly where it was wanted most), and the
+guard handles everything else. Proportionate to ~7 genuinely-affected blocks.
+
+## Key insight: doctests are already pytest tests
+
+`tests/test_docs.py::test_documentation_examples` is an ordinary `@pytest.mark.parametrize`d
+pytest test — one case per `(file, session)`. It is not a separate doctest mechanism.
+Therefore everything pytest already provides for gating tests (markers, `-m` selection,
+skips) is available; the design uses it rather than inventing harness concepts.
+
+There are two distinct executors of docs blocks, and conflating them is what made
+env-gating look hard:
+
+- **`markdown-exec` at docs-build time** — runs blocks to render output into the
+  published site. Build runners have no cupy, so a GPU block must render as static
+  source here (no build-time execution).
+- **`tests/test_docs.py` at test time** — the validation. This is pytest, and this is
+  where markers live and where env-gating happens.
+
+## Env-gated execution
+
+A block declares the pytest marker it needs via a **fence attribute**, e.g.:
+
+````
+```python exec="true" markers="gpu"
+````
+
+`group_examples_by_session()` parses `markers=` and emits
+`pytest.param(session_key, marks=pytest.mark.gpu)`. Then:
+
+- Default `doctest` env runs `pytest` → the gpu-marked param is **skipped/deselected**,
+  exactly like every other `gpu`-marked test in `tests/`.
+- The `gputest` env runs `pytest -m gpu` → the param **executes** against real cupy.
+
+This reuses the existing `gpu` marker (`pyproject.toml`, `markers` table) and the existing
+`pytest -m gpu` selection — no new harness concept.
+
+## Components & data flow
+
+**`docs/` markdown** — source of truth. Each python block is in one of three declared
+states:
+
+1. `exec="true"` (optionally `+ markers="<name>"`) — executed as a test.
+2. explicit opt-out marker **with a reason** — deliberately not executed.
+3. anything else (bare, `exec="on"`, …) — **illegal**, fails the guard.
+
+The exact spelling of the opt-out marker (e.g. `exec="false"` plus a `reason="..."`
+attribute, versus a dedicated sentinel attribute) is an implementation-plan decision.
+Requirement: it must be explicit, greppable, carry a human-readable reason, and be a
+form `markdown-exec` will not execute at build time.
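The three declared states above can be sketched as a small classifier over a fence's info string. This is a minimal illustration, not the guard the plan implements: `parse_fence_attrs` and `classify_block` are hypothetical names, and the opt-out spelling (`exec="false"` plus a non-empty `reason`) is assumed here as one of the options the spec leaves open.

```python
import shlex


def parse_fence_attrs(info: str) -> dict[str, str]:
    """Parse the text after ```python into key/value attributes.

    shlex handles the quoting, so reason="REPL transcript" stays one token.
    """
    attrs: dict[str, str] = {}
    for token in shlex.split(info):
        key, sep, value = token.partition("=")
        if sep:  # ignore bare words that are not key=value pairs
            attrs[key] = value
    return attrs


def classify_block(info: str) -> str:
    """Map a fence info string to one of the three declared states."""
    attrs = parse_fence_attrs(info)
    if attrs.get("exec") == "true":
        return "executed"
    # Assumed opt-out spelling: exec="false" with a human-readable reason.
    if attrs.get("exec") == "false" and attrs.get("reason", "").strip():
        return "opted-out"
    # Bare blocks, exec="on", or an opt-out without a reason all land here.
    return "illegal"
```

Under this sketch a guard reduces to asserting that no block classifies as `"illegal"`; a typo such as `exec="on"` fails the same way a bare block does.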
+
+**`tests/test_docs.py`** — already-parametrized pytest harness. Changes:
+
+- `group_examples_by_session()` parses the `markers=` attribute and emits
+  `pytest.param(..., marks=pytest.mark.<name>)` so env-gating rides existing marker machinery.
+- New guard test `test_no_unvalidated_blocks` — walks every python block in `docs/`,
+  asserts each is `exec="true"` or carries the explicit opt-out marker. Fails on
+  bare/typo'd blocks.
+
+**`docs/quick-start.md` S3 session** — a hidden setup block (`exec="true"`, no `source=`,
+matching the existing setup block at `quick-start.md:8`) starts a `moto` server and
+registers a default endpoint so the *visible* `create_array("s3://...")` block runs
+against the fake. Pattern lifted from `tests/test_store/test_fsspec.py`.
+
+## Risks & spikes (resolve during implementation; do not guess)
+
+1. **Default S3 endpoint without `storage_options`.** Existing tests always pass
+   `endpoint_url` explicitly (`test_fsspec.py:131`). Confirm a setup block can register
+   a *process-wide* default endpoint (via `fsspec.config` or `AWS_ENDPOINT_URL`) so the
+   visible `create_array("s3://...")` works clean. **Fallback:** show the honest
+   `storage_options={"endpoint_url": ...}` form in the visible block.
+
+2. **`markdown-exec` + unknown `markers=` attribute.** Confirm the build-time renderer
+   ignores `markers=` (or is told to), and decide how a `gpu` block renders in the
+   published site without cupy (render source only, no build-time execution).
+   **Fallback:** a per-session marker map in `test_docs.py`, keeping markdown untouched.
+
+3. **moto teardown in the docs session.** `s3fs`/`aiobotocore` finalizers are known to be
+   noisy at teardown (see the filterwarnings note in `pyproject.toml`). Ensure the docs
+   session's moto server starts/stops cleanly without leaking into other sessions.
+
+## Testing the change
+
+- Guard test is self-validating: after remediation, the full docs suite passes with zero
+  bare/typo'd blocks.
+- Negative check: temporarily introduce a bare block, confirm the guard fails, remove it.
+- S3 block: `hatch run doctest:test` runs it green against moto.
+- GPU block: `pytest -m gpu` in `gputest` executes it; the default `doctest` run reports
+  it **skipped**, not absent.
+
+## Out of scope
+
+- Type-checking machinery / markdown extractor.
+- The 168 already-executing blocks.
+- Broad docs rewrites beyond the 12 bare blocks.
+
+## Upstream
+
+A separate GitHub issue will capture the root-cause framing (silent opt-out hides bugs;
+`mode="w"` and `exec="on"` as two instances) and the Part B guard proposal for community
+discussion, independent of the immediate fix.

From 33590eba2dc0a66cfe34f3616365733207a695cf Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett
Date: Fri, 29 May 2026 10:33:00 +0200
Subject: [PATCH 02/25] docs: unify docs-block marker model (s3 + gpu) in #4016 spec

Replace the bespoke hidden-moto-setup-block approach for the S3 example with
a marker-bound model: a block declares markers="s3"/"gpu" on the fence, and
the harness binds each marker to the infra/env it needs (s3 -> moto fixture
in the default doctest env; gpu -> gputest env via pytest -m gpu). Symmetric
declaration; the asymmetry is only in what each marker resolves to.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 ...2026-05-29-docs-block-validation-design.md | 98 ++++++++++++-------
 1 file changed, 62 insertions(+), 36 deletions(-)

diff --git a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
index db627d7528..f46b351856 100644
--- a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
+++ b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
@@ -30,8 +30,8 @@ with no signal when a block opts out.
 
 | Block | Why bare | Disposition |
 |---|---|---|
-| `docs/quick-start.md:134` (S3) | hits real S3 | **Execute against moto** mock-S3 |
-| `docs/user-guide/gpu.md:19` | needs cupy + GPU | **Execute, gated on `gpu` marker** (runs in `gputest` env) |
+| `docs/quick-start.md:134` (S3) | hits real S3 | **Execute, `markers="s3"`** (moto infra, default doctest env) |
+| `docs/user-guide/gpu.md:19` | needs cupy + GPU | **Execute, `markers="gpu"`** (runs in `gputest` env) |
 | `docs/user-guide/performance.md:207` | left bare | **`exec="true"`** (plain `zarr.config.set`) |
 | `docs/user-guide/performance.md:237` | left bare | **`exec="true"`** |
 | `docs/user-guide/performance.md:263` | left bare | **`exec="true"`** (uses dask + a local array path) |
@@ -55,15 +55,17 @@ Two complementary parts.
 Not one mechanism — a triage. Each block gets the treatment that fits *why* it is not
 executing:
 
-- **Make executable against fakes** — the S3 example. Reuse the repo's existing `moto`
-  mock-S3 pattern from `tests/test_store/test_fsspec.py` so the block runs for real in
-  CI with no real-cloud contact. Execution validates the whole write path, not just the
-  signature; `mode="w"` dies by construction.
+- **Make executable against fakes** — the S3 example, via `markers="s3"`. The marker
+  binds the block to the repo's existing `moto` mock-S3 infra (pattern from
+  `tests/test_store/test_fsspec.py`) so it runs for real in CI with no real-cloud
+  contact. Execution validates the whole write path, not just the signature; `mode="w"`
+  dies by construction. See "Marker-bound execution".
 - **Just turn on** — the config/open blocks (`performance.md` ×3, `arrays.md:622`,
   `cli.md:48`) are plain runnable API calls; flip them to `exec="true"`.
 - **Fix the typo** — `contributing.md:231` `exec="on"` → `exec="true"`.
-- **Execute, env-gated** — the GPU block. It *can* run, but only in the `gputest` env
-  (cupy + GPU hardware), not the default `doctest` env. See "Env-gated execution".
+- **Execute, env-gated** — the GPU block, via `markers="gpu"`. It *can* run, but only in
+  the `gputest` env (cupy + GPU hardware), not the default `doctest` env. See
+  "Marker-bound execution".
 - **Explicit opt-out** — blocks that genuinely cannot run anywhere and are not
   executable Python: REPL transcript, `--8<--` include, intentionally-wrong import,
   pseudocode. These get a *documented, greppable* opt-out marker carrying a reason.
@@ -90,31 +92,45 @@ Therefore everything pytest already provides for gating tests (markers, `-m` sel
 skips) is available; the design uses it rather than inventing harness concepts.
 
 There are two distinct executors of docs blocks, and conflating them is what made
-env-gating look hard:
+marker-bound execution look hard:
 
 - **`markdown-exec` at docs-build time** — runs blocks to render output into the
-  published site. Build runners have no cupy, so a GPU block must render as static
-  source here (no build-time execution).
+  published site. Build runners have no cupy (and the S3 setup is test infra), so a
+  marker-bound block must render as static source here (no build-time execution).
 - **`tests/test_docs.py` at test time** — the validation. This is pytest, and this is
-  where markers live and where env-gating happens.
+  where markers live, where infra fixtures bind, and where env-gating happens.
 
-## Env-gated execution
+## Marker-bound execution
 
 A block declares the pytest marker it needs via a **fence attribute**, e.g.:
 
 ````
 ```python exec="true" markers="gpu"
+```python exec="true" markers="s3"
 ````
 
 `group_examples_by_session()` parses `markers=` and emits
-`pytest.param(session_key, marks=pytest.mark.gpu)`. Then:
-
-- Default `doctest` env runs `pytest` → the gpu-marked param is **skipped/deselected**,
-  exactly like every other `gpu`-marked test in `tests/`.
-- The `gputest` env runs `pytest -m gpu` → the param **executes** against real cupy.
-
-This reuses the existing `gpu` marker (`pyproject.toml`, `markers` table) and the existing
-`pytest -m gpu` selection — no new harness concept.
+`pytest.param(session_key, marks=pytest.mark.<name>)`. The marker then **binds the case to
+whatever that marker means** — and the two markers mean different things, which is the
+point of unifying the model rather than special-casing each:
+
+- **`gpu` — env-gate.** Default `doctest` env runs `pytest` → the gpu-marked param is
+  **skipped/deselected**, exactly like every other `gpu`-marked test in `tests/`. The
+  `gputest` env runs `pytest -m gpu` → the param **executes** against real cupy. Reuses
+  the existing `gpu` marker (`pyproject.toml` `markers` table) and `pytest -m gpu`
+  selection — no new harness concept.
+
+- **`s3` — infra-binding.** A new `s3` marker (must be registered in the `markers`
+  table). An autouse-style fixture keyed on the marker stands up the `moto` server and
+  registers a default endpoint, so an `s3`-marked docs case runs against the fake S3
+  with no real-cloud contact. Because the infra is just pip deps already present in the
+  `doctest` env (`s3fs`, `moto[s3,server]`), the case **runs in the default doctest
+  run** — the marker binds infra, it does not gate the case out. The moto/endpoint
+  plumbing lives in named pytest fixtures, not a hidden markdown setup block.
+
+Both blocks therefore follow one rule: *declare the marker; the harness binds the marker
+to the infra/env it needs.* The asymmetry is in what each marker resolves to (gpu →
+hardware env, s3 → fixture), not in the declaration mechanism.
 
 ## Components & data flow
 
@@ -133,39 +149,49 @@ form `markdown-exec` will not execute at build time.
 
 **`tests/test_docs.py`** — already-parametrized pytest harness. Changes:
 
 - `group_examples_by_session()` parses the `markers=` attribute and emits
-  `pytest.param(..., marks=pytest.mark.<name>)` so env-gating rides existing marker machinery.
+  `pytest.param(..., marks=pytest.mark.<name>)` so marker-binding rides existing marker
+  machinery.
+- A marker-keyed fixture for `s3` that stands up the `moto` server and registers a
+  default endpoint (pattern lifted from `tests/test_store/test_fsspec.py`), applied to
+  `s3`-marked docs cases.
 - New guard test `test_no_unvalidated_blocks` — walks every python block in `docs/`,
   asserts each is `exec="true"` or carries the explicit opt-out marker. Fails on
   bare/typo'd blocks.
 
-**`docs/quick-start.md` S3 session** — a hidden setup block (`exec="true"`, no `source=`,
-matching the existing setup block at `quick-start.md:8`) starts a `moto` server and
-registers a default endpoint so the *visible* `create_array("s3://...")` block runs
-against the fake. Pattern lifted from `tests/test_store/test_fsspec.py`.
+**`pyproject.toml`** — register the new `s3` marker in the `markers` table (alongside
+`gpu`).
+
+**`docs/quick-start.md` S3 block** — gains `markers="s3"`. The visible code stays a clean
+`create_array("s3://...")`; the moto server and default-endpoint registration are
+supplied by the `s3` fixture, not by an in-markdown setup block.
 
 ## Risks & spikes (resolve during implementation; do not guess)
 
 1. **Default S3 endpoint without `storage_options`.** Existing tests always pass
-   `endpoint_url` explicitly (`test_fsspec.py:131`). Confirm a setup block can register
+   `endpoint_url` explicitly (`test_fsspec.py:131`). Confirm the `s3` fixture can register
    a *process-wide* default endpoint (via `fsspec.config` or `AWS_ENDPOINT_URL`) so the
-   visible `create_array("s3://...")` works clean. **Fallback:** show the honest
-   `storage_options={"endpoint_url": ...}` form in the visible block.
+   visible `create_array("s3://...")` works clean with no `storage_options`. **Fallback:**
+   show the honest `storage_options={"endpoint_url": ...}` form in the visible block.
 
 2.
    **`markdown-exec` + unknown `markers=` attribute.** Confirm the build-time renderer
-   ignores `markers=` (or is told to), and decide how a `gpu` block renders in the
-   published site without cupy (render source only, no build-time execution).
-   **Fallback:** a per-session marker map in `test_docs.py`, keeping markdown untouched.
+   ignores `markers=` (or is told to), and that marker-bound blocks render as static
+   source in the published site (render source only, no build-time execution — the build
+   has neither cupy nor the moto fixture). **Fallback:** a per-session marker map in
+   `test_docs.py`, keeping markdown untouched.
 
-3. **moto teardown in the docs session.** `s3fs`/`aiobotocore` finalizers are known to be
-   noisy at teardown (see the filterwarnings note in `pyproject.toml`). Ensure the docs
-   session's moto server starts/stops cleanly without leaking into other sessions.
+3. **moto teardown / loop affinity in the docs session.** `s3fs`/`aiobotocore` finalizers
+   are noisy at teardown and s3fs instances bind to the event loop they were created on
+   (see the filterwarnings note in `pyproject.toml` and the loop comments in
+   `test_fsspec.py`). Ensure the docs `s3` fixture starts/stops moto cleanly and does not
+   leak across sessions/tests.
 
 ## Testing the change
 
 - Guard test is self-validating: after remediation, the full docs suite passes with zero
   bare/typo'd blocks.
 - Negative check: temporarily introduce a bare block, confirm the guard fails, remove it.
-- S3 block: `hatch run doctest:test` runs it green against moto.
+- S3 block: `hatch run doctest:test` runs it green against moto in the default doctest
+  env (the `s3` marker binds the fixture; it is not gated out).
 - GPU block: `pytest -m gpu` in `gputest` executes it; the default `doctest` run reports
   it **skipped**, not absent.

From 47d20c32b24b3ed327a6c486ac0d3b86b757fbf6 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett
Date: Fri, 29 May 2026 10:39:40 +0200
Subject: [PATCH 03/25] docs: link spec to upstream issue #4017

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 .../specs/2026-05-29-docs-block-validation-design.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
index f46b351856..4e8232846f 100644
--- a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
+++ b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
@@ -203,6 +203,7 @@ supplied by the `s3` fixture, not by an in-markdown setup block.
 
 ## Upstream
 
-A separate GitHub issue will capture the root-cause framing (silent opt-out hides bugs;
-`mode="w"` and `exec="on"` as two instances) and the Part B guard proposal for community
-discussion, independent of the immediate fix.
+[zarr-developers/zarr-python#4017](https://github.com/zarr-developers/zarr-python/issues/4017)
+captures the root-cause framing (silent opt-out hides bugs; `mode="w"` and `exec="on"` as
+two instances) and the Part B guard proposal for community discussion, independent of the
+immediate fix in #4016.

From c660818b2ae58983311a2d5b8a906eacdbe7feb0 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett
Date: Fri, 29 May 2026 10:49:29 +0200
Subject: [PATCH 04/25] docs: implementation plan for docs-block validation (#4016, #4017)

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 .../plans/2026-05-29-docs-block-validation.md | 722 ++++++++++++++++++
 1 file changed, 722 insertions(+)
 create mode 100644 docs/superpowers/plans/2026-05-29-docs-block-validation.md

diff --git a/docs/superpowers/plans/2026-05-29-docs-block-validation.md b/docs/superpowers/plans/2026-05-29-docs-block-validation.md
new file mode 100644
index 0000000000..1c34652012
--- /dev/null
+++ b/docs/superpowers/plans/2026-05-29-docs-block-validation.md
@@ -0,0 +1,722 @@
+# Docs Block Validation Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Make every python code block in `docs/` either execute (and thus get validated) or explicitly opt out with a documented reason, and add a guard test so a block can never again silently opt out of validation.
+
+**Architecture:** The doctests in `tests/test_docs.py` are already parametrized pytest tests. We (1) teach the parametrizer to read a `markers="..."` fence attribute and attach the matching pytest marker to each session's `pytest.param`, (2) add an `s3` marker bound to a `moto` mock-S3 fixture so the S3 example runs in the default doctest env, (3) reuse the existing `gpu` marker for the GPU block, (4) remediate the 12 currently-unexecuted blocks per-case, and (5) add a guard test asserting every docs python block is `exec="true"` or explicitly opted out with a reason.
+
+**Tech Stack:** pytest, pytest-examples, markdown-exec (mkdocs), moto[s3,server], s3fs, hatch envs (`doctest`, `gputest`).
+
+**Upstream:** Fixes [#4016](https://github.com/zarr-developers/zarr-python/issues/4016); implements the guard from [#4017](https://github.com/zarr-developers/zarr-python/issues/4017). Design spec: `docs/superpowers/specs/2026-05-29-docs-block-validation-design.md`.
+
+---
+
+## File Structure
+
+- `tests/test_docs.py` — **modify.** Add `markers=` parsing in `group_examples_by_session()`, an `s3` fixture + marker-binding, and the new `test_no_unvalidated_blocks` guard test.
+- `pyproject.toml` — **modify.** Register the `s3` marker in `[tool.pytest.ini_options] markers`.
+- `docs/quick-start.md` — **modify.** S3 block: fix `mode="w"`, add `markers="s3"`, make it executable.
+- `docs/user-guide/performance.md` — **modify.** Turn on the two config-only blocks; opt out (or fix) the dask block.
+- `docs/user-guide/arrays.md` — **modify.** Turn on the config block.
+- `docs/user-guide/cli.md` — **modify.** Make the `zarr.open` block runnable or opt it out.
+- `docs/user-guide/gpu.md` — **modify.** Add `exec="true" markers="gpu"`.
+- `docs/contributing.md` — **modify.** Fix `exec="on"` typo; opt out the pseudocode block.
+- `docs/user-guide/data_types.md` — **modify.** Opt out the REPL-transcript block.
+- `docs/user-guide/examples/custom_dtype.md` — **modify.** Opt out the `--8<--` include block.
+- `docs/user-guide/v3_migration.md` — **modify.** Opt out the intentionally-wrong-import block.
+- `changes/4016.bugfix.md` — **create.** Towncrier news fragment.
+
+### Opt-out convention (decided here, used throughout)
+
+A block that must not execute is tagged:
+
+````
+```python exec="false" reason="<why>"
+````
+
+- `exec="false"` is an explicit, greppable opt-out that `markdown-exec` will **not** execute (only `exec="true"` triggers execution).
+- `reason="..."` documents *why*. The guard test requires it on any non-`exec="true"` block.
+
+---
+
+## Task 1: Spike — can the `s3` fixture provide a default endpoint with no `storage_options`?
+
+This is the load-bearing unknown. The existing S3 tests always pass `endpoint_url` explicitly via `client_kwargs`/`storage_options` (`tests/test_store/test_fsspec.py:109-116, 131`). The docs block must read clean — `zarr.create_array("s3://...")` with **no** `storage_options`. We must confirm a process-wide default endpoint works before writing the real fixture.
+
+**Files:**
+- Test (scratch): `tests/test_docs_s3_spike.py` (deleted at end of task)
+
+- [ ] **Step 1: Write a scratch test that starts moto, sets a default endpoint via env, and creates an array with a bare `s3://` URL**
+
+```python
+# tests/test_docs_s3_spike.py
+import os
+
+import pytest
+
+moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server")
+pytest.importorskip("s3fs")
+botocore = pytest.importorskip("botocore")
+requests = pytest.importorskip("requests")
+
+PORT = 5556  # different from test_fsspec.py's 5555 to avoid collisions
+ENDPOINT = f"http://127.0.0.1:{PORT}/"
+
+
+def test_bare_s3_url_with_default_endpoint() -> None:
+    """A create_array('s3://...') call with no storage_options should reach a
+    moto server when the endpoint is configured process-wide (env var)."""
+    server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=PORT)
+    server.start()
+    try:
+        os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo")
+        os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo")
+        # Candidate mechanism A: aiobotocore/botocore honors AWS_ENDPOINT_URL
+        os.environ["AWS_ENDPOINT_URL"] = ENDPOINT
+
+        # create the bucket via a synchronous botocore client
+        session = botocore.session.Session()
+        client = session.create_client("s3", endpoint_url=ENDPOINT, region_name="us-east-1")
+        client.create_bucket(Bucket="docs-bucket")
+        client.close()
+
+        import s3fs
+
+        import zarr
+
+        s3fs.S3FileSystem.clear_instance_cache()
+        z = zarr.create_array(
+            "s3://docs-bucket/foo", shape=(8, 8), chunks=(4, 4), dtype="f4"
+        )
+        z[:, :] = 1.0
+        assert z[0, 0] == 1.0
+    finally:
+        requests.post(f"{ENDPOINT}/moto-api/reset")
+        server.stop()
+```
+
+- [ ] **Step 2: Run the spike**
+
+Run: `hatch run doctest:test tests/test_docs_s3_spike.py -v` (or `uv run pytest tests/test_docs_s3_spike.py -v` inside the doctest env)
+Expected: **One of two outcomes** — record which:
+- **PASS** → `AWS_ENDPOINT_URL` works as a process-wide default. Use env-var mechanism in Task 3.
+- **FAIL** (connection refused / NoCredentials / hits real AWS) → env var insufficient. Try candidate B below.
+
+- [ ] **Step 3: If Step 2 failed, try fsspec default config**
+
+Replace the `AWS_ENDPOINT_URL` line with:
+
+```python
+        import fsspec
+
+        fsspec.config.conf["s3"] = {"client_kwargs": {"endpoint_url": ENDPOINT}, "anon": False}
+```
+
+Run: `hatch run doctest:test tests/test_docs_s3_spike.py -v`
+Expected: PASS → use `fsspec.config.conf` mechanism in Task 3.
+
+- [ ] **Step 4: If both failed, record the fallback decision**
+
+If neither bare-URL mechanism works, the visible block will show `storage_options={"endpoint_url": ...}` honestly (spec fallback for spike #1). Note which mechanism (env var, fsspec config, or fallback) won, in the commit message — Task 3 depends on it.
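The scratch test in Step 1 sets `AWS_ENDPOINT_URL` for the whole process and never restores it, which could leak into other tests run in the same session. A minimal standard-library sketch of scoping such an override follows; `temporary_env` is a hypothetical helper name, not part of the plan, and whether s3fs actually honors the variable is exactly what the spike has to establish.

```python
import os
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def temporary_env(name: str, value: str) -> Iterator[None]:
    """Set an environment variable inside a with-block, then restore the old state."""
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if previous is None:
            # The variable did not exist before, so remove it again.
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous
```

In the spike this would wrap the body of the `try` block, for example `with temporary_env("AWS_ENDPOINT_URL", ENDPOINT): ...`, so a failing run cannot leave the override behind.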
+
+- [ ] **Step 5: Delete the scratch test and commit the finding**
+
+```bash
+git rm tests/test_docs_s3_spike.py
+git commit -m "test: spike s3 default-endpoint mechanism for docs (no storage_options)
+
+Result: <mechanism that won>"
+```
+
+---
+
+## Task 2: Register the `s3` pytest marker
+
+**Files:**
+- Modify: `pyproject.toml` (the `[tool.pytest.ini_options]` `markers` list, currently at lines 446-450)
+
+- [ ] **Step 1: Add the `s3` marker**
+
+In `pyproject.toml`, change the `markers` list from:
+
+```toml
+markers = [
+    "asyncio: mark test as asyncio test",
+    "gpu: mark a test as requiring CuPy and GPU",
+    "slow_hypothesis: slow hypothesis tests",
+]
+```
+
+to:
+
+```toml
+markers = [
+    "asyncio: mark test as asyncio test",
+    "gpu: mark a test as requiring CuPy and GPU",
+    "s3: mark a test as requiring a (mock) S3 backend via moto",
+    "slow_hypothesis: slow hypothesis tests",
+]
+```
+
+- [ ] **Step 2: Verify pytest accepts the marker (no unknown-marker warning)**
+
+Run: `hatch run doctest:test --markers | grep s3`
+Expected: shows `@pytest.mark.s3: mark a test as requiring a (mock) S3 backend via moto`
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add pyproject.toml
+git commit -m "test: register s3 pytest marker"
+```
+
+---
+
+## Task 3: Teach `test_docs.py` to parse `markers=` and bind the `s3` fixture
+
+This task adds (a) `markers=` parsing so a session carries the right pytest marker, and (b) the moto-backed `s3` fixture using the mechanism chosen in Task 1.
+
+**Files:**
+- Modify: `tests/test_docs.py`
+
+- [ ] **Step 1: Write a failing test that a markered session carries its marker**
+
+Add to `tests/test_docs.py`:
+
+```python
+def test_markers_attribute_is_parsed(tmp_path: Path) -> None:
+    """A block tagged markers="s3" must surface that marker on its parametrized case,
+    so pytest can gate/bind it (e.g.
+    attach the moto fixture)."""
+    md = tmp_path / "ex.md"
+    md.write_text(
+        '```python exec="true" session="demo" markers="s3"\n'
+        "import zarr\n"
+        "```\n",
+        encoding="utf-8",
+    )
+    params = _session_params(md.parent)
+    assert len(params) == 1
+    marks = params[0].marks
+    assert any(m.name == "s3" for m in marks)
+```
+
+(This references a new helper `_session_params(root)` that returns a list of `pytest.param(...)`; we extract the grouping logic into it in Step 3.)
+
+- [ ] **Step 2: Run it to confirm it fails**
+
+Run: `hatch run doctest:test tests/test_docs.py::test_markers_attribute_is_parsed -v`
+Expected: FAIL with `AttributeError: module ... has no attribute '_session_params'` (or `NameError`).
+
+- [ ] **Step 3: Refactor grouping into `_session_params` that emits markers**
+
+Replace `group_examples_by_session()` (currently `tests/test_docs.py:39-64`) and the parametrize decorator (`tests/test_docs.py:72-75`) with a version that returns `pytest.param` objects carrying marks. Add near the top of the file:
+
+```python
+def _markers_for(settings: dict[str, str]) -> list[pytest.MarkDecorator]:
+    """Translate a block's markers="a b" attribute into pytest mark decorators."""
+    raw = settings.get("markers", "")
+    return [getattr(pytest.mark, name) for name in raw.split() if name]
+
+
+def _session_params(root: Path) -> list[pytest.param]:
+    """Group exec="true" examples by (file, session) and emit one pytest.param per
+    session, carrying the union of markers declared by that session's blocks."""
+    sessions: defaultdict[tuple[str, str], list[CodeExample]] = defaultdict(list)
+    marks_by_session: defaultdict[tuple[str, str], set[str]] = defaultdict(set)
+
+    for example in find_examples(str(root)):
+        settings = example.prefix_settings()
+        if settings.get("exec") != "true":
+            continue
+        session_name = settings.get("session", "_default")
+        key = (str(example.path), session_name)
+        sessions[key].append(example)
+        for mark in _markers_for(settings):
marks_by_session[key].add(mark.name) + + params = [] + for key in sorted(sessions.keys(), key=lambda x: (x[0], x[1])): + marks = tuple(getattr(pytest.mark, name) for name in sorted(marks_by_session[key])) + params.append(pytest.param(key, marks=marks, id=name_example(key[0], key[1]))) + return params +``` + +Keep `name_example()` as-is. Add `CodeExample` to the existing pytest-examples import if not already imported (it is: `from pytest_examples import CodeExample, EvalExample, find_examples`). + +- [ ] **Step 4: Update the parametrized test to use `_session_params` and request the fixtures** + +Replace the decorator + signature of `test_documentation_examples` (`tests/test_docs.py:72-79`) with: + +```python +@pytest.mark.parametrize("session_key", _session_params(DOCS_ROOT)) +def test_documentation_examples( + session_key: tuple[str, str], + eval_example: EvalExample, + request: pytest.FixtureRequest, +) -> None: +``` + +Inside the body, before running examples, activate the `s3` fixture when the case is s3-marked: + +```python + if request.node.get_closest_marker("s3") is not None: + request.getfixturevalue("docs_s3_backend") +``` + +(Leave the rest of the body — the `find_examples` loop and `eval_example.run(...)` — unchanged.) 
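The marker plumbing in Steps 3 and 4 can be illustrated without pytest. The sketch below is a stand-in and not the pytest-examples parser: `parse_fence_info` and `marker_names` are hypothetical helpers, and the space-separated `key="value"` fence syntax is assumed from the fences shown in this plan.

```python
import shlex


def parse_fence_info(info: str) -> tuple[str, dict[str, str]]:
    """Split a fence info string into (language, settings).

    Hypothetical helper: assumes space-separated key="value" pairs after the
    language, as in the fences used throughout this plan.
    """
    tokens = shlex.split(info)
    language = tokens[0] if tokens else ""
    settings: dict[str, str] = {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if sep:  # ignore bare words that are not key=value pairs
            settings[key] = value
    return language, settings


def marker_names(settings: dict[str, str]) -> list[str]:
    """Return the sorted, de-duplicated marker names from markers="a b"."""
    return sorted(set(settings.get("markers", "").split()))


language, settings = parse_fence_info('python exec="true" session="demo" markers="s3 gpu s3"')
print(language, settings["exec"], marker_names(settings))
```

Sorting and de-duplicating mirrors what `_session_params` does when it unions the markers of every block in a session, so the resulting parametrized case is stable regardless of block order.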
+ +- [ ] **Step 5: Add the `docs_s3_backend` fixture** + +Add to `tests/test_docs.py` (using the mechanism Task 1 selected — shown here for the `AWS_ENDPOINT_URL` variant; swap to `fsspec.config` or the `storage_options` fallback per Task 1's result): + +```python +S3_PORT = 5556 +S3_ENDPOINT = f"http://127.0.0.1:{S3_PORT}/" +S3_BUCKET = "example-bucket" + + +@pytest.fixture +def docs_s3_backend() -> Generator[None, None, None]: + """Stand up a moto mock-S3 server and configure a process-wide default endpoint + so docs blocks can use a bare s3:// URL with no storage_options.""" + moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server") + s3fs = pytest.importorskip("s3fs") + botocore = pytest.importorskip("botocore") + requests = pytest.importorskip("requests") + + server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=S3_PORT) + server.start() + prev_endpoint = os.environ.get("AWS_ENDPOINT_URL") + os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo") + os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo") + os.environ["AWS_ENDPOINT_URL"] = S3_ENDPOINT + + session = botocore.session.Session() + client = session.create_client("s3", endpoint_url=S3_ENDPOINT, region_name="us-east-1") + client.create_bucket(Bucket=S3_BUCKET) + client.close() + s3fs.S3FileSystem.clear_instance_cache() + try: + yield + finally: + requests.post(f"{S3_ENDPOINT}/moto-api/reset") + if prev_endpoint is None: + os.environ.pop("AWS_ENDPOINT_URL", None) + else: + os.environ["AWS_ENDPOINT_URL"] = prev_endpoint + server.stop() +``` + +Add the required imports at the top of `tests/test_docs.py`: + +```python +import os +from collections.abc import Generator +``` + +- [ ] **Step 6: Run the marker-parsing test — it should now pass** + +Run: `hatch run doctest:test tests/test_docs.py::test_markers_attribute_is_parsed -v` +Expected: PASS + +- [ ] **Step 7: Run the full docs test to confirm no regression in existing sessions** + +Run: `hatch run doctest:test -v` +Expected: 
PASS for all existing `quickstart` etc. sessions (the S3 block isn't markered yet — that's Task 4). + +- [ ] **Step 8: Commit** + +```bash +git add tests/test_docs.py +git commit -m "test: parse markers= on docs blocks and add moto s3 fixture binding" +``` + +--- + +## Task 4: Fix and enable the S3 example (#4016) + +**Files:** +- Modify: `docs/quick-start.md:134-140` + +- [ ] **Step 1: Replace the bare, invalid S3 block** + +Replace lines 134-140 (the ```` ```python `` … ```` block containing `mode="w"`) with: + +````markdown +```python exec="true" session="s3demo" markers="s3" source="above" +import zarr +import numpy as np + +z = zarr.create_array( + "s3://example-bucket/foo", shape=(100, 100), chunks=(10, 10), dtype="f4" +) +z[:, :] = np.random.random((100, 100)) +``` +```` + +Notes: +- `mode="w"` removed (the #4016 bug; `create_array` has no `mode` parameter — see `src/zarr/api/synchronous.py:799`). +- Unused `import s3fs` removed. +- `import numpy as np` added — this is a fresh `s3demo` session, so `np` is not in scope from the `quickstart` session. +- New session `s3demo` keeps the moto fixture scoped to just this block (the `quickstart` session must NOT become s3-marked). +- The displayed URL stays `s3://example-bucket/foo`; the moto endpoint is supplied by the `docs_s3_backend` fixture (bucket name `example-bucket` matches `S3_BUCKET` in Task 3). +- **If Task 1 chose the `storage_options` fallback:** add `storage_options={"endpoint_url": "..."}` to the visible call instead, and adjust the prose to explain it. + +- [ ] **Step 2: Run the S3 docs example against moto** + +Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[quick-start.md:s3demo]" -v` +Expected: PASS (executes against moto; no real-cloud contact). 
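The endpoint override that the `docs_s3_backend` fixture relies on is a save-set-restore pattern on one environment variable. A minimal sketch of that pattern in isolation follows; `override_env` is a hypothetical helper that is not part of the plan, and the endpoint value is only an illustration mirroring `S3_ENDPOINT` from Task 3.

```python
import os
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def override_env(name: str, value: str) -> Iterator[None]:
    """Set an environment variable for the duration of a block, then restore it."""
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        # Restore the prior state exactly: remove the variable if it was unset.
        if previous is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


# Illustrative value only; no server is contacted by this sketch.
with override_env("AWS_ENDPOINT_URL", "http://127.0.0.1:5556/"):
    print(os.environ["AWS_ENDPOINT_URL"])
```

Inside a pytest fixture, `monkeypatch.setenv` provides the same save-and-restore behavior; the plan's fixture does the bookkeeping by hand in its `finally` block.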
+ +- [ ] **Step 3: Commit** + +```bash +git add docs/quick-start.md +git commit -m "docs: fix invalid s3 create_array example and run it against moto (#4016)" +``` + +--- + +## Task 5: Enable the config-only blocks + +These are plain `zarr.config.set(...)` calls that run as-is. Each gets its own self-contained session so that its test id stays distinct. Separate sessions do not isolate the config itself: `zarr.config` is process-global and is not auto-restored after a block runs. Resetting it is out of scope here, which is acceptable for these small demos that only set a value. + +**Files:** +- Modify: `docs/user-guide/performance.md:207`, `docs/user-guide/performance.md:237` +- Modify: `docs/user-guide/arrays.md:622` + +- [ ] **Step 1: Enable `performance.md:207` (concurrency config)** + +Change the fence from ```` ```python ```` to: + +````markdown +```python exec="true" session="perf-concurrency" +```` + +(Body unchanged: `import zarr` + `zarr.config.set({'async.concurrency': 128})` + the commented env-var line, which is inert.) + +- [ ] **Step 2: Enable `performance.md:237` (max_workers config)** + +Change the fence to: + +````markdown +```python exec="true" session="perf-workers" +```` + +- [ ] **Step 3: Enable `arrays.md:622` (rectilinear_chunks config)** + +Change the fence to: + +````markdown +```python exec="true" session="arrays-rectilinear" +```` + +- [ ] **Step 4: Run the three sessions** + +Run: +```bash +hatch run doctest:test \ + "tests/test_docs.py::test_documentation_examples[performance.md:perf-concurrency]" \ + "tests/test_docs.py::test_documentation_examples[performance.md:perf-workers]" \ + "tests/test_docs.py::test_documentation_examples[arrays.md:arrays-rectilinear]" -v +``` +Expected: PASS (3 passed). 
+ +- [ ] **Step 5: Commit** + +```bash +git add docs/user-guide/performance.md docs/user-guide/arrays.md +git commit -m "docs: execute config-setting examples in performance.md and arrays.md" +``` + +--- + +## Task 6: Make the CLI `zarr.open` block runnable + +`docs/user-guide/cli.md:48` opens `'path/to/input.zarr'` which doesn't exist. Rewrite it to create then open a real local array so it executes and still illustrates `zarr_format=3`. + +**Files:** +- Modify: `docs/user-guide/cli.md:46-51` + +- [ ] **Step 1: Replace the block** + +Replace the bare block with: + +````markdown +```python exec="true" session="cli-open" source="above" +import zarr + +# create a small array to open (stands in for the migrated store) +zarr.create_array("data/cli-demo.zarr", shape=(4, 4), chunks=(2, 2), dtype="i4") + +zarr_with_v3_metadata = zarr.open("data/cli-demo.zarr", zarr_format=3) +``` +```` + +(Keep the surrounding prose; the example now demonstrates `open(..., zarr_format=3)` on a real store. The illustrative `'path/to/input.zarr'` filename was the only reason it couldn't run.) + +- [ ] **Step 2: Run it** + +Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[cli.md:cli-open]" -v` +Expected: PASS + +- [ ] **Step 3: Commit** + +```bash +git add docs/user-guide/cli.md +git commit -m "docs: make cli zarr.open example runnable against a local store" +``` + +--- + +## Task 7: Enable the GPU block (env-gated via `gpu` marker) + +**Files:** +- Modify: `docs/user-guide/gpu.md:19-28` + +- [ ] **Step 1: Tag the GPU block** + +Change the fence from ```` ```python ```` to: + +````markdown +```python exec="true" session="gpu-demo" markers="gpu" source="above" +```` + +(Body unchanged: `import cupy as cp`, `zarr.config.enable_gpu()`, `create_array("memory://gpu-demo", ...)`, etc.) 
+ +- [ ] **Step 2: Confirm it is SKIPPED in the default doctest env (no GPU)** + +Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[gpu.md:gpu-demo]" -v` +Expected: SKIPPED (the `gpu` marker is deselected without `-m gpu`), **not** an error, **not** absent. + +- [ ] **Step 3: Confirm it is COLLECTED for the gpu selection** + +Run: `hatch run doctest:test -m gpu --co -q | grep gpu-demo` +Expected: the `gpu.md:gpu-demo` case is collected (it will actually execute only on real GPU hardware in the `gputest` env, which we can't run here). + +- [ ] **Step 4: Commit** + +```bash +git add docs/user-guide/gpu.md +git commit -m "docs: execute gpu example under the gpu marker" +``` + +--- + +## Task 8: Fix the `exec="on"` typo and opt out the genuinely-non-executable blocks + +**Files:** +- Modify: `docs/contributing.md:15` (pseudocode) and `docs/contributing.md:231` (`exec="on"` typo) +- Modify: `docs/user-guide/data_types.md:363` (REPL transcript) +- Modify: `docs/user-guide/examples/custom_dtype.md:5` (`--8<--` include) +- Modify: `docs/user-guide/v3_migration.md:42` (intentionally-wrong import) + +- [ ] **Step 1: Fix the `exec="on"` typo in `contributing.md:231`** + +Change the fence attribute `exec="on"` to `exec="true"`. Then run that block to confirm it actually executes cleanly: + +Run: `hatch run doctest:test -v -k contributing` +Expected: the formerly-`exec="on"` block now runs. **If it fails** (the code was broken too, having never run), fix the code in the block minimally so it passes, or — if it's not meant to run — convert it to `exec="false" reason="..."`. Record which in the commit. + +- [ ] **Step 2: Opt out `contributing.md:15` (pseudocode)** + +Change ```` ```python ```` to: + +````markdown +```python exec="false" reason="illustrative pseudocode with a '# etc.' 
placeholder, not runnable" +```` + +- [ ] **Step 3: Opt out `data_types.md:363` (REPL transcript)** + +Change ```` ```python ```` to: + +````markdown +```python exec="false" reason="REPL output transcript, not executable source" +```` + +- [ ] **Step 4: Opt out `custom_dtype.md:5` (`--8<--` include)** + +Change ```` ```python ```` to: + +````markdown +```python exec="false" reason="pymdownx snippet include directive, not python source" +```` + +- [ ] **Step 5: Opt out `v3_migration.md:42` (intentionally-wrong import)** + +Change ```` ```python ```` to: + +````markdown +```python exec="false" reason="intentionally shows the old/incorrect import for contrast" +```` + +- [ ] **Step 6: Commit** + +```bash +git add docs/contributing.md docs/user-guide/data_types.md docs/user-guide/examples/custom_dtype.md docs/user-guide/v3_migration.md +git commit -m "docs: fix exec=on typo and explicitly opt out non-runnable blocks" +``` + +--- + +## Task 9: Handle the dask block in performance.md + +`docs/user-guide/performance.md:263` uses `dask.array` and opens `'data/large_array.zarr'` (nonexistent). Two viable dispositions — pick based on whether `dask` is in the doctest env. + +**Files:** +- Modify: `docs/user-guide/performance.md:263-280` + +- [ ] **Step 1: Check whether dask is available in the doctest env** + +Run: `hatch run doctest:list-env | grep -i dask` +Expected: either shows a `dask` line (available) or nothing (not available). 
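The availability question in Step 1 can also be asked from inside Python, which is sometimes handier than grepping an environment listing. This is a sketch and not part of the plan: `is_installed` is a hypothetical helper, and it only answers for the interpreter that runs it, so it would need to be executed with the doctest environment's Python.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    # find_spec returns None for a missing top-level module without importing it.
    return find_spec(module_name) is not None


print("dask available:", is_installed("dask"))
```

The result maps directly onto the two dispositions that follow: `True` points to Step 2a, `False` to Step 2b.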
+ +- [ ] **Step 2a: If dask IS available — make it runnable** + +Replace the `'data/large_array.zarr'` open with a created array, keeping the dask demonstration: + +````markdown +```python exec="true" session="perf-dask" source="above" +import zarr +import dask.array as da + +zarr.config.set({ + 'async.concurrency': 4, + 'threading.max_workers': 4, +}) + +# create a small array to read with Dask +zarr.create_array("data/perf-dask-demo.zarr", shape=(16, 16), chunks=(8, 8), dtype="f4") +z = zarr.open_array("data/perf-dask-demo.zarr", mode="r") + +arr = da.from_array(z, chunks=z.chunks) +result = arr.mean(axis=0).compute() +``` +```` + +Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[performance.md:perf-dask]" -v` +Expected: PASS + +- [ ] **Step 2b: If dask is NOT available — opt out with a reason** + +Change ```` ```python ```` to: + +````markdown +```python exec="false" reason="requires dask, which is not in the docs test environment" +```` + +- [ ] **Step 3: Commit** + +```bash +git add docs/user-guide/performance.md +git commit -m "docs: make dask performance example runnable (or opt out if dask absent)" +``` + +--- + +## Task 10: Add the guard test + +The guard asserts every python block in `docs/` is either `exec="true"` or `exec="false"` with a non-empty `reason`. Anything else (bare, `exec="on"`, missing reason) fails. + +**Files:** +- Modify: `tests/test_docs.py` + +- [ ] **Step 1: Write the guard test** + +Add to `tests/test_docs.py`: + +```python +def test_no_unvalidated_blocks() -> None: + """Every python code block in docs/ must declare its validation state: + either exec="true" (it is executed as a test) or exec="false" with a reason + (an explicit, documented opt-out). A bare or mistyped fence (e.g. 
exec="on") + fails here, so a block can never silently opt out of validation — the gap + that hid the invalid create_array(mode="w") example in #4016.""" + offenders: list[str] = [] + for example in find_examples(str(DOCS_ROOT)): + settings = example.prefix_settings() + exec_val = settings.get("exec") + loc = f"{Path(example.path).relative_to(DOCS_ROOT)}:{example.start_line}" + if exec_val == "true": + continue + if exec_val == "false" and settings.get("reason", "").strip(): + continue + offenders.append(f"{loc} (exec={exec_val!r}, reason={settings.get('reason')!r})") + + assert not offenders, ( + "Docs python blocks must be exec=\"true\" or exec=\"false\" with a reason:\n" + + "\n".join(offenders) + ) +``` + +(`find_examples` from pytest-examples only yields fenced code blocks for languages it recognizes as runnable, which includes python; confirm in Step 2 that the count matches the audit. If it also yields non-python fences, filter them out on `example.prefix` / language. A path filter such as `if not str(example.path).endswith(".md"): continue` is unnecessary, since DOCS_ROOT is all markdown.) + +- [ ] **Step 2: Run the guard — it must PASS now that all blocks are remediated** + +Run: `hatch run doctest:test tests/test_docs.py::test_no_unvalidated_blocks -v` +Expected: PASS (zero offenders). **If it lists offenders**, they are blocks missed by Tasks 4-9. Fix each one (enable it or opt it out with a reason) until the list is empty. + +- [ ] **Step 3: Negative check — confirm the guard actually catches a bare block** + +Temporarily add a bare block to any docs file: + +````markdown +```python +1 / 0 +``` +```` + +Run: `hatch run doctest:test tests/test_docs.py::test_no_unvalidated_blocks -v` +Expected: FAIL, listing the new bare block's location. 
+ +Then remove the temporary block and re-run: +Run: `hatch run doctest:test tests/test_docs.py::test_no_unvalidated_blocks -v` +Expected: PASS + +- [ ] **Step 4: Commit** + +```bash +git add tests/test_docs.py +git commit -m "test: guard that every docs python block is executed or opted out (#4017)" +``` + +--- + +## Task 11: Full suite + news fragment + +**Files:** +- Create: `changes/4016.bugfix.md` + +- [ ] **Step 1: Run the entire docs test suite** + +Run: `hatch run doctest:test -v` +Expected: PASS — all `exec="true"` sessions run (S3 against moto; config/cli/dask as applicable), the GPU session reports SKIPPED, and the guard passes. + +- [ ] **Step 2: Add the towncrier news fragment** + +Create `changes/4016.bugfix.md`: + +```markdown +Fixed an invalid ``zarr.create_array`` example in the quick-start docs (it passed an unsupported ``mode`` argument) and made the cloud-storage example execute against a mock S3 backend in CI. Added a test ensuring every python code block in the docs is either executed or explicitly opted out with a documented reason. +``` + +- [ ] **Step 3: Run the full prek/lint pass** + +Run: `prek run --all-files` +Expected: PASS (ruff, mypy, towncrier-check, etc. all green). + +- [ ] **Step 4: Commit** + +```bash +git add changes/4016.bugfix.md +git commit -m "docs: add news fragment for docs-block validation (#4016, #4017)" +``` + +--- + +## Self-review notes (resolved during planning) + +- **Spec coverage:** Part A (remediate 12 blocks) → Tasks 4-9; Part B (guard) → Task 10. Marker-bound execution (s3 + gpu) → Tasks 2, 3, 4, 7. Spike #1 → Task 1. `pyproject.toml` s3 marker → Task 2. All three spec spikes are addressed: #1 in Task 1; #2 (markdown-exec tolerance of `markers=`) is checked by running `hatch run docs:build` (see the Spike #2 verification note below); #3 (moto teardown) is handled by the fixture's `finally` block in Task 3 Step 5. 
+- **Spike #2 verification:** `markers=` and `reason=`/`exec="false"` are unknown attributes to markdown-exec; it ignores unrecognized prefix settings and only acts on `exec="true"`. Confirm by running `hatch run docs:build` once after Task 11 and checking it succeeds and that the gpu/s3 blocks render as static source. If the build errors on unknown attributes, fall back to the per-session marker map (spec fallback for spike #2). +- **The 12 blocks, accounted for:** quick-start S3 (T4), perf×2 config (T5), arrays config (T5), cli (T6), gpu (T7), contributing exec=on typo + pseudocode (T8), data_types transcript (T8), custom_dtype include (T8), v3_migration wrong-import (T8), perf dask (T9). = 12. ✓ +- **Naming consistency:** `_session_params`, `_markers_for`, `docs_s3_backend`, `test_no_unvalidated_blocks`, `S3_BUCKET="example-bucket"` (matches the URL in the T4 block) used consistently across tasks. From 460385d207f4a1cdc421b30836c3fafd5a8bb7cf Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 10:54:57 +0200 Subject: [PATCH 05/25] test: spike s3 default-endpoint mechanism for docs (no storage_options) Result: A-env-var A bare zarr.create_array("s3://bucket/key", ...) with NO storage_options reaches a moto server when AWS_ENDPOINT_URL is set process-wide. s3fs/ aiobotocore honor the env var, so the visible docs block can stay clean. Caveat: moto[s3,server] currently lives only in the 'remote-tests' dependency group, not 'test'; the doctest hatch env (dependency-groups= ['test']) does NOT have moto installed. The downstream real-fixture task must add moto[s3,server] (and requests) to the doctest env extras. 
Co-Authored-By: Claude Opus 4.8 (1M context) From 2032fee17d9180bc1f2478690f0a777910b852cf Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 10:58:02 +0200 Subject: [PATCH 06/25] test: register s3 pytest marker --- pyproject.toml | 1 + 1 file changed, 1 insertion(+) diff --git a/pyproject.toml b/pyproject.toml index e342e8305c..65f3336073 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -446,6 +446,7 @@ filterwarnings = [ markers = [ "asyncio: mark test as asyncio test", "gpu: mark a test as requiring CuPy and GPU", + "s3: mark a test as requiring a (mock) S3 backend via moto", "slow_hypothesis: slow hypothesis tests", ] From 42a8ebd118736615036afccf76ddfdec582b8ca9 Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 11:02:54 +0200 Subject: [PATCH 07/25] test: parse markers= on docs blocks and add moto s3 fixture binding Co-Authored-By: Claude Opus 4.8 (1M context) --- pyproject.toml | 3 +- tests/test_docs.py | 120 +++++++++++++++++++++++++++++++++++---------- 2 files changed, 94 insertions(+), 29 deletions(-) diff --git a/pyproject.toml b/pyproject.toml index 65f3336073..bc95bfd61b 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -277,9 +277,8 @@ readthedocs = "rm -rf $READTHEDOCS_OUTPUT/html && cp -r site $READTHEDOCS_OUTPUT [tool.hatch.envs.doctest] description = "Test environment for validating executable code blocks in documentation" features = ['remote'] -dependency-groups = ['test'] +dependency-groups = ['remote-tests'] extra-dependencies = [ - "s3fs>=2023.10.0", "pytest-examples", ] diff --git a/tests/test_docs.py b/tests/test_docs.py index d467e478e8..e2fac43a32 100644 --- a/tests/test_docs.py +++ b/tests/test_docs.py @@ -7,14 +7,19 @@ from __future__ import annotations +import os from collections import defaultdict from pathlib import Path +from typing import TYPE_CHECKING, Any import pytest pytest.importorskip("pytest_examples") from pytest_examples import CodeExample, EvalExample, find_examples +if 
TYPE_CHECKING: + from collections.abc import Generator + # Find all markdown files with executable code blocks DOCS_ROOT = Path(__file__).parent.parent / "docs" SOURCES_ROOT = Path(__file__).parent.parent / "src" / "zarr" @@ -36,46 +41,104 @@ def find_markdown_files_with_exec() -> list[Path]: return sorted(markdown_files) -def group_examples_by_session() -> list[tuple[str, str]]: - """ - Group examples by their session and file, maintaining order. +def name_example(path: str, session: str) -> str: + """Generate a readable name for a test case from file path and session.""" + file = Path(path) + try: + file = file.relative_to(DOCS_ROOT) + except ValueError: + # Path is outside DOCS_ROOT (e.g. a tmp_path fixture in unit tests); use the + # bare file name rather than an absolute path for a stable, readable id. + file = Path(file.name) + return f"{file}:{session}" - Returns a list of session_key tuples where session_key is - (file_path, session_name). - """ - all_examples = list(find_examples(DOCS_ROOT)) - # Group by file and session - sessions = defaultdict(list) +def _markers_for(settings: dict[str, str]) -> list[pytest.MarkDecorator]: + """Translate a block's markers="a b" attribute into pytest mark decorators.""" + raw = settings.get("markers", "") + return [getattr(pytest.mark, name) for name in raw.split() if name] - for example in all_examples: + +def _session_params(root: Path) -> list[Any]: + """Group exec="true" examples by (file, session) and emit one pytest.param per + session, carrying the union of markers declared by that session's blocks.""" + sessions: defaultdict[tuple[str, str], list[CodeExample]] = defaultdict(list) + marks_by_session: defaultdict[tuple[str, str], set[str]] = defaultdict(set) + + for example in find_examples(str(root)): settings = example.prefix_settings() if settings.get("exec") != "true": continue - - # Use file path and session name as key - file_path = example.path session_name = settings.get("session", "_default") - session_key 
= (str(file_path), session_name) - - sessions[session_key].append(example) - - # Return sorted list of session keys for consistent test ordering - return sorted(sessions.keys(), key=lambda x: (x[0], x[1])) - - -def name_example(path: str, session: str) -> str: - """Generate a readable name for a test case from file path and session.""" - return f"{Path(path).relative_to(DOCS_ROOT)}:{session}" + key = (str(example.path), session_name) + sessions[key].append(example) + for mark in _markers_for(settings): + marks_by_session[key].add(mark.name) + + params = [] + for key in sorted(sessions.keys(), key=lambda x: (x[0], x[1])): + marks = tuple(getattr(pytest.mark, name) for name in sorted(marks_by_session[key])) + params.append(pytest.param(key, marks=marks, id=name_example(key[0], key[1]))) + return params + + +S3_PORT = 5556 +S3_ENDPOINT = f"http://127.0.0.1:{S3_PORT}/" +S3_BUCKET = "example-bucket" + + +@pytest.fixture +def docs_s3_backend() -> Generator[None, None, None]: + """Stand up a moto mock-S3 server and set a process-wide default endpoint so docs + blocks can use a bare s3:// URL with no storage_options (see spike in plan Task 1).""" + moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server") + s3fs = pytest.importorskip("s3fs") + botocore = pytest.importorskip("botocore") + requests = pytest.importorskip("requests") + + prev_endpoint = os.environ.get("AWS_ENDPOINT_URL") + server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=S3_PORT) + server.start() + try: + os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo") + os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo") + os.environ["AWS_ENDPOINT_URL"] = S3_ENDPOINT + + session = botocore.session.Session() + client = session.create_client("s3", endpoint_url=S3_ENDPOINT, region_name="us-east-1") + client.create_bucket(Bucket=S3_BUCKET) + client.close() + s3fs.S3FileSystem.clear_instance_cache() + yield + finally: + requests.post(f"{S3_ENDPOINT}/moto-api/reset") + if prev_endpoint is 
None: + os.environ.pop("AWS_ENDPOINT_URL", None) + else: + os.environ["AWS_ENDPOINT_URL"] = prev_endpoint + server.stop() + + +def test_markers_attribute_is_parsed(tmp_path: Path) -> None: + """A block tagged markers="s3" must surface that marker on its parametrized case, + so pytest can gate/bind it (e.g. attach the moto fixture).""" + md = tmp_path / "ex.md" + md.write_text( + '```python exec="true" session="demo" markers="s3"\nimport zarr\n```\n', + encoding="utf-8", + ) + params = _session_params(md.parent) + assert len(params) == 1 + marks = params[0].marks + assert any(m.name == "s3" for m in marks) # Get all example sessions -@pytest.mark.parametrize( - "session_key", group_examples_by_session(), ids=lambda v: name_example(v[0], v[1]) -) +@pytest.mark.parametrize("session_key", _session_params(DOCS_ROOT)) def test_documentation_examples( session_key: tuple[str, str], eval_example: EvalExample, + request: pytest.FixtureRequest, ) -> None: """ Test that all exec="true" code examples in documentation execute successfully. 
@@ -90,6 +153,9 @@ def test_documentation_examples( - Execute them in order within the same context - Verify no exceptions are raised """ + if request.node.get_closest_marker("s3") is not None: + request.getfixturevalue("docs_s3_backend") + file_path, session_name = session_key # Get examples for this session From 7489bea33e2648e3dad43ccabefc6b662eb5e65c Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 11:09:04 +0200 Subject: [PATCH 08/25] docs: fix invalid s3 create_array example and run it against moto (#4016) Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/quick-start.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/docs/quick-start.md b/docs/quick-start.md index 27dc8e6045..efde321af6 100644 --- a/docs/quick-start.md +++ b/docs/quick-start.md @@ -131,11 +131,13 @@ Zarr integrates seamlessly with cloud object storage such as Amazon S3 and Googl using external libraries like [s3fs](https://s3fs.readthedocs.io) or [gcsfs](https://gcsfs.readthedocs.io): -```python - -import s3fs +```python exec="true" session="s3demo" markers="s3" source="above" +import zarr +import numpy as np -z = zarr.create_array("s3://example-bucket/foo", mode="w", shape=(100, 100), chunks=(10, 10), dtype="f4") +z = zarr.create_array( + "s3://example-bucket/foo", shape=(100, 100), chunks=(10, 10), dtype="f4" +) z[:, :] = np.random.random((100, 100)) ``` From 198eb802396972628fb0585d31e0ecf616d6854a Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 11:11:27 +0200 Subject: [PATCH 09/25] docs: execute config-setting examples in performance.md and arrays.md Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/user-guide/arrays.md | 2 +- docs/user-guide/performance.md | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/user-guide/arrays.md b/docs/user-guide/arrays.md index 14122003c0..dd1788b7d2 100644 --- a/docs/user-guide/arrays.md +++ b/docs/user-guide/arrays.md @@ -619,7 +619,7 @@ 
Without the `shards` argument, there would be 10,000 chunks stored as individual Because the feature is still stabilizing, it is disabled by default and must be explicitly enabled: - ```python + ```python exec="true" session="arrays-rectilinear" import zarr zarr.config.set({"array.rectilinear_chunks": True}) ``` diff --git a/docs/user-guide/performance.md b/docs/user-guide/performance.md index fa98e9466e..f22ec00d02 100644 --- a/docs/user-guide/performance.md +++ b/docs/user-guide/performance.md @@ -204,7 +204,7 @@ determines the maximum number of concurrent I/O operations. The default value is 10, which is a conservative value. You may get improved performance by tuning the concurrency limit. You can adjust this value based on your specific needs: -```python +```python exec="true" session="perf-concurrency" import zarr # Set concurrency for the current session @@ -234,7 +234,7 @@ By default it is `None`, which lets Python choose the pool size (typically You can set it explicitly when you want more predictable resource usage: -```python +```python exec="true" session="perf-workers" import zarr zarr.config.set({'threading.max_workers': 8}) From c8836c35f17cc2f21ba039f78b8a96e7499cc254 Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 11:13:29 +0200 Subject: [PATCH 10/25] docs: make cli zarr.open example runnable against a local store --- docs/user-guide/cli.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/docs/user-guide/cli.md b/docs/user-guide/cli.md index fc812c1a20..743392f679 100644 --- a/docs/user-guide/cli.md +++ b/docs/user-guide/cli.md @@ -45,9 +45,13 @@ This will write new `zarr.json` files to `input.zarr`, leaving the existing v2 m To open the array/group using the new metadata use: -```python +```python exec="true" session="cli-open" source="above" import zarr -zarr_with_v3_metadata = zarr.open('path/to/input.zarr', zarr_format=3) + +# create a small array to open (stands in for the migrated store) 
+zarr.create_array("data/cli-demo.zarr", shape=(4, 4), chunks=(2, 2), dtype="i4") + +zarr_with_v3_metadata = zarr.open("data/cli-demo.zarr", zarr_format=3) ``` Once you are happy with the conversion, you can run the following to remove the old v2 metadata: From 010d99ace52b5c54fbd98d98327c65245828e2c2 Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 11:17:48 +0200 Subject: [PATCH 11/25] docs: execute gpu example under the gpu marker Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/user-guide/gpu.md | 2 +- tests/test_docs.py | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/docs/user-guide/gpu.md b/docs/user-guide/gpu.md index 6189f39d3d..1dc3ef296b 100644 --- a/docs/user-guide/gpu.md +++ b/docs/user-guide/gpu.md @@ -16,7 +16,7 @@ Zarr can use GPUs to accelerate your workload by running `zarr.Config.enable_gpu [`zarr.config`][] configures Zarr to use GPU memory for the data buffers used internally by Zarr via `enable_gpu()`. -```python +```python exec="true" session="gpu-demo" markers="gpu" source="above" import zarr import cupy as cp zarr.config.enable_gpu() diff --git a/tests/test_docs.py b/tests/test_docs.py index e2fac43a32..383456a30f 100644 --- a/tests/test_docs.py +++ b/tests/test_docs.py @@ -153,6 +153,9 @@ def test_documentation_examples( - Execute them in order within the same context - Verify no exceptions are raised """ + if request.node.get_closest_marker("gpu") is not None: + pytest.importorskip("cupy") + if request.node.get_closest_marker("s3") is not None: request.getfixturevalue("docs_s3_backend") From 79197c4c39a870146f5e54aad2592beed8ed8992 Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 11:23:47 +0200 Subject: [PATCH 12/25] docs: fix exec=on typo and explicitly opt out non-runnable blocks Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/contributing.md | 6 +++--- docs/user-guide/data_types.md | 2 +- docs/user-guide/examples/custom_dtype.md | 2 +- 
docs/user-guide/v3_migration.md | 2 +- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/contributing.md b/docs/contributing.md index b9c7aa1aa2..a37768b815 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -12,7 +12,7 @@ If you find a bug, please raise a [GitHub issue](https://github.com/zarr-develop 1. A minimal, self-contained snippet of Python code reproducing the problem. You can format the code nicely using markdown, e.g.: -```python +```python exec="false" reason="illustrative pseudocode with a '# etc.' placeholder, not runnable" import zarr g = zarr.group() # etc. @@ -225,10 +225,10 @@ hatch --env docs run serve #### Adding executable code blocks in the documentation -Zarr uses [Markdown Exec](https://pawamoy.github.io/markdown-exec/usage/) to execute code blocks in Markdown files. Add `exec="on"` to a code block header for it to be executed when the docs are built. For example: +Zarr uses [Markdown Exec](https://pawamoy.github.io/markdown-exec/usage/) to execute code blocks in Markdown files. Add `exec="true"` to a code block header for it to be executed when the docs are built. 
For example: ````md -```python exec="on" +```python exec="true" print("Hello world") ``` ```` diff --git a/docs/user-guide/data_types.md b/docs/user-guide/data_types.md index 3e10845979..6f6bb05033 100644 --- a/docs/user-guide/data_types.md +++ b/docs/user-guide/data_types.md @@ -360,7 +360,7 @@ print(type(a.dtype)) But if we inspect the metadata for the array, we can see the Zarr data type object: -```python +```python exec="false" reason="REPL output transcript, not executable source" type(a.metadata.data_type) ``` diff --git a/docs/user-guide/examples/custom_dtype.md b/docs/user-guide/examples/custom_dtype.md index d6736e25dd..391407b822 100644 --- a/docs/user-guide/examples/custom_dtype.md +++ b/docs/user-guide/examples/custom_dtype.md @@ -2,6 +2,6 @@ ## Source Code -```python +```python exec="false" reason="pymdownx snippet include directive, not python source" --8<-- "examples/custom_dtype/custom_dtype.py" ``` diff --git a/docs/user-guide/v3_migration.md b/docs/user-guide/v3_migration.md index 21386c1522..1680547d93 100644 --- a/docs/user-guide/v3_migration.md +++ b/docs/user-guide/v3_migration.md @@ -39,7 +39,7 @@ the following actions in order: - `numcodecs.*` will no longer be available in `zarr.*`. 
To migrate, import codecs directly from `numcodecs`: - ```python + ```python exec="false" reason="intentionally shows the old/incorrect import for contrast" from numcodecs import Blosc # instead of: # from zarr import Blosc From 9165bd5896a1a79f90ff49e9e00f52c2f1764ead Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 11:28:31 +0200 Subject: [PATCH 13/25] docs: make dask performance example runnable (or opt out if dask absent) --- docs/user-guide/performance.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/user-guide/performance.md b/docs/user-guide/performance.md index f22ec00d02..3357913557 100644 --- a/docs/user-guide/performance.md +++ b/docs/user-guide/performance.md @@ -260,7 +260,7 @@ For example, if you're running Dask with 10 threads and Zarr's default concurren **Recommendation**: When using Dask with many threads, configure Zarr's concurrency settings: -```python +```python exec="false" reason="requires dask, which is not in the docs test environment" import zarr import dask.array as da From 19f0238dbcdb538521f0150e72cf782f16c8d00a Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 11:28:52 +0200 Subject: [PATCH 14/25] docs: record plan corrections from execution (spike result, gpu marker mechanism) Co-Authored-By: Claude Opus 4.8 (1M context) --- .../plans/2026-05-29-docs-block-validation.md | 31 +++++++++++++++++-- 1 file changed, 29 insertions(+), 2 deletions(-) diff --git a/docs/superpowers/plans/2026-05-29-docs-block-validation.md b/docs/superpowers/plans/2026-05-29-docs-block-validation.md index 1c34652012..63ab0b696d 100644 --- a/docs/superpowers/plans/2026-05-29-docs-block-validation.md +++ b/docs/superpowers/plans/2026-05-29-docs-block-validation.md @@ -129,6 +129,20 @@ git commit -m "test: spike s3 default-endpoint mechanism for docs (no storage_op Result: " ``` +### RESULT (completed 2026-05-29, commit 460385d) + +- **Mechanism A won:** `os.environ["AWS_ENDPOINT_URL"] = 
ENDPOINT` (+ dummy + `AWS_SECRET_ACCESS_KEY`/`AWS_ACCESS_KEY_ID`) makes a bare + `create_array("s3://...")` reach moto with no `storage_options`. `clear_instance_cache()` + alone sufficed; the `set_session()`/`skip_instance_cache` dance from `test_fsspec.py` + was not needed for a single fixture. No event-loop or teardown warnings observed. +- **PLAN CORRECTION (important):** the `doctest` hatch env does **NOT** install moto. + `moto[s3,server]` is only in the `remote-tests` dependency group; the `doctest` env + (`pyproject.toml` ~line 277-284) has only `s3fs` + `pytest-examples` as extras. **Task 3 + MUST add `moto[s3,server]` and `requests` to `[tool.hatch.envs.doctest] extra-dependencies`**, + or any moto-backed doctest will silently `importorskip`-skip — defeating the purpose. + Task 4's verification must assert the S3 case **runs**, not skips. + --- ## Task 2: Register the `s3` pytest marker @@ -484,10 +498,23 @@ Change the fence from ```` ```python ```` to: (Body unchanged: `import cupy as cp`, `zarr.config.enable_gpu()`, `create_array("memory://gpu-demo", ...)`, etc.) +> **PLAN CORRECTION (found during execution, commit 010d99a):** a registered marker +> does NOT auto-skip a test under plain `pytest` — markers only *filter* when you pass +> `-m`. Without a guard, the gpu block runs and FAILS with `ModuleNotFoundError: cupy` +> (cupy is darwin-excluded). The repo's real convention is `pytest.importorskip("cupy")` +> in the test body (cf. `tests/conftest.py:183`). So Task 7 also adds to +> `test_documentation_examples` (mirroring the `s3` binding): +> ```python +> if request.node.get_closest_marker("gpu") is not None: +> pytest.importorskip("cupy") +> ``` +> This converts the missing-cupy hard error into a proper SKIP in the default env, while +> `-m gpu` in the `gputest` env still collects+runs it on real hardware. 
+ - [ ] **Step 2: Confirm it is SKIPPED in the default doctest env (no GPU)** -Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[gpu.md:gpu-demo]" -v` -Expected: SKIPPED (the `gpu` marker is deselected without `-m gpu`), **not** an error, **not** absent. +Run: `hatch -e doctest run pytest "tests/test_docs.py::test_documentation_examples[user-guide/gpu.md:gpu-demo]" -v` +Expected: SKIPPED (via `importorskip("cupy")`), **not** an error, **not** absent. - [ ] **Step 3: Confirm it is COLLECTED for the gpu selection** From f90c2b012ef5780d30d1a40f58afb76db8f41432 Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 11:32:09 +0200 Subject: [PATCH 15/25] test: guard that every docs python block is executed or opted out (#4017) Co-Authored-By: Claude Opus 4.8 (1M context) --- tests/test_docs.py | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/tests/test_docs.py b/tests/test_docs.py index 383456a30f..21d9b3f702 100644 --- a/tests/test_docs.py +++ b/tests/test_docs.py @@ -133,6 +133,35 @@ def test_markers_attribute_is_parsed(tmp_path: Path) -> None: assert any(m.name == "s3" for m in marks) +def test_no_unvalidated_blocks() -> None: + """Every python code block in docs/ must declare its validation state: + either exec="true" (it is executed as a test) or exec="false" with a reason + (an explicit, documented opt-out). A bare or mistyped fence (e.g. exec="on") + fails here, so a block can never silently opt out of validation -- the gap + that hid the invalid create_array(mode="w") example in #4016.""" + offenders: list[str] = [] + for example in find_examples(str(DOCS_ROOT)): + rel = Path(example.path).relative_to(DOCS_ROOT) + # docs/superpowers/ holds design-doc caches (plans/specs), not published + # documentation -- it is not in the mkdocs nav -- so its illustrative + # fences are not subject to the execution guard. 
+ if rel.parts and rel.parts[0] == "superpowers": + continue + settings = example.prefix_settings() + exec_val = settings.get("exec") + loc = f"{rel}:{example.start_line}" + if exec_val == "true": + continue + if exec_val == "false" and settings.get("reason", "").strip(): + continue + offenders.append(f"{loc} (exec={exec_val!r}, reason={settings.get('reason')!r})") + + assert not offenders, ( + 'Docs python blocks must be exec="true" or exec="false" with a reason:\n' + + "\n".join(offenders) + ) + + # Get all example sessions @pytest.mark.parametrize("session_key", _session_params(DOCS_ROOT)) def test_documentation_examples( From cfa566542a95f181540d9ea250275db865bad037 Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 12:13:39 +0200 Subject: [PATCH 16/25] docs: separate `test` flag from `exec` so infra-bound examples don't break the build markdown-exec's `exec="true"` means "run at build to render output"; build runners have no GPU/cupy and no moto server, so tagging the GPU/S3 examples exec="true" made `mkdocs build --strict` abort. Introduce a separate `test="true"` flag that our tests/test_docs.py harness keys on (markdown-exec ignores it): a block is validated if exec="true" OR test="true". The GPU and S3 examples become test="true" (+markers) and are no longer run at build. Also: a test="true"-only python fence placed before an exec="true" block of the same page disrupts markdown-exec's build execution of the later block (the quickstart ZipStore example failed with FileNotFoundError). Move the S3 example to the end of quick-start.md so no shared-session exec block follows it; document the constraint in the guard docstring and the design spec. Verified: full docs test suite green (57 passed, 2 skipped), `mkdocs build --strict` exits 0, prek (ruff/mypy/...) clean. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/quick-start.md | 28 +++--- ...2026-05-29-docs-block-validation-design.md | 71 ++++++++++++---- docs/user-guide/gpu.md | 2 +- tests/test_docs.py | 85 +++++++++++-------- 4 files changed, 119 insertions(+), 67 deletions(-) diff --git a/docs/quick-start.md b/docs/quick-start.md index efde321af6..0bad4f2e34 100644 --- a/docs/quick-start.md +++ b/docs/quick-start.md @@ -127,20 +127,6 @@ be done in a separate step. Zarr supports persistent storage to disk or cloud-compatible backends. While examples above utilized a [`zarr.storage.LocalStore`][], a number of other storage options are available. -Zarr integrates seamlessly with cloud object storage such as Amazon S3 and Google Cloud Storage -using external libraries like [s3fs](https://s3fs.readthedocs.io) or -[gcsfs](https://gcsfs.readthedocs.io): - -```python exec="true" session="s3demo" markers="s3" source="above" -import zarr -import numpy as np - -z = zarr.create_array( - "s3://example-bucket/foo", shape=(100, 100), chunks=(10, 10), dtype="f4" -) -z[:, :] = np.random.random((100, 100)) -``` - A single-file store can also be created using the [`zarr.storage.ZipStore`][]: ```python exec="true" session="quickstart" source="above" @@ -175,4 +161,18 @@ z = zarr.open_array(store, mode='r') print(z[:]) ``` +Zarr also integrates seamlessly with cloud object storage such as Amazon S3 and Google +Cloud Storage using external libraries like [s3fs](https://s3fs.readthedocs.io) or +[gcsfs](https://gcsfs.readthedocs.io): + +```python test="true" session="s3demo" markers="s3" source="above" +import zarr +import numpy as np + +z = zarr.create_array( + "s3://example-bucket/foo", shape=(100, 100), chunks=(10, 10), dtype="f4" +) +z[:, :] = np.random.random((100, 100)) +``` + Read more about Zarr's storage options in the [User Guide](user-guide/index.md). 
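The three declared states this commit distinguishes (validated, opted out with a reason, illegal) can be sketched in isolation. The snippet below is a minimal illustration, assuming fence attributes are plain `key="value"` pairs after the language name. `parse_fence_settings` and `validation_state` are hypothetical names that do not appear in the patch; the real harness obtains the settings dict from pytest-examples' `prefix_settings()` rather than parsing the info string itself.

```python
import shlex


def parse_fence_settings(info: str) -> dict[str, str]:
    """Parse a fence info string such as 'python exec="true" session="demo"'.

    Hypothetical helper for illustration only; the harness in this patch
    relies on pytest-examples for this step.
    """
    settings: dict[str, str] = {}
    # shlex honours the double quotes, so reason="two words" stays one token.
    for token in shlex.split(info)[1:]:  # skip the language name
        key, sep, value = token.partition("=")
        if sep:
            settings[key] = value
    return settings


def validation_state(settings: dict[str, str]) -> str:
    """Classify a block as 'validated', 'opted-out', or 'illegal'."""
    # Validated: run at build (exec="true") or by the test harness (test="true").
    if settings.get("exec") == "true" or settings.get("test") == "true":
        return "validated"
    # Opted out: exec="false" is only accepted together with a non-empty reason.
    if settings.get("exec") == "false" and settings.get("reason", "").strip():
        return "opted-out"
    # Anything else (bare fence, exec="on", reasonless opt-out) fails the guard.
    return "illegal"


if __name__ == "__main__":
    for info in (
        'python exec="true" session="quickstart"',
        'python test="true" markers="s3"',
        'python exec="false" reason="REPL output transcript"',
        'python exec="on"',
        "python",
    ):
        print(f"{info!r}: {validation_state(parse_fence_settings(info))}")
```

Under these assumptions, a mistyped `exec="on"` and a bare `python` fence both land in the illegal state, which is the condition the guard test is meant to reject.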
diff --git a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md index 4e8232846f..643c454741 100644 --- a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md +++ b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md @@ -100,13 +100,54 @@ marker-bound execution look hard: - **`tests/test_docs.py` at test time** — the validation. This is pytest, and this is where markers live, where infra fixtures bind, and where env-gating happens. +## Two flags: `exec` (render output) vs `test` (validate) + +A code block can be *run* for two unrelated reasons, and conflating them breaks the +build. They are separate fence attributes: + +- **`exec="true"`** — markdown-exec executes the block **at docs-build time to render its + output** into the published page. This is markdown-exec's own attribute (it hard-codes + the name `exec`, see `markdown_exec/_internal/main.py`), so we cannot rename it. Read it + as *"execute to render output."* +- **`test="true"`** — **our** `tests/test_docs.py` harness executes the block **as a + validation test**. markdown-exec does not recognize `test=` and ignores it. + +Why two: a block that needs special infra to run (GPU/cupy, or S3) must be **validated in +tests** but must **not run at build** — build runners have no GPU and no moto server, so +an `exec="true"` GPU block makes `mkdocs build --strict` abort (`ModuleNotFoundError: +cupy`). Separating the flags lets such a block be `test="true"` (tested) without +`exec="true"` (so it renders as static source at build, never executed there). + +**Harness rule:** a block is collected as a test if `exec="true"` **OR** `test="true"`. +So existing `exec="true"` example blocks stay tested as before (backward-compatible), and +test-only blocks add `test="true"` without `exec`. 
+ +**The combinations:** + +| Block | `exec` | `test` | Effect | +|---|---|---|---| +| Tutorial examples (quickstart, config, …) | `true` | — | Run at build (render output); also tested. | +| GPU / S3 examples | — | `true` | Tested (under markers); rendered static at build. | +| Non-runnable (transcript, include, wrong-import) | `false`+`reason` | — | Neither; explicit reasoned opt-out. | + +**Placement constraint (markdown-exec quirk).** markdown-exec's SuperFences validator +*rejects* a `python` fence that lacks `exec="true"` (returns `False`, so it is not run at +build). A rejected fence positioned **before** an `exec="true"` block of the same page +disrupts markdown-exec's build-time execution of that later block — observed concretely: a +`test="true"` S3 block placed above the quickstart `ZipStore` example made the ZipStore +block fail at build (`FileNotFoundError`, the zip was never written) and `mkdocs build +--strict` aborted. Fix: **a `test="true"`-only block must come last on its page** (or be +the only python block on the page, as on `gpu.md`). The S3 example is therefore placed at +the end of `quick-start.md`. The guard test docstring records this. + ## Marker-bound execution -A block declares the pytest marker it needs via a **fence attribute**, e.g.: +A block declares the pytest marker it needs via a **fence attribute**. 
Marker-bound +blocks are `test="true"` (validated) but **not** `exec="true"` (not build-run), e.g.: ```` -```python exec="true" markers="gpu" -```python exec="true" markers="s3" +```python test="true" markers="gpu" source="above" +```python test="true" markers="s3" source="above" ```` `group_examples_by_session()` parses `markers=` and emits @@ -114,11 +155,12 @@ A block declares the pytest marker it needs via a **fence attribute**, e.g.: whatever that marker means** — and the two markers mean different things, which is the point of unifying the model rather than special-casing each: -- **`gpu` — env-gate.** Default `doctest` env runs `pytest` → the gpu-marked param is - **skipped/deselected**, exactly like every other `gpu`-marked test in `tests/`. The - `gputest` env runs `pytest -m gpu` → the param **executes** against real cupy. Reuses - the existing `gpu` marker (`pyproject.toml` `markers` table) and `pytest -m gpu` - selection — no new harness concept. +- **`gpu` — env-gate.** A registered marker does **not** auto-skip under plain `pytest` + (markers only *filter* when you pass `-m`). The repo's convention is + `pytest.importorskip("cupy")` in the test body (cf. `tests/conftest.py`), so the harness + calls `importorskip("cupy")` for gpu-marked docs cases: in the default `doctest` env the + case is **skipped** (no cupy), and `pytest -m gpu` in the `gputest` env runs it on real + cupy. The block is `test="true"` (not `exec="true"`), so it is never run at build. - **`s3` — infra-binding.** A new `s3` marker (must be registered in the `markers` table). An autouse-style fixture keyed on the marker stands up the `moto` server and @@ -135,16 +177,15 @@ hardware env, s3 → fixture), not in the declaration mechanism. ## Components & data flow **`docs/` markdown** — source of truth. Each python block is in one of three declared -states: +states (see the two-flags table above): -1. `exec="true"` (optionally `+ markers=""`) — executed as a test. -2. 
explicit opt-out marker **with a reason** — deliberately not executed. +1. `exec="true"` and/or `test="true"` (optionally `+ markers=""`) — validated, by + build-render and/or by the test harness. +2. `exec="false"` with a `reason="..."` — explicit, documented opt-out. 3. anything else (bare, `exec="on"`, …) — **illegal**, fails the guard. -The exact spelling of the opt-out marker (e.g. `exec="false"` plus a `reason="..."` -attribute, versus a dedicated sentinel attribute) is an implementation-plan decision. -Requirement: it must be explicit, greppable, carry a human-readable reason, and be a -form `markdown-exec` will not execute at build time. +The opt-out form is `exec="false" reason="..."`: explicit, greppable, carries a +human-readable reason, and is not executed by markdown-exec at build time. **`tests/test_docs.py`** — already-parametrized pytest harness. Changes: diff --git a/docs/user-guide/gpu.md b/docs/user-guide/gpu.md index 1dc3ef296b..6c26c3e564 100644 --- a/docs/user-guide/gpu.md +++ b/docs/user-guide/gpu.md @@ -16,7 +16,7 @@ Zarr can use GPUs to accelerate your workload by running `zarr.Config.enable_gpu [`zarr.config`][] configures Zarr to use GPU memory for the data buffers used internally by Zarr via `enable_gpu()`. -```python exec="true" session="gpu-demo" markers="gpu" source="above" +```python test="true" session="gpu-demo" markers="gpu" source="above" import zarr import cupy as cp zarr.config.enable_gpu() diff --git a/tests/test_docs.py b/tests/test_docs.py index 21d9b3f702..28bf2b9f6c 100644 --- a/tests/test_docs.py +++ b/tests/test_docs.py @@ -1,8 +1,12 @@ """ Tests for executable code blocks in markdown documentation. -This module uses pytest-examples to validate that all Python code examples -with exec="true" in the documentation execute successfully. +This module uses pytest-examples to validate Python code examples in the docs. 
A block is +validated if it renders output at build (exec="true") or is explicitly marked for testing +(test="true"); see the two-flags discussion in +docs/superpowers/specs/2026-05-29-docs-block-validation-design.md. The test_no_unvalidated_blocks +guard ensures every python block declares one of those, or an explicit exec="false" opt-out +with a reason, so a block can never silently skip validation. """ from __future__ import annotations @@ -20,27 +24,10 @@ if TYPE_CHECKING: from collections.abc import Generator -# Find all markdown files with executable code blocks DOCS_ROOT = Path(__file__).parent.parent / "docs" SOURCES_ROOT = Path(__file__).parent.parent / "src" / "zarr" -def find_markdown_files_with_exec() -> list[Path]: - """Find all markdown files containing exec="true" code blocks.""" - markdown_files = [] - - for md_file in DOCS_ROOT.rglob("*.md"): - try: - content = md_file.read_text(encoding="utf-8") - if 'exec="true"' in content: - markdown_files.append(md_file) - except Exception: - # Skip files that can't be read - continue - - return sorted(markdown_files) - - def name_example(path: str, session: str) -> str: """Generate a readable name for a test case from file path and session.""" file = Path(path) @@ -59,15 +46,25 @@ def _markers_for(settings: dict[str, str]) -> list[pytest.MarkDecorator]: return [getattr(pytest.mark, name) for name in raw.split() if name] +def _is_tested(settings: dict[str, str]) -> bool: + """A block is validated by our pytest harness if it is run at build to render output + (exec="true") OR explicitly marked for testing (test="true"). The two flags are + separate on purpose: exec= drives markdown-exec's build-time rendering, while test= + lets a block be validated without being run at build (e.g. 
gpu/s3 blocks, which the + build environment cannot run).""" + return settings.get("exec") == "true" or settings.get("test") == "true" + + def _session_params(root: Path) -> list[Any]: - """Group exec="true" examples by (file, session) and emit one pytest.param per - session, carrying the union of markers declared by that session's blocks.""" + """Group tested examples (exec="true" or test="true") by (file, session) and emit one + pytest.param per session, carrying the union of markers declared by that session's + blocks.""" sessions: defaultdict[tuple[str, str], list[CodeExample]] = defaultdict(list) marks_by_session: defaultdict[tuple[str, str], set[str]] = defaultdict(set) for example in find_examples(str(root)): settings = example.prefix_settings() - if settings.get("exec") != "true": + if not _is_tested(settings): continue session_name = settings.get("session", "_default") key = (str(example.path), session_name) @@ -120,11 +117,13 @@ def docs_s3_backend() -> Generator[None, None, None]: def test_markers_attribute_is_parsed(tmp_path: Path) -> None: - """A block tagged markers="s3" must surface that marker on its parametrized case, - so pytest can gate/bind it (e.g. attach the moto fixture).""" + """A test="true" block tagged markers="s3" must surface that marker on its + parametrized case, so pytest can gate/bind it (e.g. attach the moto fixture). 
+ Uses test="true" (not exec="true") because marker-bound blocks are validated by the + harness without being run at build time.""" md = tmp_path / "ex.md" md.write_text( - '```python exec="true" session="demo" markers="s3"\nimport zarr\n```\n', + '```python test="true" session="demo" markers="s3"\nimport zarr\n```\n', encoding="utf-8", ) params = _session_params(md.parent) @@ -134,11 +133,16 @@ def test_markers_attribute_is_parsed(tmp_path: Path) -> None: def test_no_unvalidated_blocks() -> None: - """Every python code block in docs/ must declare its validation state: - either exec="true" (it is executed as a test) or exec="false" with a reason - (an explicit, documented opt-out). A bare or mistyped fence (e.g. exec="on") - fails here, so a block can never silently opt out of validation -- the gap - that hid the invalid create_array(mode="w") example in #4016.""" + """Every python code block in docs/ must declare its validation state: exec="true" + (run at build to render output), test="true" (validated by this harness without being + run at build), or exec="false" with a reason (explicit, documented opt-out). A bare or + mistyped fence (e.g. exec="on") fails here, so a block can never silently opt out of + validation -- the gap that hid the invalid create_array(mode="w") example in #4016. + + Note on placement: a test="true"-only block (which markdown-exec does not execute) + must not sit *before* an exec="true" block of the same page's session, or it disrupts + markdown-exec's build-time execution of the later block. 
Keep test-only blocks last on + the page (or on a page where they are the only python block, like gpu.md).""" offenders: list[str] = [] for example in find_examples(str(DOCS_ROOT)): rel = Path(example.path).relative_to(DOCS_ROOT) @@ -150,15 +154,21 @@ def test_no_unvalidated_blocks() -> None: settings = example.prefix_settings() exec_val = settings.get("exec") loc = f"{rel}:{example.start_line}" - if exec_val == "true": + # Validated either by build-render (exec="true") or by the test harness + # (test="true"). + if _is_tested(settings): continue + # Explicit, documented opt-out from execution. if exec_val == "false" and settings.get("reason", "").strip(): continue - offenders.append(f"{loc} (exec={exec_val!r}, reason={settings.get('reason')!r})") + offenders.append( + f"{loc} (exec={exec_val!r}, test={settings.get('test')!r}, " + f"reason={settings.get('reason')!r})" + ) assert not offenders, ( - 'Docs python blocks must be exec="true" or exec="false" with a reason:\n' - + "\n".join(offenders) + 'Docs python blocks must be exec="true", test="true", or exec="false" with a ' + "reason:\n" + "\n".join(offenders) ) @@ -170,14 +180,15 @@ def test_documentation_examples( request: pytest.FixtureRequest, ) -> None: """ - Test that all exec="true" code examples in documentation execute successfully. + Test that all validated code examples (exec="true" or test="true") in documentation + execute successfully. This test groups examples by session (file + session name) and runs them sequentially in the same execution context, allowing code to build on previous examples. 
This test uses pytest-examples to: - - Find all code examples with exec="true" in markdown files + - Find all code examples marked exec="true" or test="true" in markdown files - Group them by session - Execute them in order within the same context - Verify no exceptions are raised @@ -195,7 +206,7 @@ def test_documentation_examples( examples = [] for example in all_examples: settings = example.prefix_settings() - if settings.get("exec") != "true": + if not _is_tested(settings): continue if str(example.path) == file_path and settings.get("session", "_default") == session_name: examples.append(example) From 9c74830f1107730a034aa5cc8e84f303d1c59209 Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 12:13:59 +0200 Subject: [PATCH 17/25] docs: add news fragment for docs-block validation (#4016, #4017) Co-Authored-By: Claude Opus 4.8 (1M context) --- changes/4016.bugfix.md | 1 + 1 file changed, 1 insertion(+) create mode 100644 changes/4016.bugfix.md diff --git a/changes/4016.bugfix.md b/changes/4016.bugfix.md new file mode 100644 index 0000000000..01984110f7 --- /dev/null +++ b/changes/4016.bugfix.md @@ -0,0 +1 @@ +Fixed an invalid `zarr.create_array` example in the quick-start documentation (it passed an unsupported `mode` argument) and made the cloud-storage example execute against a mock S3 backend in CI. Added a test ensuring every Python code block in the documentation is either executed or explicitly opted out with a documented reason, so an invalid example can no longer go untested. From 2ad7474130957df8a68d69847caedd3b400b6ae2 Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 12:25:05 +0200 Subject: [PATCH 18/25] test: harden docs_s3_backend teardown and make cli example idempotent Address roborev branch-review findings (job 186): - Medium: moto-api reset POST ran before server.stop()/env restore in the finally block; if it raised, the fixed-port server thread leaked and AWS_ENDPOINT_URL was left dangling. 
Nest the reset in its own try/finally so server.stop() and env restoration always run. - Low: f"{S3_ENDPOINT}/moto-api/reset" double-slashed (constant ends in /); drop the extra slash. - Low: AWS_SECRET_ACCESS_KEY/AWS_ACCESS_KEY_ID were setdefault'd but never restored; save/restore all three mutated env vars uniformly. - Low: cli.md create_array lacked overwrite=True, non-idempotent across local runs; add it. Verified: full docs suite green (57 passed, 2 skipped), s3+cli pass on repeated runs, mkdocs build --strict exits 0, prek clean. Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/user-guide/cli.md | 2 +- tests/test_docs.py | 22 +++++++++++++++------- 2 files changed, 16 insertions(+), 8 deletions(-) diff --git a/docs/user-guide/cli.md b/docs/user-guide/cli.md index 743392f679..13fcb6f1b6 100644 --- a/docs/user-guide/cli.md +++ b/docs/user-guide/cli.md @@ -49,7 +49,7 @@ To open the array/group using the new metadata use: import zarr # create a small array to open (stands in for the migrated store) -zarr.create_array("data/cli-demo.zarr", shape=(4, 4), chunks=(2, 2), dtype="i4") +zarr.create_array("data/cli-demo.zarr", shape=(4, 4), chunks=(2, 2), dtype="i4", overwrite=True) zarr_with_v3_metadata = zarr.open("data/cli-demo.zarr", zarr_format=3) ``` diff --git a/tests/test_docs.py b/tests/test_docs.py index 28bf2b9f6c..11ba8836ec 100644 --- a/tests/test_docs.py +++ b/tests/test_docs.py @@ -93,7 +93,9 @@ def docs_s3_backend() -> Generator[None, None, None]: botocore = pytest.importorskip("botocore") requests = pytest.importorskip("requests") - prev_endpoint = os.environ.get("AWS_ENDPOINT_URL") + # Save every env var we mutate so teardown can restore the prior process state. 
+ env_keys = ("AWS_ENDPOINT_URL", "AWS_SECRET_ACCESS_KEY", "AWS_ACCESS_KEY_ID") + prev_env = {key: os.environ.get(key) for key in env_keys} server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=S3_PORT) server.start() try: @@ -108,12 +110,18 @@ def docs_s3_backend() -> Generator[None, None, None]: s3fs.S3FileSystem.clear_instance_cache() yield finally: - requests.post(f"{S3_ENDPOINT}/moto-api/reset") - if prev_endpoint is None: - os.environ.pop("AWS_ENDPOINT_URL", None) - else: - os.environ["AWS_ENDPOINT_URL"] = prev_endpoint - server.stop() + # Cleanup must always run, even if the moto-api reset POST fails: stopping the + # server frees the fixed port and restoring the env avoids leaking state (and a + # stale AWS_ENDPOINT_URL) into the rest of the session. + try: + requests.post(f"{S3_ENDPOINT}moto-api/reset") + finally: + for key, value in prev_env.items(): + if value is None: + os.environ.pop(key, None) + else: + os.environ[key] = value + server.stop() def test_markers_attribute_is_parsed(tmp_path: Path) -> None: From 31fa4d7b7c1aaa204b516b7aa6d2098e88ace158 Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 12:36:06 +0200 Subject: [PATCH 19/25] test: actually run the gpu docs example on GPU; align collector/guard scope Address roborev branch-review findings (job 188): - Medium: the gpu docs example ran in NO environment. The gputest env lacks pytest-examples, so test_docs.py's module-level importorskip("pytest_examples") skipped the whole module under `pytest -m gpu` -- the gpu case was never collected even on GPU hardware, yet the guard reported it "validated" via test="true". Add pytest-examples to the gputest env; confirmed gpu-demo now collects under `hatch -e gputest run pytest -m gpu --co`. - Low: _session_params (collection) didn't exclude docs/superpowers/ while the guard did -- an asymmetry that could run a cache-doc block as a real test without the guard flagging it. 
Extract a shared _is_published_docs() helper used by both, so collection and guard agree on scope. Verified: doctest suite green (57 passed, 2 skipped), gpu-demo collectable in gputest, mkdocs build --strict exits 0, prek clean. Co-Authored-By: Claude Opus 4.8 (1M context) --- pyproject.toml | 4 ++++ tests/test_docs.py | 22 +++++++++++++++++----- 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/pyproject.toml b/pyproject.toml index bc95bfd61b..92312d2630 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -204,6 +204,10 @@ list-env = "pip list" template = "test" extra-dependencies = [ "universal_pathlib", + # Needed so tests/test_docs.py is collectable under `pytest -m gpu`; otherwise its + # module-level importorskip("pytest_examples") skips the whole module and the gpu + # docs example is never executed on GPU hardware. + "pytest-examples", ] features = ["gpu"] diff --git a/tests/test_docs.py b/tests/test_docs.py index 11ba8836ec..07d7db94e2 100644 --- a/tests/test_docs.py +++ b/tests/test_docs.py @@ -55,6 +55,19 @@ def _is_tested(settings: dict[str, str]) -> bool: return settings.get("exec") == "true" or settings.get("test") == "true" +def _is_published_docs(path: str) -> bool: + """Whether a code example belongs to the published documentation. docs/superpowers/ + holds design-doc caches (plans/specs) that are not in the mkdocs nav; both the test + collector and the guard exclude it so they agree on what counts as real docs.""" + try: + rel = Path(path).relative_to(DOCS_ROOT) + except ValueError: + # Path is outside DOCS_ROOT (e.g. a tmp_path fixture in unit tests); treat it as + # in-scope so such tests exercise the normal path. 
+ return True + return not (rel.parts and rel.parts[0] == "superpowers") + + def _session_params(root: Path) -> list[Any]: """Group tested examples (exec="true" or test="true") by (file, session) and emit one pytest.param per session, carrying the union of markers declared by that session's @@ -63,6 +76,8 @@ def _session_params(root: Path) -> list[Any]: marks_by_session: defaultdict[tuple[str, str], set[str]] = defaultdict(set) for example in find_examples(str(root)): + if not _is_published_docs(str(example.path)): + continue settings = example.prefix_settings() if not _is_tested(settings): continue @@ -153,12 +168,9 @@ def test_no_unvalidated_blocks() -> None: the page (or on a page where they are the only python block, like gpu.md).""" offenders: list[str] = [] for example in find_examples(str(DOCS_ROOT)): - rel = Path(example.path).relative_to(DOCS_ROOT) - # docs/superpowers/ holds design-doc caches (plans/specs), not published - # documentation -- it is not in the mkdocs nav -- so its illustrative - # fences are not subject to the execution guard. - if rel.parts and rel.parts[0] == "superpowers": + if not _is_published_docs(str(example.path)): continue + rel = Path(example.path).relative_to(DOCS_ROOT) settings = example.prefix_settings() exec_val = settings.get("exec") loc = f"{rel}:{example.start_line}" From 9f837f883a0e2386f15d29097b9b50408a853b9e Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 13:42:58 +0200 Subject: [PATCH 20/25] test: enforce test-only block placement; drop redundant marker round-trip Address roborev branch-review findings (job 190): - Low: the "test=true-only block must not precede an exec=true block on the same page" constraint was documented but unenforced (only caught by mkdocs --strict). Add test_test_only_blocks_come_last, which fails fast/locally when a test-only block has a smaller start_line than a later same-file exec=true block. 
Negative-checked: it catches an exec block added after the gpu test-only block. - Low: _markers_for built pytest.MarkDecorator objects that _session_params immediately reduced to .name and rebuilt; replace with _marker_names returning raw strings, building decorators once at param time. Verified: doctest suite green (58 passed, 2 skipped), placement test passes and fails on a planted violation, prek clean, mkdocs build --strict exits 0. Co-Authored-By: Claude Opus 4.8 (1M context) --- tests/test_docs.py | 53 +++++++++++++++++++++++++++++++++++++--------- 1 file changed, 43 insertions(+), 10 deletions(-) diff --git a/tests/test_docs.py b/tests/test_docs.py index 07d7db94e2..31886386da 100644 --- a/tests/test_docs.py +++ b/tests/test_docs.py @@ -40,10 +40,9 @@ def name_example(path: str, session: str) -> str: return f"{file}:{session}" -def _markers_for(settings: dict[str, str]) -> list[pytest.MarkDecorator]: - """Translate a block's markers="a b" attribute into pytest mark decorators.""" - raw = settings.get("markers", "") - return [getattr(pytest.mark, name) for name in raw.split() if name] +def _marker_names(settings: dict[str, str]) -> list[str]: + """Parse a block's markers="a b" attribute into a list of marker names.""" + return [name for name in settings.get("markers", "").split() if name] def _is_tested(settings: dict[str, str]) -> bool: @@ -84,8 +83,7 @@ def _session_params(root: Path) -> list[Any]: session_name = settings.get("session", "_default") key = (str(example.path), session_name) sessions[key].append(example) - for mark in _markers_for(settings): - marks_by_session[key].add(mark.name) + marks_by_session[key].update(_marker_names(settings)) params = [] for key in sorted(sessions.keys(), key=lambda x: (x[0], x[1])): @@ -162,10 +160,7 @@ def test_no_unvalidated_blocks() -> None: mistyped fence (e.g. exec="on") fails here, so a block can never silently opt out of validation -- the gap that hid the invalid create_array(mode="w") example in #4016. 
- Note on placement: a test="true"-only block (which markdown-exec does not execute) - must not sit *before* an exec="true" block of the same page's session, or it disrupts - markdown-exec's build-time execution of the later block. Keep test-only blocks last on - the page (or on a page where they are the only python block, like gpu.md).""" + A separate placement constraint is enforced by test_test_only_blocks_come_last.""" offenders: list[str] = [] for example in find_examples(str(DOCS_ROOT)): if not _is_published_docs(str(example.path)): @@ -192,6 +187,44 @@ def test_no_unvalidated_blocks() -> None: ) +def test_test_only_blocks_come_last() -> None: + """A test="true"-only block (one markdown-exec does not execute, because it lacks + exec="true") must not precede an exec="true" block in the same file. markdown-exec's + SuperFences validator rejects the unexecuted python fence, which disrupts its + build-time execution of any later exec="true" block on the page (observed: the + quickstart ZipStore example failed with FileNotFoundError, aborting `mkdocs build + --strict`). Enforcing the ordering here turns that build-only failure into a fast, + local unit failure.""" + # Collect, per published-docs file, the start lines of test-only and exec blocks. 
+ test_only: defaultdict[str, list[int]] = defaultdict(list) + exec_lines: defaultdict[str, list[int]] = defaultdict(list) + for example in find_examples(str(DOCS_ROOT)): + if not _is_published_docs(str(example.path)): + continue + settings = example.prefix_settings() + path = str(example.path) + if settings.get("exec") == "true": + exec_lines[path].append(example.start_line) + elif settings.get("test") == "true": + test_only[path].append(example.start_line) + + offenders: list[str] = [] + for path, only_lines in test_only.items(): + rel = Path(path).relative_to(DOCS_ROOT) + last_exec = max(exec_lines.get(path, [0])) + offenders.extend( + f'{rel}:{line} (test="true" block precedes an exec="true" block at line {last_exec})' + for line in only_lines + if line < last_exec + ) + + assert not offenders, ( + 'A test="true"-only block must come after every exec="true" block in the same ' + "file (markdown-exec executes the later block at build time and a preceding " + "unexecuted python fence breaks it):\n" + "\n".join(offenders) + ) + + # Get all example sessions @pytest.mark.parametrize("session_key", _session_params(DOCS_ROOT)) def test_documentation_examples( From f936fba0d6c8674b4f5562ff172ea5a09ae9463e Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 15:12:13 +0200 Subject: [PATCH 21/25] ci: make docs build strict; correct placement-hazard scope Address roborev branch-review finding (job 192): the placement guard's docstring claimed any non-exec="true" python fence before an exec="true" block breaks the build, but the guard only checked test="true" blocks, leaving exec="false"-before-exec="true" arrangements (data_types.md, performance.md) unguarded. 
Investigation (experiments + markdown-exec's SuperFences integration) established the real mechanism: a non-executed python fence (test="true" OR exec="false") before an exec="true" block disrupts build-time execution of a later *state-dependent* block (needs a cross-block dependency to surface, which is why the standalone exec="true" blocks under those exec="false" opt-outs build fine). CI ran non-strict `mkdocs build`, so such a failure would have merged as a silent warning. - ci/docs.yml: `mkdocs build` -> `mkdocs build --strict` so any build-time exec failure (including the exec="false" case) fails CI authoritatively. Verified: a clean build currently emits zero warnings, so --strict passes today. - Narrow the placement guard's docstring and the spec to the real (state-dependent) mechanism, and frame test_test_only_blocks_come_last as a conservative fast-feedback convention with --strict as the authoritative check -- no longer over-claiming. Verified: docs suite green (58 passed, 2 skipped), `mkdocs build --strict` exits 0, prek clean. Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/docs.yml | 5 ++- ...2026-05-29-docs-block-validation-design.md | 28 +++++++++++------ tests/test_docs.py | 31 +++++++++++++------ 3 files changed, 45 insertions(+), 19 deletions(-) diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index 6515a2c4c2..fb5487ada4 100644 --- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -24,7 +24,10 @@ jobs: persist-credentials: false - uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0 - run: uv sync --group docs - - run: uv run mkdocs build + # --strict turns warnings into errors, so a docs code block that fails to execute + # at build time (e.g. a non-exec python fence disrupting a later exec="true" block) + # fails CI instead of merging as a silent warning. 
+ - run: uv run mkdocs build --strict env: DISABLE_MKDOCS_2_WARNING: "true" NO_MKDOCS_2_WARNING: "true" diff --git a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md index 643c454741..cbbe689994 100644 --- a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md +++ b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md @@ -130,15 +130,25 @@ test-only blocks add `test="true"` without `exec`. | GPU / S3 examples | — | `true` | Tested (under markers); rendered static at build. | | Non-runnable (transcript, include, wrong-import) | `false`+`reason` | — | Neither; explicit reasoned opt-out. | -**Placement constraint (markdown-exec quirk).** markdown-exec's SuperFences validator -*rejects* a `python` fence that lacks `exec="true"` (returns `False`, so it is not run at -build). A rejected fence positioned **before** an `exec="true"` block of the same page -disrupts markdown-exec's build-time execution of that later block — observed concretely: a -`test="true"` S3 block placed above the quickstart `ZipStore` example made the ZipStore -block fail at build (`FileNotFoundError`, the zip was never written) and `mkdocs build ---strict` aborted. Fix: **a `test="true"`-only block must come last on its page** (or be -the only python block on the page, as on `gpu.md`). The S3 example is therefore placed at -the end of `quick-start.md`. The guard test docstring records this. +**Placement constraint (markdown-exec quirk).** markdown-exec registers a SuperFences +custom fence for `python`; its validator *rejects* any fence lacking `exec="true"` +(`exec="false"` and `test="true"` alike — both are "not executed at build"). Established by +experiment: a rejected python fence positioned **before** an `exec="true"` block disrupts +markdown-exec's build-time execution of a **later, state-dependent** block (regardless of +session). 
Observed concretely: any non-exec python fence inserted before the quickstart +`ZipStore` write/read pair made the read block fail (`FileNotFoundError` — the write never +took effect) and `mkdocs build --strict` aborted. The effect only surfaces with a +cross-block dependency, so it does **not** affect the standalone `exec="true"` blocks in +`data_types.md`/`performance.md` that already carry `exec="false"` opt-out blocks above +them. + +Because we cannot statically tell which later blocks are state-dependent, the response is +twofold: (1) **a `test="true"`-only block must come last on its page** (or be the only +python block, as on `gpu.md`) — a conservative convention enforced by +`test_test_only_blocks_come_last` for the blocks we author this way; and (2) the +**authoritative** build-hazard check is `mkdocs build --strict` (the `docs:check` CI job), +which catches the `exec="false"` case too. The S3 example is placed at the end of +`quick-start.md` accordingly. ## Marker-bound execution diff --git a/tests/test_docs.py b/tests/test_docs.py index 31886386da..9561d0ad05 100644 --- a/tests/test_docs.py +++ b/tests/test_docs.py @@ -188,13 +188,25 @@ def test_no_unvalidated_blocks() -> None: def test_test_only_blocks_come_last() -> None: - """A test="true"-only block (one markdown-exec does not execute, because it lacks - exec="true") must not precede an exec="true" block in the same file. markdown-exec's - SuperFences validator rejects the unexecuted python fence, which disrupts its - build-time execution of any later exec="true" block on the page (observed: the - quickstart ZipStore example failed with FileNotFoundError, aborting `mkdocs build - --strict`). Enforcing the ordering here turns that build-only failure into a fast, - local unit failure.""" + """A conservative placement convention: a test="true"-only block must come after every + exec="true" block in the same file. 
+ + Mechanism (established by experiment + markdown-exec's SuperFences integration): a + python fence that markdown-exec does not execute -- i.e. one lacking exec="true", + whether test="true" or exec="false" -- placed before an exec="true" block disrupts + markdown-exec's build-time execution of a *later, state-dependent* block. Observed: a + non-exec python fence inserted before the quickstart ZipStore write/read pair made the + read block fail with FileNotFoundError (the write never took effect), aborting + `mkdocs build --strict`. The effect needs a cross-block dependency to surface, so it + does not affect the standalone exec="true" blocks in e.g. data_types.md/performance.md + that already have exec="false" opt-out blocks above them. + + Because we cannot statically tell which later blocks are state-dependent, this guard + enforces the simple, safe convention only for the blocks we author this way + (test="true" marker-bound examples like s3/gpu). It is NOT a complete build-hazard + check -- the authoritative check is `mkdocs build --strict` (the docs:check CI job), + which catches the exec="false" case too. This guard just turns the common test-only + case into a fast, local failure.""" # Collect, per published-docs file, the start lines of test-only and exec blocks. 
test_only: defaultdict[str, list[int]] = defaultdict(list) exec_lines: defaultdict[str, list[int]] = defaultdict(list) @@ -220,8 +232,9 @@ def test_test_only_blocks_come_last() -> None: assert not offenders, ( 'A test="true"-only block must come after every exec="true" block in the same ' - "file (markdown-exec executes the later block at build time and a preceding " - "unexecuted python fence breaks it):\n" + "\n".join(offenders) + 'file: a non-executed python fence before an exec="true" block can disrupt ' + "markdown-exec's build-time execution of a later state-dependent block (see this " + "test's docstring):\n" + "\n".join(offenders) ) From ee82f5e34b1a72ebf3bf4d6e6a4833a7ff124fed Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 15:27:17 +0200 Subject: [PATCH 22/25] docs: remove design spec/plan caches from version control The spec and implementation plan under docs/superpowers/ were working artifacts, not published documentation (they were never in the mkdocs nav). The spec is preserved in a public gist; the plan is a local execution record. Remove both from the repo and drop the now-stale spec path from the test_docs.py module docstring. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- .../plans/2026-05-29-docs-block-validation.md | 749 ------------------ ...2026-05-29-docs-block-validation-design.md | 260 ------ tests/test_docs.py | 5 +- 3 files changed, 3 insertions(+), 1011 deletions(-) delete mode 100644 docs/superpowers/plans/2026-05-29-docs-block-validation.md delete mode 100644 docs/superpowers/specs/2026-05-29-docs-block-validation-design.md diff --git a/docs/superpowers/plans/2026-05-29-docs-block-validation.md b/docs/superpowers/plans/2026-05-29-docs-block-validation.md deleted file mode 100644 index 63ab0b696d..0000000000 --- a/docs/superpowers/plans/2026-05-29-docs-block-validation.md +++ /dev/null @@ -1,749 +0,0 @@ -# Docs Block Validation Implementation Plan - -> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. - -**Goal:** Make every python code block in `docs/` either execute (and thus get validated) or explicitly opt out with a documented reason, and add a guard test so a block can never again silently opt out of validation. - -**Architecture:** The doctests in `tests/test_docs.py` are already parametrized pytest tests. We (1) teach the parametrizer to read a `markers="..."` fence attribute and attach the matching pytest marker to each session's `pytest.param`, (2) add an `s3` marker bound to a `moto` mock-S3 fixture so the S3 example runs in the default doctest env, (3) reuse the existing `gpu` marker for the GPU block, (4) remediate the 12 currently-unexecuted blocks per-case, and (5) add a guard test asserting every docs python block is `exec="true"` or explicitly opted out with a reason. - -**Tech Stack:** pytest, pytest-examples, markdown-exec (mkdocs), moto[s3,server], s3fs, hatch envs (`doctest`, `gputest`). 
- -**Upstream:** Fixes [#4016](https://github.com/zarr-developers/zarr-python/issues/4016); implements the guard from [#4017](https://github.com/zarr-developers/zarr-python/issues/4017). Design spec: `docs/superpowers/specs/2026-05-29-docs-block-validation-design.md`. - ---- - -## File Structure - -- `tests/test_docs.py` — **modify.** Add `markers=` parsing in `group_examples_by_session()`, an `s3` fixture + marker-binding, and the new `test_no_unvalidated_blocks` guard test. -- `pyproject.toml` — **modify.** Register the `s3` marker in `[tool.pytest.ini_options] markers`. -- `docs/quick-start.md` — **modify.** S3 block: fix `mode="w"`, add `markers="s3"`, make it executable. -- `docs/user-guide/performance.md` — **modify.** Turn on the two config-only blocks; opt out (or fix) the dask block. -- `docs/user-guide/arrays.md` — **modify.** Turn on the config block. -- `docs/user-guide/cli.md` — **modify.** Make the `zarr.open` block runnable or opt it out. -- `docs/user-guide/gpu.md` — **modify.** Add `exec="true" markers="gpu"`. -- `docs/contributing.md` — **modify.** Fix `exec="on"` typo; opt out the pseudocode block. -- `docs/user-guide/data_types.md` — **modify.** Opt out the REPL-transcript block. -- `docs/user-guide/examples/custom_dtype.md` — **modify.** Opt out the `--8<--` include block. -- `docs/user-guide/v3_migration.md` — **modify.** Opt out the intentionally-wrong-import block. -- `changes/4016.bugfix.md` — **create.** Towncrier news fragment. - -### Opt-out convention (decided here, used throughout) - -A block that must not execute is tagged: - -```` -```python exec="false" reason="" -```` - -- `exec="false"` is an explicit, greppable opt-out that `markdown-exec` will **not** execute (only `exec="true"` triggers execution). -- `reason="..."` documents *why*. The guard test requires it on any non-`exec="true"` block. - ---- - -## Task 1: Spike — can the `s3` fixture provide a default endpoint with no `storage_options`? 
- -This is the load-bearing unknown. The existing S3 tests always pass `endpoint_url` explicitly via `client_kwargs`/`storage_options` (`tests/test_store/test_fsspec.py:109-116, 131`). The docs block must read clean — `zarr.create_array("s3://...")` with **no** `storage_options`. We must confirm a process-wide default endpoint works before writing the real fixture. - -**Files:** -- Test (scratch): `tests/test_docs_s3_spike.py` (deleted at end of task) - -- [ ] **Step 1: Write a scratch test that starts moto, sets a default endpoint via env, and creates an array with a bare `s3://` URL** - -```python -# tests/test_docs_s3_spike.py -import os - -import pytest - -moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server") -pytest.importorskip("s3fs") -botocore = pytest.importorskip("botocore") -requests = pytest.importorskip("requests") - -PORT = 5556 # different from test_fsspec.py's 5555 to avoid collisions -ENDPOINT = f"http://127.0.0.1:{PORT}/" - - -def test_bare_s3_url_with_default_endpoint() -> None: - """A create_array('s3://...') call with no storage_options should reach a - moto server when the endpoint is configured process-wide (env var).""" - server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=PORT) - server.start() - try: - os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo") - os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo") - # Candidate mechanism A: aiobotocore/botocore honors AWS_ENDPOINT_URL - os.environ["AWS_ENDPOINT_URL"] = ENDPOINT - - # create the bucket via boto3 sync client - session = botocore.session.Session() - client = session.create_client("s3", endpoint_url=ENDPOINT, region_name="us-east-1") - client.create_bucket(Bucket="docs-bucket") - client.close() - - import s3fs - - import zarr - - s3fs.S3FileSystem.clear_instance_cache() - z = zarr.create_array( - "s3://docs-bucket/foo", shape=(8, 8), chunks=(4, 4), dtype="f4" - ) - z[:, :] = 1.0 - assert z[0, 0] == 1.0 - finally: - 
requests.post(f"{ENDPOINT}/moto-api/reset") - server.stop() -``` - -- [ ] **Step 2: Run the spike** - -Run: `hatch run doctest:test tests/test_docs_s3_spike.py -v` (or `uv run pytest tests/test_docs_s3_spike.py -v` inside the doctest env) -Expected: **One of two outcomes** — record which: -- **PASS** → `AWS_ENDPOINT_URL` works as a process-wide default. Use env-var mechanism in Task 3. -- **FAIL** (connection refused / NoCredentials / hits real AWS) → env var insufficient. Try candidate B below. - -- [ ] **Step 3: If Step 2 failed, try fsspec default config** - -Replace the `AWS_ENDPOINT_URL` line with: - -```python - import fsspec - - fsspec.config.conf["s3"] = {"client_kwargs": {"endpoint_url": ENDPOINT}, "anon": False} -``` - -Run: `hatch run doctest:test tests/test_docs_s3_spike.py -v` -Expected: PASS → use `fsspec.config.conf` mechanism in Task 3. - -- [ ] **Step 4: If both failed, record the fallback decision** - -If neither bare-URL mechanism works, the visible block will show `storage_options={"endpoint_url": ...}` honestly (spec fallback for spike #1). Note which mechanism (env var, fsspec config, or fallback) won, in the commit message — Task 3 depends on it. - -- [ ] **Step 5: Delete the scratch test and commit the finding** - -```bash -git rm tests/test_docs_s3_spike.py -git commit -m "test: spike s3 default-endpoint mechanism for docs (no storage_options) - -Result: " -``` - -### RESULT (completed 2026-05-29, commit 460385d) - -- **Mechanism A won:** `os.environ["AWS_ENDPOINT_URL"] = ENDPOINT` (+ dummy - `AWS_SECRET_ACCESS_KEY`/`AWS_ACCESS_KEY_ID`) makes a bare - `create_array("s3://...")` reach moto with no `storage_options`. `clear_instance_cache()` - alone sufficed; the `set_session()`/`skip_instance_cache` dance from `test_fsspec.py` - was not needed for a single fixture. No event-loop or teardown warnings observed. -- **PLAN CORRECTION (important):** the `doctest` hatch env does **NOT** install moto. 
- `moto[s3,server]` is only in the `remote-tests` dependency group; the `doctest` env - (`pyproject.toml` ~line 277-284) has only `s3fs` + `pytest-examples` as extras. **Task 3 - MUST add `moto[s3,server]` and `requests` to `[tool.hatch.envs.doctest] extra-dependencies`**, - or any moto-backed doctest will silently `importorskip`-skip — defeating the purpose. - Task 4's verification must assert the S3 case **runs**, not skips. - ---- - -## Task 2: Register the `s3` pytest marker - -**Files:** -- Modify: `pyproject.toml` (the `[tool.pytest.ini_options]` `markers` list, currently at lines 446-450) - -- [ ] **Step 1: Add the `s3` marker** - -In `pyproject.toml`, change the `markers` list from: - -```toml -markers = [ - "asyncio: mark test as asyncio test", - "gpu: mark a test as requiring CuPy and GPU", - "slow_hypothesis: slow hypothesis tests", -] -``` - -to: - -```toml -markers = [ - "asyncio: mark test as asyncio test", - "gpu: mark a test as requiring CuPy and GPU", - "s3: mark a test as requiring a (mock) S3 backend via moto", - "slow_hypothesis: slow hypothesis tests", -] -``` - -- [ ] **Step 2: Verify pytest accepts the marker (no unknown-marker warning)** - -Run: `hatch run doctest:test --markers | grep s3` -Expected: shows `@pytest.mark.s3: mark a test as requiring a (mock) S3 backend via moto` - -- [ ] **Step 3: Commit** - -```bash -git add pyproject.toml -git commit -m "test: register s3 pytest marker" -``` - ---- - -## Task 3: Teach `test_docs.py` to parse `markers=` and bind the `s3` fixture - -This task adds (a) `markers=` parsing so a session carries the right pytest marker, and (b) the moto-backed `s3` fixture using the mechanism chosen in Task 1. 
- -**Files:** -- Modify: `tests/test_docs.py` - -- [ ] **Step 1: Write a failing test that a markered session carries its marker** - -Add to `tests/test_docs.py`: - -```python -def test_markers_attribute_is_parsed(tmp_path: Path) -> None: - """A block tagged markers="s3" must surface that marker on its parametrized case, - so pytest can gate/bind it (e.g. attach the moto fixture).""" - md = tmp_path / "ex.md" - md.write_text( - '```python exec="true" session="demo" markers="s3"\n' - "import zarr\n" - "```\n", - encoding="utf-8", - ) - params = _session_params(md.parent) - assert len(params) == 1 - marks = params[0].marks - assert any(m.name == "s3" for m in marks) -``` - -(This references a new helper `_session_params(root)` that returns a list of `pytest.param(...)`; we extract the grouping logic into it in Step 3.) - -- [ ] **Step 2: Run it to confirm it fails** - -Run: `hatch run doctest:test tests/test_docs.py::test_markers_attribute_is_parsed -v` -Expected: FAIL with `AttributeError: module ... has no attribute '_session_params'` (or `NameError`). - -- [ ] **Step 3: Refactor grouping into `_session_params` that emits markers** - -Replace `group_examples_by_session()` (currently `tests/test_docs.py:39-64`) and the parametrize decorator (`tests/test_docs.py:72-75`) with a version that returns `pytest.param` objects carrying marks. 
Add near the top of the file: - -```python -def _markers_for(settings: dict[str, str]) -> list[pytest.MarkDecorator]: - """Translate a block's markers="a b" attribute into pytest mark decorators.""" - raw = settings.get("markers", "") - return [getattr(pytest.mark, name) for name in raw.split() if name] - - -def _session_params(root: Path) -> list[pytest.param]: - """Group exec="true" examples by (file, session) and emit one pytest.param per - session, carrying the union of markers declared by that session's blocks.""" - sessions: defaultdict[tuple[str, str], list[CodeExample]] = defaultdict(list) - marks_by_session: defaultdict[tuple[str, str], set[str]] = defaultdict(set) - - for example in find_examples(str(root)): - settings = example.prefix_settings() - if settings.get("exec") != "true": - continue - session_name = settings.get("session", "_default") - key = (str(example.path), session_name) - sessions[key].append(example) - for mark in _markers_for(settings): - marks_by_session[key].add(mark.name) - - params = [] - for key in sorted(sessions.keys(), key=lambda x: (x[0], x[1])): - marks = tuple(getattr(pytest.mark, name) for name in sorted(marks_by_session[key])) - params.append(pytest.param(key, marks=marks, id=name_example(key[0], key[1]))) - return params -``` - -Keep `name_example()` as-is. Add `CodeExample` to the existing pytest-examples import if not already imported (it is: `from pytest_examples import CodeExample, EvalExample, find_examples`). 
- -- [ ] **Step 4: Update the parametrized test to use `_session_params` and request the fixtures** - -Replace the decorator + signature of `test_documentation_examples` (`tests/test_docs.py:72-79`) with: - -```python -@pytest.mark.parametrize("session_key", _session_params(DOCS_ROOT)) -def test_documentation_examples( - session_key: tuple[str, str], - eval_example: EvalExample, - request: pytest.FixtureRequest, -) -> None: -``` - -Inside the body, before running examples, activate the `s3` fixture when the case is s3-marked: - -```python - if request.node.get_closest_marker("s3") is not None: - request.getfixturevalue("docs_s3_backend") -``` - -(Leave the rest of the body — the `find_examples` loop and `eval_example.run(...)` — unchanged.) - -- [ ] **Step 5: Add the `docs_s3_backend` fixture** - -Add to `tests/test_docs.py` (using the mechanism Task 1 selected — shown here for the `AWS_ENDPOINT_URL` variant; swap to `fsspec.config` or the `storage_options` fallback per Task 1's result): - -```python -S3_PORT = 5556 -S3_ENDPOINT = f"http://127.0.0.1:{S3_PORT}/" -S3_BUCKET = "example-bucket" - - -@pytest.fixture -def docs_s3_backend() -> Generator[None, None, None]: - """Stand up a moto mock-S3 server and configure a process-wide default endpoint - so docs blocks can use a bare s3:// URL with no storage_options.""" - moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server") - s3fs = pytest.importorskip("s3fs") - botocore = pytest.importorskip("botocore") - requests = pytest.importorskip("requests") - - server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=S3_PORT) - server.start() - prev_endpoint = os.environ.get("AWS_ENDPOINT_URL") - os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo") - os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo") - os.environ["AWS_ENDPOINT_URL"] = S3_ENDPOINT - - session = botocore.session.Session() - client = session.create_client("s3", endpoint_url=S3_ENDPOINT, region_name="us-east-1") - 
client.create_bucket(Bucket=S3_BUCKET) - client.close() - s3fs.S3FileSystem.clear_instance_cache() - try: - yield - finally: - requests.post(f"{S3_ENDPOINT}/moto-api/reset") - if prev_endpoint is None: - os.environ.pop("AWS_ENDPOINT_URL", None) - else: - os.environ["AWS_ENDPOINT_URL"] = prev_endpoint - server.stop() -``` - -Add the required imports at the top of `tests/test_docs.py`: - -```python -import os -from collections.abc import Generator -``` - -- [ ] **Step 6: Run the marker-parsing test — it should now pass** - -Run: `hatch run doctest:test tests/test_docs.py::test_markers_attribute_is_parsed -v` -Expected: PASS - -- [ ] **Step 7: Run the full docs test to confirm no regression in existing sessions** - -Run: `hatch run doctest:test -v` -Expected: PASS for all existing `quickstart` etc. sessions (the S3 block isn't markered yet — that's Task 4). - -- [ ] **Step 8: Commit** - -```bash -git add tests/test_docs.py -git commit -m "test: parse markers= on docs blocks and add moto s3 fixture binding" -``` - ---- - -## Task 4: Fix and enable the S3 example (#4016) - -**Files:** -- Modify: `docs/quick-start.md:134-140` - -- [ ] **Step 1: Replace the bare, invalid S3 block** - -Replace lines 134-140 (the ```` ```python `` … ```` block containing `mode="w"`) with: - -````markdown -```python exec="true" session="s3demo" markers="s3" source="above" -import zarr -import numpy as np - -z = zarr.create_array( - "s3://example-bucket/foo", shape=(100, 100), chunks=(10, 10), dtype="f4" -) -z[:, :] = np.random.random((100, 100)) -``` -```` - -Notes: -- `mode="w"` removed (the #4016 bug; `create_array` has no `mode` parameter — see `src/zarr/api/synchronous.py:799`). -- Unused `import s3fs` removed. -- `import numpy as np` added — this is a fresh `s3demo` session, so `np` is not in scope from the `quickstart` session. -- New session `s3demo` keeps the moto fixture scoped to just this block (the `quickstart` session must NOT become s3-marked). 
-- The displayed URL stays `s3://example-bucket/foo`; the moto endpoint is supplied by the `docs_s3_backend` fixture (bucket name `example-bucket` matches `S3_BUCKET` in Task 3). -- **If Task 1 chose the `storage_options` fallback:** add `storage_options={"endpoint_url": "..."}` to the visible call instead, and adjust the prose to explain it. - -- [ ] **Step 2: Run the S3 docs example against moto** - -Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[quick-start.md:s3demo]" -v` -Expected: PASS (executes against moto; no real-cloud contact). - -- [ ] **Step 3: Commit** - -```bash -git add docs/quick-start.md -git commit -m "docs: fix invalid s3 create_array example and run it against moto (#4016)" -``` - ---- - -## Task 5: Enable the config-only blocks - -These are plain `zarr.config.set(...)` calls that run as-is. Each gets its own self-contained session so config mutations don't bleed into other examples (config is process-global; reset is out of scope — separate sessions keep ids distinct but note config is not auto-restored, which is acceptable for these read-only-style demos). - -**Files:** -- Modify: `docs/user-guide/performance.md:207`, `docs/user-guide/performance.md:237` -- Modify: `docs/user-guide/arrays.md:622` - -- [ ] **Step 1: Enable `performance.md:207` (concurrency config)** - -Change the fence from ```` ```python ```` to: - -````markdown -```python exec="true" session="perf-concurrency" -```` - -(Body unchanged — `import zarr` + `zarr.config.set({'async.concurrency': 128})` + the commented env-var line, which is inert.) 
- -- [ ] **Step 2: Enable `performance.md:237` (max_workers config)** - -Change the fence to: - -````markdown -```python exec="true" session="perf-workers" -```` - -- [ ] **Step 3: Enable `arrays.md:622` (rectilinear_chunks config)** - -Change the fence to: - -````markdown -```python exec="true" session="arrays-rectilinear" -```` - -- [ ] **Step 4: Run the three sessions** - -Run: -```bash -hatch run doctest:test \ - "tests/test_docs.py::test_documentation_examples[performance.md:perf-concurrency]" \ - "tests/test_docs.py::test_documentation_examples[performance.md:perf-workers]" \ - "tests/test_docs.py::test_documentation_examples[arrays.md:arrays-rectilinear]" -v -``` -Expected: PASS (3 passed). - -- [ ] **Step 5: Commit** - -```bash -git add docs/user-guide/performance.md docs/user-guide/arrays.md -git commit -m "docs: execute config-setting examples in performance.md and arrays.md" -``` - ---- - -## Task 6: Make the CLI `zarr.open` block runnable - -`docs/user-guide/cli.md:48` opens `'path/to/input.zarr'` which doesn't exist. Rewrite it to create then open a real local array so it executes and still illustrates `zarr_format=3`. - -**Files:** -- Modify: `docs/user-guide/cli.md:46-51` - -- [ ] **Step 1: Replace the block** - -Replace the bare block with: - -````markdown -```python exec="true" session="cli-open" source="above" -import zarr - -# create a small array to open (stands in for the migrated store) -zarr.create_array("data/cli-demo.zarr", shape=(4, 4), chunks=(2, 2), dtype="i4") - -zarr_with_v3_metadata = zarr.open("data/cli-demo.zarr", zarr_format=3) -``` -```` - -(Keep the surrounding prose; the example now demonstrates `open(..., zarr_format=3)` on a real store. The illustrative `'path/to/input.zarr'` filename was the only reason it couldn't run.) 
- -- [ ] **Step 2: Run it** - -Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[cli.md:cli-open]" -v` -Expected: PASS - -- [ ] **Step 3: Commit** - -```bash -git add docs/user-guide/cli.md -git commit -m "docs: make cli zarr.open example runnable against a local store" -``` - ---- - -## Task 7: Enable the GPU block (env-gated via `gpu` marker) - -**Files:** -- Modify: `docs/user-guide/gpu.md:19-28` - -- [ ] **Step 1: Tag the GPU block** - -Change the fence from ```` ```python ```` to: - -````markdown -```python exec="true" session="gpu-demo" markers="gpu" source="above" -```` - -(Body unchanged: `import cupy as cp`, `zarr.config.enable_gpu()`, `create_array("memory://gpu-demo", ...)`, etc.) - -> **PLAN CORRECTION (found during execution, commit 010d99a):** a registered marker -> does NOT auto-skip a test under plain `pytest` — markers only *filter* when you pass -> `-m`. Without a guard, the gpu block runs and FAILS with `ModuleNotFoundError: cupy` -> (cupy is darwin-excluded). The repo's real convention is `pytest.importorskip("cupy")` -> in the test body (cf. `tests/conftest.py:183`). So Task 7 also adds to -> `test_documentation_examples` (mirroring the `s3` binding): -> ```python -> if request.node.get_closest_marker("gpu") is not None: -> pytest.importorskip("cupy") -> ``` -> This converts the missing-cupy hard error into a proper SKIP in the default env, while -> `-m gpu` in the `gputest` env still collects+runs it on real hardware. - -- [ ] **Step 2: Confirm it is SKIPPED in the default doctest env (no GPU)** - -Run: `hatch -e doctest run pytest "tests/test_docs.py::test_documentation_examples[user-guide/gpu.md:gpu-demo]" -v` -Expected: SKIPPED (via `importorskip("cupy")`), **not** an error, **not** absent. 
- -- [ ] **Step 3: Confirm it is COLLECTED for the gpu selection** - -Run: `hatch run doctest:test -m gpu --co -q | grep gpu-demo` -Expected: the `gpu.md:gpu-demo` case is collected (it will actually execute only on real GPU hardware in the `gputest` env, which we can't run here). - -- [ ] **Step 4: Commit** - -```bash -git add docs/user-guide/gpu.md -git commit -m "docs: execute gpu example under the gpu marker" -``` - ---- - -## Task 8: Fix the `exec="on"` typo and opt out the genuinely-non-executable blocks - -**Files:** -- Modify: `docs/contributing.md:15` (pseudocode) and `docs/contributing.md:231` (`exec="on"` typo) -- Modify: `docs/user-guide/data_types.md:363` (REPL transcript) -- Modify: `docs/user-guide/examples/custom_dtype.md:5` (`--8<--` include) -- Modify: `docs/user-guide/v3_migration.md:42` (intentionally-wrong import) - -- [ ] **Step 1: Fix the `exec="on"` typo in `contributing.md:231`** - -Change the fence attribute `exec="on"` to `exec="true"`. Then run that block to confirm it actually executes cleanly: - -Run: `hatch run doctest:test -v -k contributing` -Expected: the formerly-`exec="on"` block now runs. **If it fails** (the code was broken too, having never run), fix the code in the block minimally so it passes, or — if it's not meant to run — convert it to `exec="false" reason="..."`. Record which in the commit. - -- [ ] **Step 2: Opt out `contributing.md:15` (pseudocode)** - -Change ```` ```python ```` to: - -````markdown -```python exec="false" reason="illustrative pseudocode with a '# etc.' 
placeholder, not runnable" -```` - -- [ ] **Step 3: Opt out `data_types.md:363` (REPL transcript)** - -Change ```` ```python ```` to: - -````markdown -```python exec="false" reason="REPL output transcript, not executable source" -```` - -- [ ] **Step 4: Opt out `custom_dtype.md:5` (`--8<--` include)** - -Change ```` ```python ```` to: - -````markdown -```python exec="false" reason="pymdownx snippet include directive, not python source" -```` - -- [ ] **Step 5: Opt out `v3_migration.md:42` (intentionally-wrong import)** - -Change ```` ```python ```` to: - -````markdown -```python exec="false" reason="intentionally shows the old/incorrect import for contrast" -```` - -- [ ] **Step 6: Commit** - -```bash -git add docs/contributing.md docs/user-guide/data_types.md docs/user-guide/examples/custom_dtype.md docs/user-guide/v3_migration.md -git commit -m "docs: fix exec=on typo and explicitly opt out non-runnable blocks" -``` - ---- - -## Task 9: Handle the dask block in performance.md - -`docs/user-guide/performance.md:263` uses `dask.array` and opens `'data/large_array.zarr'` (nonexistent). Two viable dispositions — pick based on whether `dask` is in the doctest env. - -**Files:** -- Modify: `docs/user-guide/performance.md:263-280` - -- [ ] **Step 1: Check whether dask is available in the doctest env** - -Run: `hatch run doctest:list-env | grep -i dask` -Expected: either shows a `dask` line (available) or nothing (not available). 
- -- [ ] **Step 2a: If dask IS available — make it runnable** - -Replace the `'data/large_array.zarr'` open with a created array, keeping the dask demonstration: - -````markdown -```python exec="true" session="perf-dask" source="above" -import zarr -import dask.array as da - -zarr.config.set({ - 'async.concurrency': 4, - 'threading.max_workers': 4, -}) - -# create a small array to read with Dask -zarr.create_array("data/perf-dask-demo.zarr", shape=(16, 16), chunks=(8, 8), dtype="f4") -z = zarr.open_array("data/perf-dask-demo.zarr", mode="r") - -arr = da.from_array(z, chunks=z.chunks) -result = arr.mean(axis=0).compute() -``` -```` - -Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[performance.md:perf-dask]" -v` -Expected: PASS - -- [ ] **Step 2b: If dask is NOT available — opt out with a reason** - -Change ```` ```python ```` to: - -````markdown -```python exec="false" reason="requires dask, which is not in the docs test environment" -```` - -- [ ] **Step 3: Commit** - -```bash -git add docs/user-guide/performance.md -git commit -m "docs: make dask performance example runnable (or opt out if dask absent)" -``` - ---- - -## Task 10: Add the guard test - -The guard asserts every python block in `docs/` is either `exec="true"` or `exec="false"` with a non-empty `reason`. Anything else (bare, `exec="on"`, missing reason) fails. - -**Files:** -- Modify: `tests/test_docs.py` - -- [ ] **Step 1: Write the guard test** - -Add to `tests/test_docs.py`: - -```python -def test_no_unvalidated_blocks() -> None: - """Every python code block in docs/ must declare its validation state: - either exec="true" (it is executed as a test) or exec="false" with a reason - (an explicit, documented opt-out). A bare or mistyped fence (e.g. 
exec="on") - fails here, so a block can never silently opt out of validation — the gap - that hid the invalid create_array(mode="w") example in #4016.""" - offenders: list[str] = [] - for example in find_examples(str(DOCS_ROOT)): - settings = example.prefix_settings() - exec_val = settings.get("exec") - loc = f"{Path(example.path).relative_to(DOCS_ROOT)}:{example.start_line}" - if exec_val == "true": - continue - if exec_val == "false" and settings.get("reason", "").strip(): - continue - offenders.append(f"{loc} (exec={exec_val!r}, reason={settings.get('reason')!r})") - - assert not offenders, ( - "Docs python blocks must be exec=\"true\" or exec=\"false\" with a reason:\n" - + "\n".join(offenders) - ) -``` - -(`find_examples` from pytest-examples only yields fenced code blocks for languages it recognizes as runnable, which includes python; confirm in Step 2 that the count matches the audit. If it also yields non-python fences, filter on `example.prefix` / language; a path check such as `if not str(example.path).endswith(".md"): continue` is unnecessary since DOCS_ROOT is all markdown.) - -- [ ] **Step 2: Run the guard — it must PASS now that all blocks are remediated** - -Run: `hatch run doctest:test tests/test_docs.py::test_no_unvalidated_blocks -v` -Expected: PASS (zero offenders). **If it lists offenders**, they are blocks missed by Tasks 4-9 — fix each (turn on or opt out) until the list is empty. - -- [ ] **Step 3: Negative check — confirm the guard actually catches a bare block** - -Temporarily add a bare block to any docs file: - -````markdown -```python -1 / 0 -``` -```` - -Run: `hatch run doctest:test tests/test_docs.py::test_no_unvalidated_blocks -v` -Expected: FAIL, listing the new bare block's location.
- -Then remove the temporary block and re-run: -Run: `hatch run doctest:test tests/test_docs.py::test_no_unvalidated_blocks -v` -Expected: PASS - -- [ ] **Step 4: Commit** - -```bash -git add tests/test_docs.py -git commit -m "test: guard that every docs python block is executed or opted out (#4017)" -``` - ---- - -## Task 11: Full suite + news fragment - -**Files:** -- Create: `changes/4016.bugfix.md` - -- [ ] **Step 1: Run the entire docs test suite** - -Run: `hatch run doctest:test -v` -Expected: PASS — all `exec="true"` sessions run (S3 against moto; config/cli/dask as applicable), the GPU session reports SKIPPED, and the guard passes. - -- [ ] **Step 2: Add the towncrier news fragment** - -Create `changes/4016.bugfix.md`: - -```markdown -Fixed an invalid ``zarr.create_array`` example in the quick-start docs (it passed an unsupported ``mode`` argument) and made the cloud-storage example execute against a mock S3 backend in CI. Added a test ensuring every python code block in the docs is either executed or explicitly opted out with a documented reason. -``` - -- [ ] **Step 3: Run the full prek/lint pass** - -Run: `prek run --all-files` -Expected: PASS (ruff, mypy, towncrier-check, etc. all green). - -- [ ] **Step 4: Commit** - -```bash -git add changes/4016.bugfix.md -git commit -m "docs: add news fragment for docs-block validation (#4016, #4017)" -``` - ---- - -## Self-review notes (resolved during planning) - -- **Spec coverage:** Part A (remediate 12 blocks) → Tasks 4-9; Part B (guard) → Task 10. Marker-bound execution (s3 + gpu) → Tasks 2, 3, 4, 7. Spike #1 → Task 1. `pyproject.toml` s3 marker → Task 2. All three spec spikes are addressed: #1 in Task 1; #2 (markdown-exec tolerance of `markers=`) is implicitly verified by `hatch run docs:build` — **add a build check**: see Task 11 Step 1 note below; #3 (moto teardown) handled by the fixture's `finally` block in Task 3 Step 5. 
-- **Spike #2 verification:** `markers=` and `reason=`/`exec="false"` are unknown attributes to markdown-exec; it ignores unrecognized prefix settings and only acts on `exec="true"`. Confirm by running `hatch run docs:build` once after Task 11 and checking it succeeds and that the gpu/s3 blocks render as static source. If the build errors on unknown attributes, fall back to the per-session marker map (spec fallback for spike #2). -- **The 12 blocks, accounted for:** quick-start S3 (T4), perf×2 config (T5), arrays config (T5), cli (T6), gpu (T7), contributing exec=on typo + pseudocode (T8), data_types transcript (T8), custom_dtype include (T8), v3_migration wrong-import (T8), perf dask (T9). = 12. ✓ -- **Naming consistency:** `_session_params`, `_markers_for`, `docs_s3_backend`, `test_no_unvalidated_blocks`, `S3_BUCKET="example-bucket"` (matches the URL in the T4 block) used consistently across tasks. diff --git a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md deleted file mode 100644 index cbbe689994..0000000000 --- a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md +++ /dev/null @@ -1,260 +0,0 @@ -# Design: Close the "silently-unexecuted docs block" gap - -**Date:** 2026-05-29 -**Issue:** [zarr-developers/zarr-python#4016](https://github.com/zarr-developers/zarr-python/issues/4016) - -## Problem & root cause - -Issue #4016 reports invalid code in the docs: - -```python -z = zarr.create_array("s3://example-bucket/foo", mode="w", shape=(100, 100), chunks=(10, 10), dtype="f4") -``` - -`create_array` has no `mode` parameter, so this raises `TypeError: unexpected keyword -argument 'mode'`. The code was wrong because **nothing validated it**: it is a bare -` ```python ` block, and both the renderer (`markdown-exec`) and the test suite -(`tests/test_docs.py`, which filters on `settings.get("exec") != "true"`) only act on -blocks tagged `exec="true"`. 
Omitting that attribute is a *silent* opt-out from all -validation. - -This is not a one-off. An audit of all docs found **12 of 180** python blocks -unexecuted, including a second instance of the same failure mode: -`docs/contributing.md:231` is tagged `exec="on"` (a typo for `"true"`), so a block -meant to run silently does not. - -**Root cause:** validation is opt-in via an easily-mistyped, easily-omitted attribute, -with no signal when a block opts out. - -### Audit of the 12 bare blocks - -| Block | Why bare | Disposition | -|---|---|---| -| `docs/quick-start.md:134` (S3) | hits real S3 | **Execute, `markers="s3"`** (moto infra, default doctest env) | -| `docs/user-guide/gpu.md:19` | needs cupy + GPU | **Execute, `markers="gpu"`** (runs in `gputest` env) | -| `docs/user-guide/performance.md:207` | left bare | **`exec="true"`** (plain `zarr.config.set`) | -| `docs/user-guide/performance.md:237` | left bare | **`exec="true"`** | -| `docs/user-guide/performance.md:263` | left bare | **`exec="true"`** (uses dask + a local array path) | -| `docs/user-guide/arrays.md:622` | left bare | **`exec="true"`** (`zarr.config.set`) | -| `docs/user-guide/cli.md:48` | left bare | **`exec="true"`** (`zarr.open`; needs a runnable path/store) | -| `docs/contributing.md:231` | **`exec="on"` typo** | **Fix typo** → `exec="true"` | -| `docs/contributing.md:15` | pseudocode (`# etc.`) | **Explicit opt-out** + reason | -| `docs/user-guide/data_types.md:363` | REPL transcript (``) | **Explicit opt-out** + reason | -| `docs/user-guide/examples/custom_dtype.md:5` | `--8<--` file include | **Explicit opt-out** + reason | -| `docs/user-guide/v3_migration.md:42` | intentionally-wrong import | **Explicit opt-out** + reason | - -(`performance.md:263` and `cli.md:48` need a small adjustment — a memory store or a -real local path — to be runnable; confirm during implementation.) - -## Approach - -Two complementary parts. 
- -### Part A — Per-case remediation of the 12 bare blocks - -Not one mechanism — a triage. Each block gets the treatment that fits *why* it is not -executing: - -- **Make executable against fakes** — the S3 example, via `markers="s3"`. The marker - binds the block to the repo's existing `moto` mock-S3 infra (pattern from - `tests/test_store/test_fsspec.py`) so it runs for real in CI with no real-cloud - contact. Execution validates the whole write path, not just the signature; `mode="w"` - dies by construction. See "Marker-bound execution". -- **Just turn on** — the config/open blocks (`performance.md` ×3, `arrays.md:622`, - `cli.md:48`) are plain runnable API calls; flip them to `exec="true"`. -- **Fix the typo** — `contributing.md:231` `exec="on"` → `exec="true"`. -- **Execute, env-gated** — the GPU block, via `markers="gpu"`. It *can* run, but only in - the `gputest` env (cupy + GPU hardware), not the default `doctest` env. See - "Marker-bound execution". -- **Explicit opt-out** — blocks that genuinely cannot run anywhere and are not - executable Python: REPL transcript, `--8<--` include, intentionally-wrong import, - pseudocode. These get a *documented, greppable* opt-out marker carrying a reason. - -### Part B — A guard test - -So the gap cannot silently reopen: every python block in `docs/` must either be -`exec="true"` *or* carry the explicit opt-out marker with a reason. A bare or -mistyped block fails the guard. This would have caught both `mode="w"` and the -`exec="on"` typo. - -### Dropped from scope - -The type-checking / markdown-extractor machinery considered earlier. Execution-against- -fakes strictly dominates type-checking for the cloud case (and the untyped `s3fs`/`cupy` -imports make strict type-checking least clean exactly where it was wanted most), and the -guard handles everything else. Proportionate to ~7 genuinely-affected blocks. 
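The Part B rule above can be illustrated with a small standalone sketch. It is not the repository's guard (which walks blocks via pytest-examples); `parse_fence` and `classify` are hypothetical helper names, and the assumption is that a fence info string looks like `python exec="true" session="x"`.

```python
import shlex


def parse_fence(info: str) -> tuple[str, dict[str, str]]:
    """Split a fence info string into (language, settings).

    Hypothetical helper: assumes attributes are written as key="value".
    """
    tokens = shlex.split(info)  # strips the quotes around attribute values
    language = tokens[0] if tokens else ""
    settings: dict[str, str] = {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if sep:
            settings[key] = value
    return language, settings


def classify(settings: dict[str, str]) -> str:
    """Return the declared validation state of a block under the Part B rule."""
    exec_val = settings.get("exec")
    if exec_val == "true":
        return "executed"
    if exec_val == "false" and settings.get("reason", "").strip():
        return "opted-out"
    # Bare fences, typos such as exec="on", and reason-less opt-outs land here.
    return "illegal"
```

Under this sketch both failures named in the problem statement are rejected: a bare `python` fence and an `exec="on"` fence are each classified as illegal, so neither can skip validation unnoticed.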
- -## Key insight: doctests are already pytest tests - -`tests/test_docs.py::test_documentation_examples` is an ordinary `@pytest.mark.parametrize`d -pytest test — one case per `(file, session)`. It is not a separate doctest mechanism. -Therefore everything pytest already provides for gating tests (markers, `-m` selection, -skips) is available; the design uses it rather than inventing harness concepts. - -There are two distinct executors of docs blocks, and conflating them is what made -marker-bound execution look hard: - -- **`markdown-exec` at docs-build time** — runs blocks to render output into the - published site. Build runners have no cupy (and the S3 setup is test infra), so a - marker-bound block must render as static source here (no build-time execution). -- **`tests/test_docs.py` at test time** — the validation. This is pytest, and this is - where markers live, where infra fixtures bind, and where env-gating happens. - -## Two flags: `exec` (render output) vs `test` (validate) - -A code block can be *run* for two unrelated reasons, and conflating them breaks the -build. They are separate fence attributes: - -- **`exec="true"`** — markdown-exec executes the block **at docs-build time to render its - output** into the published page. This is markdown-exec's own attribute (it hard-codes - the name `exec`, see `markdown_exec/_internal/main.py`), so we cannot rename it. Read it - as *"execute to render output."* -- **`test="true"`** — **our** `tests/test_docs.py` harness executes the block **as a - validation test**. markdown-exec does not recognize `test=` and ignores it. - -Why two: a block that needs special infra to run (GPU/cupy, or S3) must be **validated in -tests** but must **not run at build** — build runners have no GPU and no moto server, so -an `exec="true"` GPU block makes `mkdocs build --strict` abort (`ModuleNotFoundError: -cupy`). 
Separating the flags lets such a block be `test="true"` (tested) without -`exec="true"` (so it renders as static source at build, never executed there). - -**Harness rule:** a block is collected as a test if `exec="true"` **OR** `test="true"`. -So existing `exec="true"` example blocks stay tested as before (backward-compatible), and -test-only blocks add `test="true"` without `exec`. - -**The combinations:** - -| Block | `exec` | `test` | Effect | -|---|---|---|---| -| Tutorial examples (quickstart, config, …) | `true` | — | Run at build (render output); also tested. | -| GPU / S3 examples | — | `true` | Tested (under markers); rendered static at build. | -| Non-runnable (transcript, include, wrong-import) | `false`+`reason` | — | Neither; explicit reasoned opt-out. | - -**Placement constraint (markdown-exec quirk).** markdown-exec registers a SuperFences -custom fence for `python`; its validator *rejects* any fence lacking `exec="true"` -(`exec="false"` and `test="true"` alike — both are "not executed at build"). Established by -experiment: a rejected python fence positioned **before** an `exec="true"` block disrupts -markdown-exec's build-time execution of a **later, state-dependent** block (regardless of -session). Observed concretely: any non-exec python fence inserted before the quickstart -`ZipStore` write/read pair made the read block fail (`FileNotFoundError` — the write never -took effect) and `mkdocs build --strict` aborted. The effect only surfaces with a -cross-block dependency, so it does **not** affect the standalone `exec="true"` blocks in -`data_types.md`/`performance.md` that already carry `exec="false"` opt-out blocks above -them. 
- -Because we cannot statically tell which later blocks are state-dependent, the response is -twofold: (1) **a `test="true"`-only block must come last on its page** (or be the only -python block, as on `gpu.md`) — a conservative convention enforced by -`test_test_only_blocks_come_last` for the blocks we author this way; and (2) the -**authoritative** build-hazard check is `mkdocs build --strict` (the `docs:check` CI job), -which catches the `exec="false"` case too. The S3 example is placed at the end of -`quick-start.md` accordingly. - -## Marker-bound execution - -A block declares the pytest marker it needs via a **fence attribute**. Marker-bound -blocks are `test="true"` (validated) but **not** `exec="true"` (not build-run), e.g.: - -```` -```python test="true" markers="gpu" source="above" -```python test="true" markers="s3" source="above" -```` - -`group_examples_by_session()` parses `markers=` and emits -`pytest.param(session_key, marks=pytest.mark.)`. The marker then **binds the case to -whatever that marker means** — and the two markers mean different things, which is the -point of unifying the model rather than special-casing each: - -- **`gpu` — env-gate.** A registered marker does **not** auto-skip under plain `pytest` - (markers only *filter* when you pass `-m`). The repo's convention is - `pytest.importorskip("cupy")` in the test body (cf. `tests/conftest.py`), so the harness - calls `importorskip("cupy")` for gpu-marked docs cases: in the default `doctest` env the - case is **skipped** (no cupy), and `pytest -m gpu` in the `gputest` env runs it on real - cupy. The block is `test="true"` (not `exec="true"`), so it is never run at build. - -- **`s3` — infra-binding.** A new `s3` marker (must be registered in the `markers` - table). An autouse-style fixture keyed on the marker stands up the `moto` server and - registers a default endpoint, so an `s3`-marked docs case runs against the fake S3 - with no real-cloud contact. 
Because the infra is just pip deps already present in the - `doctest` env (`s3fs`, `moto[s3,server]`), the case **runs in the default doctest - run** — the marker binds infra, it does not gate the case out. The moto/endpoint - plumbing lives in named pytest fixtures, not a hidden markdown setup block. - -Both blocks therefore follow one rule: *declare the marker; the harness binds the marker -to the infra/env it needs.* The asymmetry is in what each marker resolves to (gpu → -hardware env, s3 → fixture), not in the declaration mechanism. - -## Components & data flow - -**`docs/` markdown** — source of truth. Each python block is in one of three declared -states (see the two-flags table above): - -1. `exec="true"` and/or `test="true"` (optionally `+ markers=""`) — validated, by - build-render and/or by the test harness. -2. `exec="false"` with a `reason="..."` — explicit, documented opt-out. -3. anything else (bare, `exec="on"`, …) — **illegal**, fails the guard. - -The opt-out form is `exec="false" reason="..."`: explicit, greppable, carries a -human-readable reason, and is not executed by markdown-exec at build time. - -**`tests/test_docs.py`** — already-parametrized pytest harness. Changes: - -- `group_examples_by_session()` parses the `markers=` attribute and emits - `pytest.param(..., marks=pytest.mark.)` so marker-binding rides existing marker - machinery. -- A marker-keyed fixture for `s3` that stands up the `moto` server and registers a - default endpoint (pattern lifted from `tests/test_store/test_fsspec.py`), applied to - `s3`-marked docs cases. -- New guard test `test_no_unvalidated_blocks` — walks every python block in `docs/`, - asserts each is `exec="true"` or carries the explicit opt-out marker. Fails on - bare/typo'd blocks. - -**`pyproject.toml`** — register the new `s3` marker in the `markers` table (alongside -`gpu`). - -**`docs/quick-start.md` S3 block** — gains `markers="s3"`. 
The visible code stays a clean -`create_array("s3://...")`; the moto server and default-endpoint registration are -supplied by the `s3` fixture, not by an in-markdown setup block. - -## Risks & spikes (resolve during implementation; do not guess) - -1. **Default S3 endpoint without `storage_options`.** Existing tests always pass - `endpoint_url` explicitly (`test_fsspec.py:131`). Confirm the `s3` fixture can register - a *process-wide* default endpoint (via `fsspec.config` or `AWS_ENDPOINT_URL`) so the - visible `create_array("s3://...")` works clean with no `storage_options`. **Fallback:** - show the honest `storage_options={"endpoint_url": ...}` form in the visible block. - -2. **`markdown-exec` + unknown `markers=` attribute.** Confirm the build-time renderer - ignores `markers=` (or is told to), and that marker-bound blocks render as static - source in the published site (render source only, no build-time execution — the build - has neither cupy nor the moto fixture). **Fallback:** a per-session marker map in - `test_docs.py`, keeping markdown untouched. - -3. **moto teardown / loop affinity in the docs session.** `s3fs`/`aiobotocore` finalizers - are noisy at teardown and s3fs instances bind to the event loop they were created on - (see the filterwarnings note in `pyproject.toml` and the loop comments in - `test_fsspec.py`). Ensure the docs `s3` fixture starts/stops moto cleanly and does not - leak across sessions/tests. - -## Testing the change - -- Guard test is self-validating: after remediation, the full docs suite passes with zero - bare/typo'd blocks. -- Negative check: temporarily introduce a bare block, confirm the guard fails, remove it. -- S3 block: `hatch run doctest:test` runs it green against moto in the default doctest - env (the `s3` marker binds the fixture; it is not gated out). -- GPU block: `pytest -m gpu` in `gputest` executes it; the default `doctest` run reports - it **skipped**, not absent. 
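As a rough illustration of the collection rule and marker binding described above, the sketch below groups fence info strings by `(file, session)` and unions the markers each session declares. It uses only the standard library. `fence_settings`, `sessions_with_markers`, the `_default` session name, and the comma-or-space marker separator are assumptions made for the example, not the harness's actual behaviour.

```python
import shlex
from collections import defaultdict


def fence_settings(info: str) -> dict[str, str]:
    """Parse key="value" attributes from a fence info string (hypothetical helper)."""
    settings: dict[str, str] = {}
    for token in shlex.split(info)[1:]:  # skip the language token
        key, sep, value = token.partition("=")
        if sep:
            settings[key] = value
    return settings


def is_tested(settings: dict[str, str]) -> bool:
    """Harness rule from the design: collected if exec="true" OR test="true"."""
    return settings.get("exec") == "true" or settings.get("test") == "true"


def sessions_with_markers(
    blocks: list[tuple[str, str]],
) -> dict[tuple[str, str], set[str]]:
    """Map (file, session) to the union of marker names declared by its blocks."""
    marks: defaultdict[tuple[str, str], set[str]] = defaultdict(set)
    for file, info in blocks:
        settings = fence_settings(info)
        if not is_tested(settings):
            continue  # opted-out blocks produce no test case
        key = (file, settings.get("session", "_default"))  # assumed default name
        # Assumed syntax: marker names separated by commas or spaces.
        marks[key].update(settings.get("markers", "").replace(",", " ").split())
    return dict(marks)
```

In a real harness each resulting entry would presumably become one `pytest.param` carrying the corresponding `pytest.mark` objects, so that `-m gpu` selection and the `s3` fixture binding apply per session.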
- -## Out of scope - -- Type-checking machinery / markdown extractor. -- The 168 already-executing blocks. -- Broad docs rewrites beyond the 12 bare blocks. - -## Upstream - -[zarr-developers/zarr-python#4017](https://github.com/zarr-developers/zarr-python/issues/4017) -captures the root-cause framing (silent opt-out hides bugs; `mode="w"` and `exec="on"` as -two instances) and the Part B guard proposal for community discussion, independent of the -immediate fix in #4016. diff --git a/tests/test_docs.py b/tests/test_docs.py index 9561d0ad05..0638694b15 100644 --- a/tests/test_docs.py +++ b/tests/test_docs.py @@ -3,8 +3,9 @@ This module uses pytest-examples to validate Python code examples in the docs. A block is validated if it renders output at build (exec="true") or is explicitly marked for testing -(test="true"); see the two-flags discussion in -docs/superpowers/specs/2026-05-29-docs-block-validation-design.md. The test_no_unvalidated_blocks +(test="true"). The two flags are separate on purpose: exec= drives markdown-exec's +build-time rendering, while test= lets a block be validated without being run at build +(e.g. gpu/s3 examples the build environment cannot run). The test_no_unvalidated_blocks guard ensures every python block declares one of those, or an explicit exec="false" opt-out with a reason, so a block can never silently skip validation. """ From 84ad5a213640db0457479d5a260686a07a9c00aa Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 15:31:46 +0200 Subject: [PATCH 23/25] test: drop superpowers-docs exclusion now that those files are gone MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The _is_published_docs helper existed only to skip docs/superpowers/ design-doc caches; those files are no longer in the repo, so the helper and its three call sites were dead code referencing a nonexistent directory. Remove them — every find_examples(DOCS_ROOT) result is now published docs. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- tests/test_docs.py | 19 ------------------- 1 file changed, 19 deletions(-) diff --git a/tests/test_docs.py b/tests/test_docs.py index 0638694b15..803d925b4a 100644 --- a/tests/test_docs.py +++ b/tests/test_docs.py @@ -55,19 +55,6 @@ def _is_tested(settings: dict[str, str]) -> bool: return settings.get("exec") == "true" or settings.get("test") == "true" -def _is_published_docs(path: str) -> bool: - """Whether a code example belongs to the published documentation. docs/superpowers/ - holds design-doc caches (plans/specs) that are not in the mkdocs nav; both the test - collector and the guard exclude it so they agree on what counts as real docs.""" - try: - rel = Path(path).relative_to(DOCS_ROOT) - except ValueError: - # Path is outside DOCS_ROOT (e.g. a tmp_path fixture in unit tests); treat it as - # in-scope so such tests exercise the normal path. - return True - return not (rel.parts and rel.parts[0] == "superpowers") - - def _session_params(root: Path) -> list[Any]: """Group tested examples (exec="true" or test="true") by (file, session) and emit one pytest.param per session, carrying the union of markers declared by that session's @@ -76,8 +63,6 @@ def _session_params(root: Path) -> list[Any]: marks_by_session: defaultdict[tuple[str, str], set[str]] = defaultdict(set) for example in find_examples(str(root)): - if not _is_published_docs(str(example.path)): - continue settings = example.prefix_settings() if not _is_tested(settings): continue @@ -164,8 +149,6 @@ def test_no_unvalidated_blocks() -> None: A separate placement constraint is enforced by test_test_only_blocks_come_last.""" offenders: list[str] = [] for example in find_examples(str(DOCS_ROOT)): - if not _is_published_docs(str(example.path)): - continue rel = Path(example.path).relative_to(DOCS_ROOT) settings = example.prefix_settings() exec_val = settings.get("exec") @@ -212,8 +195,6 @@ def test_test_only_blocks_come_last() -> None: test_only: 
defaultdict[str, list[int]] = defaultdict(list) exec_lines: defaultdict[str, list[int]] = defaultdict(list) for example in find_examples(str(DOCS_ROOT)): - if not _is_published_docs(str(example.path)): - continue settings = example.prefix_settings() path = str(example.path) if settings.get("exec") == "true": From cfa792fc1a4b7a2698ad363d519e5792f44b3b5f Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 15:46:34 +0200 Subject: [PATCH 24/25] test: share one moto S3 backend across fsspec and docs tests Both test_store/test_fsspec.py and test_docs.py stood up their own moto ThreadedMotoServer (ports 5555 and 5556). Extract a single session-scoped `moto_server` fixture + MOTO_ENDPOINT_URL constant into tests/conftest.py and have both consumers reuse it: - test_fsspec.py: s3_base now returns the shared moto_server; its per-test `s3` fixture (bucket "test", explicit endpoint, event-loop cleanup) is unchanged. - test_docs.py: docs_s3_backend depends on moto_server and adds only its docs-specific layer (process-wide AWS_ENDPOINT_URL + "example-bucket"); it no longer owns the server lifecycle. One server now serves the whole session; each consumer creates and moto-api-resets its own bucket. Verified: test_fsspec.py (96 passed), the docs suite (58 passed), both together in one session (154 passed), and the full standard suite in the optional env (5956 passed); prek clean. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- tests/conftest.py | 28 ++++++++++++++++++ tests/test_docs.py | 51 ++++++++++++++------------------- tests/test_store/test_fsspec.py | 30 +++++++------------ 3 files changed, 60 insertions(+), 49 deletions(-) diff --git a/tests/conftest.py b/tests/conftest.py index 3515acace0..3402eb7063 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -531,3 +531,31 @@ def deep_nan_equal(a: object, b: object) -> bool: if isinstance(a, Sequence) and isinstance(b, Sequence): return all(deep_nan_equal(a[i], b[i]) for i in range(len(a))) return nan_equal(a, b) + + +# Shared mock-S3 (moto) backend. A single server is reused across the whole test session by +# every test that needs S3 -- both the fsspec store tests and the documentation examples -- +# instead of each module standing up its own. Consumers create their own buckets and choose +# how the endpoint reaches the client (explicit storage_options vs. the AWS_ENDPOINT_URL +# env var) on top of this fixture. +MOTO_SERVER_PORT = 5555 +MOTO_ENDPOINT_URL = f"http://127.0.0.1:{MOTO_SERVER_PORT}/" + + +@pytest.fixture(scope="session") +def moto_server() -> Generator[str, None, None]: + """Start a session-scoped moto S3 server and yield its endpoint URL. + + importorskip lives inside the fixture so moto is only required when a test actually + requests an S3 backend, not for the whole test session.""" + moto_server_mod = pytest.importorskip("moto.moto_server.threaded_moto_server") + + server = moto_server_mod.ThreadedMotoServer(ip_address="127.0.0.1", port=MOTO_SERVER_PORT) + server.start() + # moto needs *some* credentials present; use throwaway values if the environment has none. 
+ os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo") + os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo") + try: + yield MOTO_ENDPOINT_URL + finally: + server.stop() diff --git a/tests/test_docs.py b/tests/test_docs.py index 803d925b4a..02dca225b0 100644 --- a/tests/test_docs.py +++ b/tests/test_docs.py @@ -78,49 +78,40 @@ def _session_params(root: Path) -> list[Any]: return params -S3_PORT = 5556 -S3_ENDPOINT = f"http://127.0.0.1:{S3_PORT}/" S3_BUCKET = "example-bucket" @pytest.fixture -def docs_s3_backend() -> Generator[None, None, None]: - """Stand up a moto mock-S3 server and set a process-wide default endpoint so docs - blocks can use a bare s3:// URL with no storage_options (see spike in plan Task 1).""" - moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server") +def docs_s3_backend(moto_server: str) -> Generator[None, None, None]: + """Point docs S3 examples at the shared moto server (tests/conftest.py) via a + process-wide AWS_ENDPOINT_URL, so a block can use a bare s3:// URL with no + storage_options (see spike in the design notes). The server lifecycle belongs to the + session-scoped `moto_server` fixture; this fixture only adds the docs-specific + endpoint env var and a fresh bucket, and restores both on teardown.""" s3fs = pytest.importorskip("s3fs") botocore = pytest.importorskip("botocore") requests = pytest.importorskip("requests") - # Save every env var we mutate so teardown can restore the prior process state. 
- env_keys = ("AWS_ENDPOINT_URL", "AWS_SECRET_ACCESS_KEY", "AWS_ACCESS_KEY_ID") - prev_env = {key: os.environ.get(key) for key in env_keys} - server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=S3_PORT) - server.start() + prev_endpoint = os.environ.get("AWS_ENDPOINT_URL") + os.environ["AWS_ENDPOINT_URL"] = moto_server + + session = botocore.session.Session() + client = session.create_client("s3", endpoint_url=moto_server, region_name="us-east-1") + client.create_bucket(Bucket=S3_BUCKET) + client.close() + s3fs.S3FileSystem.clear_instance_cache() try: - os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo") - os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo") - os.environ["AWS_ENDPOINT_URL"] = S3_ENDPOINT - - session = botocore.session.Session() - client = session.create_client("s3", endpoint_url=S3_ENDPOINT, region_name="us-east-1") - client.create_bucket(Bucket=S3_BUCKET) - client.close() - s3fs.S3FileSystem.clear_instance_cache() yield finally: - # Cleanup must always run, even if the moto-api reset POST fails: stopping the - # server frees the fixed port and restoring the env avoids leaking state (and a - # stale AWS_ENDPOINT_URL) into the rest of the session. + # Reset moto state and restore AWS_ENDPOINT_URL; the shared server keeps running + # (the moto_server fixture stops it at session end). 
try: - requests.post(f"{S3_ENDPOINT}moto-api/reset") + requests.post(f"{moto_server}moto-api/reset") finally: - for key, value in prev_env.items(): - if value is None: - os.environ.pop(key, None) - else: - os.environ[key] = value - server.stop() + if prev_endpoint is None: + os.environ.pop("AWS_ENDPOINT_URL", None) + else: + os.environ["AWS_ENDPOINT_URL"] = prev_endpoint def test_markers_attribute_is_parsed(tmp_path: Path) -> None: diff --git a/tests/test_store/test_fsspec.py b/tests/test_store/test_fsspec.py index 142cb3b00d..afda534e49 100644 --- a/tests/test_store/test_fsspec.py +++ b/tests/test_store/test_fsspec.py @@ -1,7 +1,6 @@ from __future__ import annotations import json -import os import re from typing import TYPE_CHECKING, Any @@ -10,6 +9,7 @@ from packaging.version import parse as parse_version import zarr.api.asynchronous +from tests.conftest import MOTO_ENDPOINT_URL from zarr import Array from zarr.abc.store import OffsetByteRequest from zarr.core.buffer import Buffer, cpu, default_buffer_prototype @@ -50,31 +50,23 @@ fsspec = pytest.importorskip("fsspec") s3fs = pytest.importorskip("s3fs") requests = pytest.importorskip("requests") -moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server") -moto = pytest.importorskip("moto") +# Skip this module entirely when moto is absent; the server itself comes from the shared +# `moto_server` fixture in tests/conftest.py. +pytest.importorskip("moto") botocore = pytest.importorskip("botocore") # ### amended from s3fs ### # test_bucket_name = "test" secure_bucket_name = "test-secure" -port = 5555 -endpoint_url = f"http://127.0.0.1:{port}/" +# The moto server itself is the session-scoped `moto_server` fixture in tests/conftest.py; +# this module reuses its endpoint rather than standing up its own server. 
+endpoint_url = MOTO_ENDPOINT_URL -@pytest.fixture(scope="module") -def s3_base() -> Generator[None, None, None]: - # writable local S3 system - - # This fixture is module-scoped, meaning that we can reuse the MotoServer across all tests - server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=port) - server.start() - if "AWS_SECRET_ACCESS_KEY" not in os.environ: - os.environ["AWS_SECRET_ACCESS_KEY"] = "foo" - if "AWS_ACCESS_KEY_ID" not in os.environ: - os.environ["AWS_ACCESS_KEY_ID"] = "foo" - - yield - server.stop() +@pytest.fixture +def s3_base(moto_server: str) -> str: + """Reuse the shared session-scoped moto server (see tests/conftest.py).""" + return moto_server def get_boto3_client() -> botocore.client.BaseClient: From f6c587a701d4572fdf03b6588a0f81307c7c8af7 Mon Sep 17 00:00:00 2001 From: Davis Vann Bennett Date: Fri, 29 May 2026 15:58:25 +0200 Subject: [PATCH 25/25] docs: document the exec vs test code-block distinction for contributors The contributing guide explained exec="true" but said nothing about test="true", the exec="false"+reason opt-out, the guard that requires one of them, or the placement constraint -- so a contributor could write a bare block and hit the guard with no explanation. Add a "Validating code blocks: exec vs test" section covering: - exec="true" (build-render) vs test="true" (validate-only) and when to use each - the exec="false" reason="..." opt-out and the test_no_unvalidated_blocks guard - markers="gpu"/"s3" for infra-bound blocks - the placement rule (test-only blocks come last) + --strict CI Attribute examples are shown as inline code rather than nested ```python fences, so pytest-examples' find_examples never mistakes the teaching examples for real blocks (verified: only the two genuine blocks in contributing.md are collected). 
Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/contributing.md | 58 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) diff --git a/docs/contributing.md b/docs/contributing.md index a37768b815..e4906f6db5 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -253,6 +253,64 @@ renders as: print("Hello world") ``` +#### Validating code blocks: `exec` vs `test` + +Every Python code block in the documentation is checked by a test +(`tests/test_docs.py`) so that examples cannot quietly rot — the bug that motivated +this was an example calling `zarr.create_array(..., mode="w")`, an argument that does +not exist, which went unnoticed because nothing ran it. A block declares *how* it is +validated using one of two independent attributes: + + - **`exec="true"`** — Markdown Exec runs the block **at docs-build time to render its + output** into the page. This is the attribute described above; it is also what the + test suite executes. Use it for ordinary examples whose output should appear in the + docs. + - **`test="true"`** — the block is **run by the test suite only**, *not* at build time. + Use this for an example that should be validated but cannot run in the docs-build + environment — for example one that needs a GPU or a cloud backend. Markdown Exec + leaves a `test="true"` block as a static, syntax-highlighted snippet (it never + executes it), while the test suite still runs it (see the marker note below). + +A block may carry both (`exec="true" test="true"`), though in practice `exec="true"` +already implies it is tested, so you rarely need `test="true"` alongside it. + +The two attributes are kept separate on purpose: `exec=` controls *build-time rendering* +and `test=` controls *test-time validation*. Tagging a GPU/cloud example `exec="true"` +would make `mkdocs build` try to run it on a machine without that infrastructure and fail +the build; `test="true"` lets it be validated without being built. 
+ +##### Opting a block out of validation + +A handful of blocks genuinely cannot run and are not executable Python — a REPL +transcript, a deliberately-incorrect "before" snippet, a `--8<--` file include. Mark +these explicitly by opening the fence with +`exec="false" reason="REPL output transcript, not executable source"` (supply a reason +that fits the block). + +`exec="false"` with a non-empty `reason` is an explicit, greppable opt-out. A test +(`test_no_unvalidated_blocks`) requires **every** Python block to be either `exec="true"`, +`test="true"`, or `exec="false"` with a reason — so a block can never silently skip +validation. A bare ` ```python ` fence, or a typo like `exec="on"`, fails that test. + +##### Marker-bound blocks (GPU, S3) + +A `test="true"` block that needs special infrastructure declares a pytest marker with +`markers="..."`, which binds it to that infrastructure in the test suite: + + - `markers="gpu"` — run only under `pytest -m gpu` (the GPU CI environment); skipped + elsewhere via `importorskip("cupy")`. + - `markers="s3"` — run against a mock S3 (moto) backend supplied by a test fixture, so + the example can use a bare `s3://…` URL with no test-only connection details on show. + +##### Placement of `test="true"` blocks + +Because Markdown Exec does not execute a `test="true"` (or `exec="false"`) block, placing +one *before* an `exec="true"` block on the same page can disrupt the build-time execution +of that later block. Put `test="true"` blocks **after** all `exec="true"` blocks on the +page (or on a page where they are the only Python block). The `test_test_only_blocks_come_last` +test enforces this, and the CI docs build runs with `--strict` so any such breakage fails +the build rather than passing as a warning. + #### Building documentation without executing code blocks Sometimes, you may want the documentation to build quicker. 
You can disable code block execution by commenting out the [markdown-exec plugin](https://github.com/zarr-developers/zarr-python/blob/884a8c91afcc3efe28b3da952be3b85125c453cb/mkdocs.yml#L132) in the mkdocs configuration file. This will make code blocks and cross references render incorrectly (i.e., expect build warnings), but also reduces build time by ~3x. Be sure to undo the commenting out before opening your pull request.