From 2652d9e6da41f92f3dab1a6eb7eb6bed0f817cb3 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 10:23:50 +0200
Subject: [PATCH 01/25] docs: spec for closing the silently-unexecuted
 docs-block gap (#4016)

Design for fixing issue #4016 (invalid create_array(mode="w") in docs)
and preventing recurrence: per-case remediation of the 12 non-executed
python blocks (S3 via moto, config blocks via exec="true", GPU via the
gpu marker, explicit opt-out for non-Python blocks) plus a guard test
asserting every docs python block is either executed or explicitly
opted out with a reason.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 ...2026-05-29-docs-block-validation-design.md | 182 ++++++++++++++++++
 1 file changed, 182 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
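
Reviewer note (below the cut line, not part of the commit): the Part B
guard can be pictured as a small classifier over opening fence lines.
This is a minimal sketch under stated assumptions, not the
implementation. The names `classify` and `unvalidated` are hypothetical,
the attribute parsing is a plain regex rather than whatever parser
`tests/test_docs.py` ends up using, and the opt-out spelling
(`exec="false"` plus `reason="..."`) is only the example the spec
mentions, since the spec leaves the final form open.

```python
import re

TICKS = "`" * 3  # built at runtime so this example contains no literal fence

# An opening python fence followed by optional key="value" attributes.
FENCE = re.compile(r"^\s*" + TICKS + r"python(?=\s|$)(?P<attrs>.*)$")
ATTR = re.compile(r'(\w+)="([^"]*)"')


def classify(fence_line: str) -> str:
    """Return 'executed', 'opted-out', or 'illegal' for one opening fence."""
    match = FENCE.match(fence_line)
    if match is None:
        raise ValueError(f"not a python fence: {fence_line!r}")
    attrs = dict(ATTR.findall(match.group("attrs")))
    if attrs.get("exec") == "true":
        return "executed"
    # Assumed opt-out spelling; the spec leaves the final form open.
    if attrs.get("exec") == "false" and attrs.get("reason", "").strip():
        return "opted-out"
    return "illegal"  # bare, exec="on", or an opt-out with no reason


def unvalidated(markdown: str) -> list[tuple[int, str]]:
    """Return (line number, fence) for every block the guard would reject.

    Closing fences carry no info string, so matching openers is enough.
    """
    return [
        (number, line.strip())
        for number, line in enumerate(markdown.splitlines(), start=1)
        if FENCE.match(line) and classify(line) == "illegal"
    ]
```

Under these assumptions a bare fence and an `exec="on"` fence are both
reported, which is the behaviour the spec asks of
`test_no_unvalidated_blocks`.
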
diff --git a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
new file mode 100644
index 0000000000..db627d7528
--- /dev/null
+++ b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
@@ -0,0 +1,182 @@
+# Design: Close the "silently-unexecuted docs block" gap
+
+**Date:** 2026-05-29
+**Issue:** [zarr-developers/zarr-python#4016](https://github.com/zarr-developers/zarr-python/issues/4016)
+
+## Problem & root cause
+
+Issue #4016 reports invalid code in the docs:
+
+```python
+z = zarr.create_array("s3://example-bucket/foo", mode="w", shape=(100, 100), chunks=(10, 10), dtype="f4")
+```
+
+`create_array` has no `mode` parameter, so this raises `TypeError: unexpected keyword
+argument 'mode'`. The code was wrong because **nothing validated it**: it is a bare
+` ```python ` block, and both the renderer (`markdown-exec`) and the test suite
+(`tests/test_docs.py`, which filters on `settings.get("exec") != "true"`) only act on
+blocks tagged `exec="true"`. Omitting that attribute is a *silent* opt-out from all
+validation.
+
+This is not a one-off. An audit of all docs found **12 of 180** python blocks
+unexecuted, including a second instance of the same failure mode:
+`docs/contributing.md:231` is tagged `exec="on"` (a typo for `"true"`), so a block
+meant to run is silently skipped.
+
+**Root cause:** validation is opt-in via an easily-mistyped, easily-omitted attribute,
+with no signal when a block opts out.
+
+### Audit of the 12 bare blocks
+
+| Block | Why bare | Disposition |
+|---|---|---|
+| `docs/quick-start.md:134` (S3) | hits real S3 | **Execute against moto** mock-S3 |
+| `docs/user-guide/gpu.md:19` | needs cupy + GPU | **Execute, gated on `gpu` marker** (runs in `gputest` env) |
+| `docs/user-guide/performance.md:207` | left bare | **`exec="true"`** (plain `zarr.config.set`) |
+| `docs/user-guide/performance.md:237` | left bare | **`exec="true"`** |
+| `docs/user-guide/performance.md:263` | left bare | **`exec="true"`** (uses dask + a local array path) |
+| `docs/user-guide/arrays.md:622` | left bare | **`exec="true"`** (`zarr.config.set`) |
+| `docs/user-guide/cli.md:48` | left bare | **`exec="true"`** (`zarr.open`; needs a runnable path/store) |
+| `docs/contributing.md:231` | **`exec="on"` typo** | **Fix typo** → `exec="true"` |
+| `docs/contributing.md:15` | pseudocode (`# etc.`) | **Explicit opt-out** + reason |
+| `docs/user-guide/data_types.md:363` | REPL transcript (`<class ...>`) | **Explicit opt-out** + reason |
+| `docs/user-guide/examples/custom_dtype.md:5` | `--8<--` file include | **Explicit opt-out** + reason |
+| `docs/user-guide/v3_migration.md:42` | intentionally-wrong import | **Explicit opt-out** + reason |
+
+(`performance.md:263` and `cli.md:48` need a small adjustment — a memory store or a
+real local path — to be runnable; confirm during implementation.)
+
+## Approach
+
+Two complementary parts.
+
+### Part A — Per-case remediation of the 12 bare blocks
+
+Not one mechanism — a triage. Each block gets the treatment that fits *why* it is not
+executing:
+
+- **Make executable against fakes** — the S3 example. Reuse the repo's existing `moto`
+  mock-S3 pattern from `tests/test_store/test_fsspec.py` so the block runs for real in
+  CI with no real-cloud contact. Execution validates the whole write path, not just the
+  signature; `mode="w"` dies by construction.
+- **Just turn on** — the config/open blocks (`performance.md` ×3, `arrays.md:622`,
+  `cli.md:48`) are plain runnable API calls; flip them to `exec="true"`.
+- **Fix the typo** — `contributing.md:231` `exec="on"` → `exec="true"`.
+- **Execute, env-gated** — the GPU block. It *can* run, but only in the `gputest` env
+  (cupy + GPU hardware), not the default `doctest` env. See "Env-gated execution".
+- **Explicit opt-out** — blocks that genuinely cannot run anywhere and are not
+  executable Python: REPL transcript, `--8<--` include, intentionally-wrong import,
+  pseudocode. These get a *documented, greppable* opt-out marker carrying a reason.
+
+### Part B — A guard test
+
+So the gap cannot silently reopen: every python block in `docs/` must either be
+`exec="true"` *or* carry the explicit opt-out marker with a reason. A bare or
+mistyped block fails the guard. This would have caught both `mode="w"` and the
+`exec="on"` typo.
+
+### Dropped from scope
+
+The type-checking / markdown-extractor machinery considered earlier. Execution-against-
+fakes strictly dominates type-checking for the cloud case (and the untyped `s3fs`/`cupy`
+imports make strict type-checking least clean exactly where it was wanted most), and the
+guard handles everything else. Proportionate to ~7 genuinely-affected blocks.
+
+## Key insight: doctests are already pytest tests
+
+`tests/test_docs.py::test_documentation_examples` is an ordinary `@pytest.mark.parametrize`d
+pytest test — one case per `(file, session)`. It is not a separate doctest mechanism.
+Therefore everything pytest already provides for gating tests (markers, `-m` selection,
+skips) is available; the design uses it rather than inventing harness concepts.
+
+There are two distinct executors of docs blocks, and conflating them is what made
+env-gating look hard:
+
+- **`markdown-exec` at docs-build time** — runs blocks to render output into the
+  published site. Build runners have no cupy, so a GPU block must render as static
+  source here (no build-time execution).
+- **`tests/test_docs.py` at test time** — the validation. This is pytest, and this is
+  where markers live and where env-gating happens.
+
+## Env-gated execution
+
+A block declares the pytest marker it needs via a **fence attribute**, e.g.:
+
+````
+```python exec="true" markers="gpu"
+````
+
+`group_examples_by_session()` parses `markers=` and emits
+`pytest.param(session_key, marks=pytest.mark.gpu)`. Then:
+
+- Default `doctest` env runs `pytest` → the gpu-marked param is **skipped/deselected**,
+  exactly like every other `gpu`-marked test in `tests/`.
+- The `gputest` env runs `pytest -m gpu` → the param **executes** against real cupy.
+
+This reuses the existing `gpu` marker (`pyproject.toml`, `markers` table) and the existing
+`pytest -m gpu` selection — no new harness concept.
+
+## Components & data flow
+
+**`docs/` markdown** — source of truth. Each python block is in one of three declared
+states:
+
+1. `exec="true"` (optionally `+ markers="<m>"`) — executed as a test.
+2. explicit opt-out marker **with a reason** — deliberately not executed.
+3. anything else (bare, `exec="on"`, …) — **illegal**, fails the guard.
+
+The exact spelling of the opt-out marker (e.g. `exec="false"` plus a `reason="..."`
+attribute, versus a dedicated sentinel attribute) is an implementation-plan decision.
+Requirement: it must be explicit, greppable, carry a human-readable reason, and be a
+form `markdown-exec` will not execute at build time.
+
+**`tests/test_docs.py`** — already-parametrized pytest harness. Changes:
+
+- `group_examples_by_session()` parses the `markers=` attribute and emits
+  `pytest.param(..., marks=pytest.mark.<m>)` so env-gating rides existing marker machinery.
+- New guard test `test_no_unvalidated_blocks` — walks every python block in `docs/`,
+  asserts each is `exec="true"` or carries the explicit opt-out marker. Fails on
+  bare/typo'd blocks.
+
+**`docs/quick-start.md` S3 session** — a hidden setup block (`exec="true"`, no `source=`,
+matching the existing setup block at `quick-start.md:8`) starts a `moto` server and
+registers a default endpoint so the *visible* `create_array("s3://...")` block runs
+against the fake. Pattern lifted from `tests/test_store/test_fsspec.py`.
+
+## Risks & spikes (resolve during implementation; do not guess)
+
+1. **Default S3 endpoint without `storage_options`.** Existing tests always pass
+   `endpoint_url` explicitly (`test_fsspec.py:131`). Confirm a setup block can register
+   a *process-wide* default endpoint (via `fsspec.config` or `AWS_ENDPOINT_URL`) so the
+   visible `create_array("s3://...")` works clean. **Fallback:** show the honest
+   `storage_options={"endpoint_url": ...}` form in the visible block.
+
+2. **`markdown-exec` + unknown `markers=` attribute.** Confirm the build-time renderer
+   ignores `markers=` (or is told to), and decide how a `gpu` block renders in the
+   published site without cupy (render source only, no build-time execution).
+   **Fallback:** a per-session marker map in `test_docs.py`, keeping markdown untouched.
+
+3. **moto teardown in the docs session.** `s3fs`/`aiobotocore` finalizers are known to be
+   noisy at teardown (see the filterwarnings note in `pyproject.toml`). Ensure the docs
+   session's moto server starts/stops cleanly without leaking into other sessions.
+
+## Testing the change
+
+- Guard test is self-validating: after remediation, the full docs suite passes with zero
+  bare/typo'd blocks.
+- Negative check: temporarily introduce a bare block, confirm the guard fails, remove it.
+- S3 block: `hatch run doctest:test` runs it green against moto.
+- GPU block: `pytest -m gpu` in `gputest` executes it; the default `doctest` run reports
+  it **skipped**, not absent.
+
+## Out of scope
+
+- Type-checking machinery / markdown extractor.
+- The 168 already-executing blocks.
+- Broad docs rewrites beyond the 12 bare blocks.
+
+## Upstream
+
+A separate GitHub issue will capture the root-cause framing (silent opt-out hides bugs;
+`mode="w"` and `exec="on"` as two instances) and the Part B guard proposal for community
+discussion, independent of the immediate fix.

From 33590eba2dc0a66cfe34f3616365733207a695cf Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 10:33:00 +0200
Subject: [PATCH 02/25] docs: unify docs-block marker model (s3 + gpu) in #4016
 spec

Replace the bespoke hidden-moto-setup-block approach for the S3 example
with a marker-bound model: a block declares markers="s3"/"gpu" on the
fence, and the harness binds each marker to the infra/env it needs
(s3 -> moto fixture in the default doctest env; gpu -> gputest env via
pytest -m gpu). Symmetric declaration; the asymmetry is only in what
each marker resolves to.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 ...2026-05-29-docs-block-validation-design.md | 98 ++++++++++++-------
 1 file changed, 62 insertions(+), 36 deletions(-)
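
Reviewer note (below the cut line, not part of the commit): the
"declare the marker, the harness binds it" rule can be sketched without
pytest at all. The table and function names here are hypothetical and
only illustrate the asymmetry this patch describes (gpu gates the case to
another env, s3 binds a fixture and still runs by default). Rejecting
unregistered markers is an assumption of the sketch, not something the
spec states.

```python
# Hypothetical binding table: what each fence marker resolves to in the harness.
BINDINGS = {
    "gpu": "env-gate",  # selected only by `pytest -m gpu` in the gputest env
    "s3": "fixture",    # moto-backed fixture, runs in the default doctest env
}


def parse_markers(settings: dict[str, str]) -> list[str]:
    """Split a fence's markers="a b" attribute into marker names."""
    return settings.get("markers", "").split()


def runs_in_default_env(settings: dict[str, str]) -> bool:
    """True unless one of the block's markers gates it to another env."""
    names = parse_markers(settings)
    unknown = [name for name in names if name not in BINDINGS]
    if unknown:
        # Assumption: unregistered markers are rejected rather than ignored.
        raise ValueError(f"unregistered markers: {unknown}")
    return all(BINDINGS[name] != "env-gate" for name in names)
```

With this table, a block tagged `markers="s3"` stays in the default
doctest run while one tagged `markers="gpu"` does not, even though both
are declared the same way on the fence.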

diff --git a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
index db627d7528..f46b351856 100644
--- a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
+++ b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
@@ -30,8 +30,8 @@ with no signal when a block opts out.
 
 | Block | Why bare | Disposition |
 |---|---|---|
-| `docs/quick-start.md:134` (S3) | hits real S3 | **Execute against moto** mock-S3 |
-| `docs/user-guide/gpu.md:19` | needs cupy + GPU | **Execute, gated on `gpu` marker** (runs in `gputest` env) |
+| `docs/quick-start.md:134` (S3) | hits real S3 | **Execute, `markers="s3"`** (moto infra, default doctest env) |
+| `docs/user-guide/gpu.md:19` | needs cupy + GPU | **Execute, `markers="gpu"`** (runs in `gputest` env) |
 | `docs/user-guide/performance.md:207` | left bare | **`exec="true"`** (plain `zarr.config.set`) |
 | `docs/user-guide/performance.md:237` | left bare | **`exec="true"`** |
 | `docs/user-guide/performance.md:263` | left bare | **`exec="true"`** (uses dask + a local array path) |
@@ -55,15 +55,17 @@ Two complementary parts.
 Not one mechanism — a triage. Each block gets the treatment that fits *why* it is not
 executing:
 
-- **Make executable against fakes** — the S3 example. Reuse the repo's existing `moto`
-  mock-S3 pattern from `tests/test_store/test_fsspec.py` so the block runs for real in
-  CI with no real-cloud contact. Execution validates the whole write path, not just the
-  signature; `mode="w"` dies by construction.
+- **Make executable against fakes** — the S3 example, via `markers="s3"`. The marker
+  binds the block to the repo's existing `moto` mock-S3 infra (pattern from
+  `tests/test_store/test_fsspec.py`) so it runs for real in CI with no real-cloud
+  contact. Execution validates the whole write path, not just the signature; `mode="w"`
+  dies by construction. See "Marker-bound execution".
 - **Just turn on** — the config/open blocks (`performance.md` ×3, `arrays.md:622`,
   `cli.md:48`) are plain runnable API calls; flip them to `exec="true"`.
 - **Fix the typo** — `contributing.md:231` `exec="on"` → `exec="true"`.
-- **Execute, env-gated** — the GPU block. It *can* run, but only in the `gputest` env
-  (cupy + GPU hardware), not the default `doctest` env. See "Env-gated execution".
+- **Execute, env-gated** — the GPU block, via `markers="gpu"`. It *can* run, but only in
+  the `gputest` env (cupy + GPU hardware), not the default `doctest` env. See
+  "Marker-bound execution".
 - **Explicit opt-out** — blocks that genuinely cannot run anywhere and are not
   executable Python: REPL transcript, `--8<--` include, intentionally-wrong import,
   pseudocode. These get a *documented, greppable* opt-out marker carrying a reason.
@@ -90,31 +92,45 @@ Therefore everything pytest already provides for gating tests (markers, `-m` sel
 skips) is available; the design uses it rather than inventing harness concepts.
 
 There are two distinct executors of docs blocks, and conflating them is what made
-env-gating look hard:
+marker-bound execution look hard:
 
 - **`markdown-exec` at docs-build time** — runs blocks to render output into the
-  published site. Build runners have no cupy, so a GPU block must render as static
-  source here (no build-time execution).
+  published site. Build runners have no cupy (and the S3 setup is test infra), so a
+  marker-bound block must render as static source here (no build-time execution).
 - **`tests/test_docs.py` at test time** — the validation. This is pytest, and this is
-  where markers live and where env-gating happens.
+  where markers live, where infra fixtures bind, and where env-gating happens.
 
-## Env-gated execution
+## Marker-bound execution
 
 A block declares the pytest marker it needs via a **fence attribute**, e.g.:
 
 ````
 ```python exec="true" markers="gpu"
+```python exec="true" markers="s3"
 ````
 
 `group_examples_by_session()` parses `markers=` and emits
-`pytest.param(session_key, marks=pytest.mark.gpu)`. Then:
-
-- Default `doctest` env runs `pytest` → the gpu-marked param is **skipped/deselected**,
-  exactly like every other `gpu`-marked test in `tests/`.
-- The `gputest` env runs `pytest -m gpu` → the param **executes** against real cupy.
-
-This reuses the existing `gpu` marker (`pyproject.toml`, `markers` table) and the existing
-`pytest -m gpu` selection — no new harness concept.
+`pytest.param(session_key, marks=pytest.mark.<m>)`. The marker then **binds the case to
+whatever that marker means** — and the two markers mean different things, which is the
+point of unifying the model rather than special-casing each:
+
+- **`gpu` — env-gate.** Default `doctest` env runs `pytest` → the gpu-marked param is
+  **skipped/deselected**, exactly like every other `gpu`-marked test in `tests/`. The
+  `gputest` env runs `pytest -m gpu` → the param **executes** against real cupy. Reuses
+  the existing `gpu` marker (`pyproject.toml` `markers` table) and `pytest -m gpu`
+  selection — no new harness concept.
+
+- **`s3` — infra-binding.** A new `s3` marker (must be registered in the `markers`
+  table). An autouse-style fixture keyed on the marker stands up the `moto` server and
+  registers a default endpoint, so an `s3`-marked docs case runs against the fake S3
+  with no real-cloud contact. Because the infra is just pip deps already present in the
+  `doctest` env (`s3fs`, `moto[s3,server]`), the case **runs in the default doctest
+  run** — the marker binds infra, it does not gate the case out. The moto/endpoint
+  plumbing lives in named pytest fixtures, not a hidden markdown setup block.
+
+Both blocks therefore follow one rule: *declare the marker; the harness binds the marker
+to the infra/env it needs.* The asymmetry is in what each marker resolves to (gpu →
+hardware env, s3 → fixture), not in the declaration mechanism.
 
 ## Components & data flow
 
@@ -133,39 +149,49 @@ form `markdown-exec` will not execute at build time.
 **`tests/test_docs.py`** — already-parametrized pytest harness. Changes:
 
 - `group_examples_by_session()` parses the `markers=` attribute and emits
-  `pytest.param(..., marks=pytest.mark.<m>)` so env-gating rides existing marker machinery.
+  `pytest.param(..., marks=pytest.mark.<m>)` so marker-binding rides existing marker
+  machinery.
+- A marker-keyed fixture for `s3` that stands up the `moto` server and registers a
+  default endpoint (pattern lifted from `tests/test_store/test_fsspec.py`), applied to
+  `s3`-marked docs cases.
 - New guard test `test_no_unvalidated_blocks` — walks every python block in `docs/`,
   asserts each is `exec="true"` or carries the explicit opt-out marker. Fails on
   bare/typo'd blocks.
 
-**`docs/quick-start.md` S3 session** — a hidden setup block (`exec="true"`, no `source=`,
-matching the existing setup block at `quick-start.md:8`) starts a `moto` server and
-registers a default endpoint so the *visible* `create_array("s3://...")` block runs
-against the fake. Pattern lifted from `tests/test_store/test_fsspec.py`.
+**`pyproject.toml`** — register the new `s3` marker in the `markers` table (alongside
+`gpu`).
+
+**`docs/quick-start.md` S3 block** — gains `markers="s3"`. The visible code stays a clean
+`create_array("s3://...")`; the moto server and default-endpoint registration are
+supplied by the `s3` fixture, not by an in-markdown setup block.
 
 ## Risks & spikes (resolve during implementation; do not guess)
 
 1. **Default S3 endpoint without `storage_options`.** Existing tests always pass
-   `endpoint_url` explicitly (`test_fsspec.py:131`). Confirm a setup block can register
+   `endpoint_url` explicitly (`test_fsspec.py:131`). Confirm the `s3` fixture can register
    a *process-wide* default endpoint (via `fsspec.config` or `AWS_ENDPOINT_URL`) so the
-   visible `create_array("s3://...")` works clean. **Fallback:** show the honest
-   `storage_options={"endpoint_url": ...}` form in the visible block.
+   visible `create_array("s3://...")` works clean with no `storage_options`. **Fallback:**
+   show the honest `storage_options={"endpoint_url": ...}` form in the visible block.
 
 2. **`markdown-exec` + unknown `markers=` attribute.** Confirm the build-time renderer
-   ignores `markers=` (or is told to), and decide how a `gpu` block renders in the
-   published site without cupy (render source only, no build-time execution).
-   **Fallback:** a per-session marker map in `test_docs.py`, keeping markdown untouched.
+   ignores `markers=` (or is told to), and that marker-bound blocks render as static
+   source in the published site (render source only, no build-time execution — the build
+   has neither cupy nor the moto fixture). **Fallback:** a per-session marker map in
+   `test_docs.py`, keeping markdown untouched.
 
-3. **moto teardown in the docs session.** `s3fs`/`aiobotocore` finalizers are known to be
-   noisy at teardown (see the filterwarnings note in `pyproject.toml`). Ensure the docs
-   session's moto server starts/stops cleanly without leaking into other sessions.
+3. **moto teardown / loop affinity in the docs session.** `s3fs`/`aiobotocore` finalizers
+   are noisy at teardown and s3fs instances bind to the event loop they were created on
+   (see the filterwarnings note in `pyproject.toml` and the loop comments in
+   `test_fsspec.py`). Ensure the docs `s3` fixture starts/stops moto cleanly and does not
+   leak across sessions/tests.
 
 ## Testing the change
 
 - Guard test is self-validating: after remediation, the full docs suite passes with zero
   bare/typo'd blocks.
 - Negative check: temporarily introduce a bare block, confirm the guard fails, remove it.
-- S3 block: `hatch run doctest:test` runs it green against moto.
+- S3 block: `hatch run doctest:test` runs it green against moto in the default doctest
+  env (the `s3` marker binds the fixture; it is not gated out).
 - GPU block: `pytest -m gpu` in `gputest` executes it; the default `doctest` run reports
   it **skipped**, not absent.
 

From 47d20c32b24b3ed327a6c486ac0d3b86b757fbf6 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 10:39:40 +0200
Subject: [PATCH 03/25] docs: link spec to upstream issue #4017

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../specs/2026-05-29-docs-block-validation-design.md       | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
index f46b351856..4e8232846f 100644
--- a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
+++ b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
@@ -203,6 +203,7 @@ supplied by the `s3` fixture, not by an in-markdown setup block.
 
 ## Upstream
 
-A separate GitHub issue will capture the root-cause framing (silent opt-out hides bugs;
-`mode="w"` and `exec="on"` as two instances) and the Part B guard proposal for community
-discussion, independent of the immediate fix.
+[zarr-developers/zarr-python#4017](https://github.com/zarr-developers/zarr-python/issues/4017)
+captures the root-cause framing (silent opt-out hides bugs; `mode="w"` and `exec="on"` as
+two instances) and the Part B guard proposal for community discussion, independent of the
+immediate fix in #4016.

From c660818b2ae58983311a2d5b8a906eacdbe7feb0 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 10:49:29 +0200
Subject: [PATCH 04/25] docs: implementation plan for docs-block validation
 (#4016, #4017)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../plans/2026-05-29-docs-block-validation.md | 722 ++++++++++++++++++
 1 file changed, 722 insertions(+)
 create mode 100644 docs/superpowers/plans/2026-05-29-docs-block-validation.md
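
Reviewer note (below the cut line, not part of the commit): the Task 1
spike sets `AWS_ENDPOINT_URL` and the credential variables process-wide
and never restores them, which is fine for a scratch test but would leak
into other sessions if carried into the real `s3` fixture. A minimal
sketch of scoping those variables is below. `scoped_env` is a
hypothetical helper, not something in the repo. Inside a pytest fixture
the built-in `monkeypatch.setenv` gives the same restore-on-teardown
behaviour.

```python
import os
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def scoped_env(**overrides: str) -> Iterator[None]:
    """Set environment variables for the duration of a block, then restore."""
    missing = object()  # sentinel for "variable was not set before"
    saved = {name: os.environ.get(name, missing) for name in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        for name, old in saved.items():
            if old is missing:
                os.environ.pop(name, None)
            else:
                os.environ[name] = old
```

Wrapping the moto-backed section of the fixture in
`with scoped_env(AWS_ENDPOINT_URL=..., AWS_ACCESS_KEY_ID=...)` would keep
the default endpoint visible to the docs block and invisible afterwards.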

diff --git a/docs/superpowers/plans/2026-05-29-docs-block-validation.md b/docs/superpowers/plans/2026-05-29-docs-block-validation.md
new file mode 100644
index 0000000000..1c34652012
--- /dev/null
+++ b/docs/superpowers/plans/2026-05-29-docs-block-validation.md
@@ -0,0 +1,722 @@
+# Docs Block Validation Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Make every python code block in `docs/` either execute (and thus get validated) or explicitly opt out with a documented reason, and add a guard test so a block can never again silently opt out of validation.
+
+**Architecture:** The doctests in `tests/test_docs.py` are already parametrized pytest tests. We (1) teach the parametrizer to read a `markers="..."` fence attribute and attach the matching pytest marker to each session's `pytest.param`, (2) add an `s3` marker bound to a `moto` mock-S3 fixture so the S3 example runs in the default doctest env, (3) reuse the existing `gpu` marker for the GPU block, (4) remediate the 12 currently-unexecuted blocks per-case, and (5) add a guard test asserting every docs python block is `exec="true"` or explicitly opted out with a reason.
+
+**Tech Stack:** pytest, pytest-examples, markdown-exec (mkdocs), moto[s3,server], s3fs, hatch envs (`doctest`, `gputest`).
+
+**Upstream:** Fixes [#4016](https://github.com/zarr-developers/zarr-python/issues/4016); implements the guard from [#4017](https://github.com/zarr-developers/zarr-python/issues/4017). Design spec: `docs/superpowers/specs/2026-05-29-docs-block-validation-design.md`.
+
+---
+
+## File Structure
+
+- `tests/test_docs.py` — **modify.** Add `markers=` parsing in `group_examples_by_session()`, an `s3` fixture + marker-binding, and the new `test_no_unvalidated_blocks` guard test.
+- `pyproject.toml` — **modify.** Register the `s3` marker in `[tool.pytest.ini_options] markers`.
+- `docs/quick-start.md` — **modify.** S3 block: fix `mode="w"`, add `markers="s3"`, make it executable.
+- `docs/user-guide/performance.md` — **modify.** Turn on the two config-only blocks; opt out (or fix) the dask block.
+- `docs/user-guide/arrays.md` — **modify.** Turn on the config block.
+- `docs/user-guide/cli.md` — **modify.** Make the `zarr.open` block runnable or opt it out.
+- `docs/user-guide/gpu.md` — **modify.** Add `exec="true" markers="gpu"`.
+- `docs/contributing.md` — **modify.** Fix `exec="on"` typo; opt out the pseudocode block.
+- `docs/user-guide/data_types.md` — **modify.** Opt out the REPL-transcript block.
+- `docs/user-guide/examples/custom_dtype.md` — **modify.** Opt out the `--8<--` include block.
+- `docs/user-guide/v3_migration.md` — **modify.** Opt out the intentionally-wrong-import block.
+- `changes/4016.bugfix.md` — **create.** Towncrier news fragment.
+
+### Opt-out convention (decided here, used throughout)
+
+A block that must not execute is tagged:
+
+````
+```python exec="false" reason="<human-readable reason>"
+````
+
+- `exec="false"` is an explicit, greppable opt-out that `markdown-exec` will **not** execute (only `exec="true"` triggers execution).
+- `reason="..."` documents *why*. The guard test requires it on any non-`exec="true"` block.
+
+---
+
+## Task 1: Spike — can the `s3` fixture provide a default endpoint with no `storage_options`?
+
+This is the load-bearing unknown. The existing S3 tests always pass `endpoint_url` explicitly via `client_kwargs`/`storage_options` (`tests/test_store/test_fsspec.py:109-116, 131`). The docs block must read clean — `zarr.create_array("s3://...")` with **no** `storage_options`. We must confirm a process-wide default endpoint works before writing the real fixture.
+
+**Files:**
+- Test (scratch): `tests/test_docs_s3_spike.py` (deleted at end of task)
+
+- [ ] **Step 1: Write a scratch test that starts moto, sets a default endpoint via env, and creates an array with a bare `s3://` URL**
+
+```python
+# tests/test_docs_s3_spike.py
+import os
+
+import pytest
+
+moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server")
+pytest.importorskip("s3fs")
+botocore = pytest.importorskip("botocore")
+requests = pytest.importorskip("requests")
+
+PORT = 5556  # different from test_fsspec.py's 5555 to avoid collisions
+ENDPOINT = f"http://127.0.0.1:{PORT}/"
+
+
+def test_bare_s3_url_with_default_endpoint() -> None:
+    """A create_array('s3://...') call with no storage_options should reach a
+    moto server when the endpoint is configured process-wide (env var)."""
+    server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=PORT)
+    server.start()
+    try:
+        os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo")
+        os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo")
+        # Candidate mechanism A: aiobotocore/botocore honors AWS_ENDPOINT_URL
+        os.environ["AWS_ENDPOINT_URL"] = ENDPOINT
+
+        # create the bucket via a synchronous botocore client
+        session = botocore.session.Session()
+        client = session.create_client("s3", endpoint_url=ENDPOINT, region_name="us-east-1")
+        client.create_bucket(Bucket="docs-bucket")
+        client.close()
+
+        import s3fs
+
+        import zarr
+
+        s3fs.S3FileSystem.clear_instance_cache()
+        z = zarr.create_array(
+            "s3://docs-bucket/foo", shape=(8, 8), chunks=(4, 4), dtype="f4"
+        )
+        z[:, :] = 1.0
+        assert z[0, 0] == 1.0
+    finally:
+        requests.post(f"{ENDPOINT}moto-api/reset")  # ENDPOINT already ends in "/"
+        server.stop()
+```
+
+- [ ] **Step 2: Run the spike**
+
+Run: `hatch run doctest:test tests/test_docs_s3_spike.py -v` (or `uv run pytest tests/test_docs_s3_spike.py -v` inside the doctest env)
+Expected: **One of two outcomes** — record which:
+- **PASS** → `AWS_ENDPOINT_URL` works as a process-wide default. Use env-var mechanism in Task 3.
+- **FAIL** (connection refused / NoCredentials / hits real AWS) → env var insufficient. Try candidate B below.
+
+- [ ] **Step 3: If Step 2 failed, try fsspec default config**
+
+Replace the `AWS_ENDPOINT_URL` line with:
+
+```python
+        import fsspec
+
+        fsspec.config.conf["s3"] = {"client_kwargs": {"endpoint_url": ENDPOINT}, "anon": False}
+```
+
+Run: `hatch run doctest:test tests/test_docs_s3_spike.py -v`
+Expected: PASS → use `fsspec.config.conf` mechanism in Task 3.
+
+- [ ] **Step 4: If both failed, record the fallback decision**
+
+If neither bare-URL mechanism works, the visible block will show the explicit `storage_options={"endpoint_url": ...}` form (the spec's fallback for risk 1). Record in the commit message which mechanism won (env var, fsspec config, or fallback), because Task 3 depends on it.
+
+- [ ] **Step 5: Delete the scratch test and commit the finding**
+
+```bash
+rm tests/test_docs_s3_spike.py  # the scratch file was never committed, so plain rm
+git commit --allow-empty -m "test: spike s3 default-endpoint mechanism for docs (no storage_options)
+
+Result: <env-var | fsspec-config | fallback-to-storage_options>"
+```
+
+---
+
+## Task 2: Register the `s3` pytest marker
+
+**Files:**
+- Modify: `pyproject.toml` (the `[tool.pytest.ini_options]` `markers` list, currently at lines 446-450)
+
+- [ ] **Step 1: Add the `s3` marker**
+
+In `pyproject.toml`, change the `markers` list from:
+
+```toml
+markers = [
+    "asyncio: mark test as asyncio test",
+    "gpu: mark a test as requiring CuPy and GPU",
+    "slow_hypothesis: slow hypothesis tests",
+]
+```
+
+to:
+
+```toml
+markers = [
+    "asyncio: mark test as asyncio test",
+    "gpu: mark a test as requiring CuPy and GPU",
+    "s3: mark a test as requiring a (mock) S3 backend via moto",
+    "slow_hypothesis: slow hypothesis tests",
+]
+```
+
+- [ ] **Step 2: Verify pytest accepts the marker (no unknown-marker warning)**
+
+Run: `hatch run doctest:test --markers | grep s3`
+Expected: shows `@pytest.mark.s3: mark a test as requiring a (mock) S3 backend via moto`
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add pyproject.toml
+git commit -m "test: register s3 pytest marker"
+```
+
+---
+
+## Task 3: Teach `test_docs.py` to parse `markers=` and bind the `s3` fixture
+
+This task adds (a) `markers=` parsing so a session carries the right pytest marker, and (b) the moto-backed `s3` fixture using the mechanism chosen in Task 1.
+
+**Files:**
+- Modify: `tests/test_docs.py`
+
+- [ ] **Step 1: Write a failing test that a markered session carries its marker**
+
+Add to `tests/test_docs.py`:
+
+```python
+def test_markers_attribute_is_parsed(tmp_path: Path) -> None:
+    """A block tagged markers="s3" must surface that marker on its parametrized case,
+    so pytest can gate/bind it (e.g. attach the moto fixture)."""
+    md = tmp_path / "ex.md"
+    md.write_text(
+        '```python exec="true" session="demo" markers="s3"\n'
+        "import zarr\n"
+        "```\n",
+        encoding="utf-8",
+    )
+    params = _session_params(md.parent)
+    assert len(params) == 1
+    marks = params[0].marks
+    assert any(m.name == "s3" for m in marks)
+```
+
+(This references a new helper `_session_params(root)` that returns a list of `pytest.param(...)`; we extract the grouping logic into it in Step 3.)
+
+- [ ] **Step 2: Run it to confirm it fails**
+
+Run: `hatch run doctest:test tests/test_docs.py::test_markers_attribute_is_parsed -v`
+Expected: FAIL with `NameError: name '_session_params' is not defined` (the test calls the helper as a bare name inside `tests/test_docs.py`, and it does not exist until Step 3).
+
+- [ ] **Step 3: Refactor grouping into `_session_params` that emits markers**
+
+Replace `group_examples_by_session()` (currently `tests/test_docs.py:39-64`) and the parametrize decorator (`tests/test_docs.py:72-75`) with a version that returns `pytest.param` objects carrying marks. Add near the top of the file:
+
+```python
+def _markers_for(settings: dict[str, str]) -> list[pytest.MarkDecorator]:
+    """Translate a block's markers="a b" attribute into pytest mark decorators."""
+    raw = settings.get("markers", "")
+    return [getattr(pytest.mark, name) for name in raw.split() if name]
+
+
+def _session_params(root: Path) -> list[pytest.param]:
+    """Group exec="true" examples by (file, session) and emit one pytest.param per
+    session, carrying the union of markers declared by that session's blocks."""
+    sessions: defaultdict[tuple[str, str], list[CodeExample]] = defaultdict(list)
+    marks_by_session: defaultdict[tuple[str, str], set[str]] = defaultdict(set)
+
+    for example in find_examples(str(root)):
+        settings = example.prefix_settings()
+        if settings.get("exec") != "true":
+            continue
+        session_name = settings.get("session", "_default")
+        key = (str(example.path), session_name)
+        sessions[key].append(example)
+        for mark in _markers_for(settings):
+            marks_by_session[key].add(mark.name)
+
+    params = []
+    for key in sorted(sessions.keys(), key=lambda x: (x[0], x[1])):
+        marks = tuple(getattr(pytest.mark, name) for name in sorted(marks_by_session[key]))
+        params.append(pytest.param(key, marks=marks, id=name_example(key[0], key[1])))
+    return params
+```
+
+Keep `name_example()` as-is. Add `CodeExample` to the existing pytest-examples import if not already imported (it is: `from pytest_examples import CodeExample, EvalExample, find_examples`).
+
+- [ ] **Step 4: Update the parametrized test to use `_session_params` and request the fixtures**
+
+Replace the decorator + signature of `test_documentation_examples` (`tests/test_docs.py:72-79`) with:
+
+```python
+@pytest.mark.parametrize("session_key", _session_params(DOCS_ROOT))
+def test_documentation_examples(
+    session_key: tuple[str, str],
+    eval_example: EvalExample,
+    request: pytest.FixtureRequest,
+) -> None:
+```
+
+Inside the body, before running examples, activate the `s3` fixture when the case is s3-marked:
+
+```python
+    if request.node.get_closest_marker("s3") is not None:
+        request.getfixturevalue("docs_s3_backend")
+```
+
+(Leave the rest of the body — the `find_examples` loop and `eval_example.run(...)` — unchanged.)
+
+- [ ] **Step 5: Add the `docs_s3_backend` fixture**
+
+Add to `tests/test_docs.py` (using the mechanism Task 1 selected — shown here for the `AWS_ENDPOINT_URL` variant; swap to `fsspec.config` or the `storage_options` fallback per Task 1's result):
+
+```python
+S3_PORT = 5556
+S3_ENDPOINT = f"http://127.0.0.1:{S3_PORT}/"
+S3_BUCKET = "example-bucket"
+
+
+@pytest.fixture
+def docs_s3_backend() -> Generator[None, None, None]:
+    """Stand up a moto mock-S3 server and configure a process-wide default endpoint
+    so docs blocks can use a bare s3:// URL with no storage_options."""
+    moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server")
+    s3fs = pytest.importorskip("s3fs")
+    botocore = pytest.importorskip("botocore")
+    requests = pytest.importorskip("requests")
+
+    server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=S3_PORT)
+    server.start()
+    prev_endpoint = os.environ.get("AWS_ENDPOINT_URL")
+    os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo")
+    os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo")
+    os.environ["AWS_ENDPOINT_URL"] = S3_ENDPOINT
+
+    session = botocore.session.Session()
+    client = session.create_client("s3", endpoint_url=S3_ENDPOINT, region_name="us-east-1")
+    client.create_bucket(Bucket=S3_BUCKET)
+    client.close()
+    s3fs.S3FileSystem.clear_instance_cache()
+    try:
+        yield
+    finally:
+        requests.post(f"{S3_ENDPOINT}moto-api/reset")  # S3_ENDPOINT already ends with "/"
+        if prev_endpoint is None:
+            os.environ.pop("AWS_ENDPOINT_URL", None)
+        else:
+            os.environ["AWS_ENDPOINT_URL"] = prev_endpoint
+        server.stop()
+```
+
+Add the required imports at the top of `tests/test_docs.py`:
+
+```python
+import os
+from collections.abc import Generator
+```
+
+- [ ] **Step 6: Run the marker-parsing test — it should now pass**
+
+Run: `hatch run doctest:test tests/test_docs.py::test_markers_attribute_is_parsed -v`
+Expected: PASS
+
+- [ ] **Step 7: Run the full docs test to confirm no regression in existing sessions**
+
+Run: `hatch run doctest:test -v`
+Expected: PASS for all existing `quickstart` etc. sessions (the S3 block isn't markered yet — that's Task 4).
+
+- [ ] **Step 8: Commit**
+
+```bash
+git add tests/test_docs.py
+git commit -m "test: parse markers= on docs blocks and add moto s3 fixture binding"
+```
+
+---
+
+## Task 4: Fix and enable the S3 example (#4016)
+
+**Files:**
+- Modify: `docs/quick-start.md:134-140`
+
+- [ ] **Step 1: Replace the bare, invalid S3 block**
+
+Replace lines 134-140 (the bare ```` ```python ```` block containing `mode="w"`) with:
+
+````markdown
+```python exec="true" session="s3demo" markers="s3" source="above"
+import zarr
+import numpy as np
+
+z = zarr.create_array(
+    "s3://example-bucket/foo", shape=(100, 100), chunks=(10, 10), dtype="f4"
+)
+z[:, :] = np.random.random((100, 100))
+```
+````
+
+Notes:
+- `mode="w"` removed (the #4016 bug; `create_array` has no `mode` parameter — see `src/zarr/api/synchronous.py:799`).
+- Unused `import s3fs` removed.
+- `import numpy as np` added — this is a fresh `s3demo` session, so `np` is not in scope from the `quickstart` session.
+- New session `s3demo` keeps the moto fixture scoped to just this block (the `quickstart` session must NOT become s3-marked).
+- The displayed URL stays `s3://example-bucket/foo`; the moto endpoint is supplied by the `docs_s3_backend` fixture (bucket name `example-bucket` matches `S3_BUCKET` in Task 3).
+- **If Task 1 chose the `storage_options` fallback:** add `storage_options={"endpoint_url": "..."}` to the visible call instead, and adjust the prose to explain it.
+
+- [ ] **Step 2: Run the S3 docs example against moto**
+
+Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[quick-start.md:s3demo]" -v`
+Expected: PASS (executes against moto; no real-cloud contact).
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add docs/quick-start.md
+git commit -m "docs: fix invalid s3 create_array example and run it against moto (#4016)"
+```
+
+---
+
+## Task 5: Enable the config-only blocks
+
+These are plain `zarr.config.set(...)` calls that run as-is. Each gets its own self-contained session, which gives every block a distinct test id. Note that zarr config is process-global and is not restored between sessions, so a mutation can persist into later examples; resetting config is out of scope, which is acceptable for these small config demos.
+
+**Files:**
+- Modify: `docs/user-guide/performance.md:207`, `docs/user-guide/performance.md:237`
+- Modify: `docs/user-guide/arrays.md:622`
+
+- [ ] **Step 1: Enable `performance.md:207` (concurrency config)**
+
+Change the fence from ```` ```python ```` to:
+
+````markdown
+```python exec="true" session="perf-concurrency"
+````
+
+(Body unchanged — `import zarr` + `zarr.config.set({'async.concurrency': 128})` + the commented env-var line, which is inert.)
+
+- [ ] **Step 2: Enable `performance.md:237` (max_workers config)**
+
+Change the fence to:
+
+````markdown
+```python exec="true" session="perf-workers"
+````
+
+- [ ] **Step 3: Enable `arrays.md:622` (rectilinear_chunks config)**
+
+Change the fence to:
+
+````markdown
+```python exec="true" session="arrays-rectilinear"
+````
+
+- [ ] **Step 4: Run the three sessions**
+
+Run:
+```bash
+hatch run doctest:test \
+  "tests/test_docs.py::test_documentation_examples[performance.md:perf-concurrency]" \
+  "tests/test_docs.py::test_documentation_examples[performance.md:perf-workers]" \
+  "tests/test_docs.py::test_documentation_examples[arrays.md:arrays-rectilinear]" -v
+```
+Expected: PASS (3 passed).
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add docs/user-guide/performance.md docs/user-guide/arrays.md
+git commit -m "docs: execute config-setting examples in performance.md and arrays.md"
+```
+
+---
+
+## Task 6: Make the CLI `zarr.open` block runnable
+
+`docs/user-guide/cli.md:48` opens `'path/to/input.zarr'` which doesn't exist. Rewrite it to create then open a real local array so it executes and still illustrates `zarr_format=3`.
+
+**Files:**
+- Modify: `docs/user-guide/cli.md:46-51`
+
+- [ ] **Step 1: Replace the block**
+
+Replace the bare block with:
+
+````markdown
+```python exec="true" session="cli-open" source="above"
+import zarr
+
+# create a small array to open (stands in for the migrated store)
+zarr.create_array("data/cli-demo.zarr", shape=(4, 4), chunks=(2, 2), dtype="i4")
+
+zarr_with_v3_metadata = zarr.open("data/cli-demo.zarr", zarr_format=3)
+```
+````
+
+(Keep the surrounding prose; the example now demonstrates `open(..., zarr_format=3)` on a real store. The illustrative `'path/to/input.zarr'` filename was the only reason it couldn't run.)
+
+- [ ] **Step 2: Run it**
+
+Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[cli.md:cli-open]" -v`
+Expected: PASS
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add docs/user-guide/cli.md
+git commit -m "docs: make cli zarr.open example runnable against a local store"
+```
+
+---
+
+## Task 7: Enable the GPU block (env-gated via `gpu` marker)
+
+**Files:**
+- Modify: `docs/user-guide/gpu.md:19-28`
+
+- [ ] **Step 1: Tag the GPU block**
+
+Change the fence from ```` ```python ```` to:
+
+````markdown
+```python exec="true" session="gpu-demo" markers="gpu" source="above"
+````
+
+(Body unchanged: `import cupy as cp`, `zarr.config.enable_gpu()`, `create_array("memory://gpu-demo", ...)`, etc.)
+
+- [ ] **Step 2: Confirm it is SKIPPED in the default doctest env (no GPU)**
+
+Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[gpu.md:gpu-demo]" -v`
+Expected: SKIPPED (the `gpu` marker is deselected without `-m gpu`), **not** an error, **not** absent.
+
+- [ ] **Step 3: Confirm it is COLLECTED for the gpu selection**
+
+Run: `hatch run doctest:test -m gpu --co -q | grep gpu-demo`
+Expected: the `gpu.md:gpu-demo` case is collected (it will actually execute only on real GPU hardware in the `gputest` env, which we can't run here).
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add docs/user-guide/gpu.md
+git commit -m "docs: execute gpu example under the gpu marker"
+```
+
+---
+
+## Task 8: Fix the `exec="on"` typo and opt out the genuinely-non-executable blocks
+
+**Files:**
+- Modify: `docs/contributing.md:15` (pseudocode) and `docs/contributing.md:231` (`exec="on"` typo)
+- Modify: `docs/user-guide/data_types.md:363` (REPL transcript)
+- Modify: `docs/user-guide/examples/custom_dtype.md:5` (`--8<--` include)
+- Modify: `docs/user-guide/v3_migration.md:42` (intentionally-wrong import)
+
+- [ ] **Step 1: Fix the `exec="on"` typo in `contributing.md:231`**
+
+Change the fence attribute `exec="on"` to `exec="true"`. Then run that block to confirm it actually executes cleanly:
+
+Run: `hatch run doctest:test -v -k contributing`
+Expected: the formerly-`exec="on"` block now runs. **If it fails** (the code was broken too, having never run), fix the code in the block minimally so it passes, or — if it's not meant to run — convert it to `exec="false" reason="..."`. Record which in the commit.
+
+- [ ] **Step 2: Opt out `contributing.md:15` (pseudocode)**
+
+Change ```` ```python ```` to:
+
+````markdown
+```python exec="false" reason="illustrative pseudocode with a '# etc.' placeholder, not runnable"
+````
+
+- [ ] **Step 3: Opt out `data_types.md:363` (REPL transcript)**
+
+Change ```` ```python ```` to:
+
+````markdown
+```python exec="false" reason="REPL output transcript, not executable source"
+````
+
+- [ ] **Step 4: Opt out `custom_dtype.md:5` (`--8<--` include)**
+
+Change ```` ```python ```` to:
+
+````markdown
+```python exec="false" reason="pymdownx snippet include directive, not python source"
+````
+
+- [ ] **Step 5: Opt out `v3_migration.md:42` (intentionally-wrong import)**
+
+Change ```` ```python ```` to:
+
+````markdown
+```python exec="false" reason="intentionally shows the old/incorrect import for contrast"
+````
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add docs/contributing.md docs/user-guide/data_types.md docs/user-guide/examples/custom_dtype.md docs/user-guide/v3_migration.md
+git commit -m "docs: fix exec=on typo and explicitly opt out non-runnable blocks"
+```
+
+---
+
+## Task 9: Handle the dask block in performance.md
+
+`docs/user-guide/performance.md:263` uses `dask.array` and opens `'data/large_array.zarr'` (nonexistent). Two viable dispositions — pick based on whether `dask` is in the doctest env.
+
+**Files:**
+- Modify: `docs/user-guide/performance.md:263-280`
+
+- [ ] **Step 1: Check whether dask is available in the doctest env**
+
+Run: `hatch run doctest:list-env | grep -i dask`
+Expected: either shows a `dask` line (available) or nothing (not available).
+
+- [ ] **Step 2a: If dask IS available — make it runnable**
+
+Replace the `'data/large_array.zarr'` open with a created array, keeping the dask demonstration:
+
+````markdown
+```python exec="true" session="perf-dask" source="above"
+import zarr
+import dask.array as da
+
+zarr.config.set({
+    'async.concurrency': 4,
+    'threading.max_workers': 4,
+})
+
+# create a small array to read with Dask
+zarr.create_array("data/perf-dask-demo.zarr", shape=(16, 16), chunks=(8, 8), dtype="f4")
+z = zarr.open_array("data/perf-dask-demo.zarr", mode="r")
+
+arr = da.from_array(z, chunks=z.chunks)
+result = arr.mean(axis=0).compute()
+```
+````
+
+Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[performance.md:perf-dask]" -v`
+Expected: PASS
+
+- [ ] **Step 2b: If dask is NOT available — opt out with a reason**
+
+Change ```` ```python ```` to:
+
+````markdown
+```python exec="false" reason="requires dask, which is not in the docs test environment"
+````
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add docs/user-guide/performance.md
+git commit -m "docs: make dask performance example runnable (or opt out if dask absent)"
+```
+
+---
+
+## Task 10: Add the guard test
+
+The guard asserts every python block in `docs/` is either `exec="true"` or `exec="false"` with a non-empty `reason`. Anything else (bare, `exec="on"`, missing reason) fails.
+
+**Files:**
+- Modify: `tests/test_docs.py`
+
+- [ ] **Step 1: Write the guard test**
+
+Add to `tests/test_docs.py`:
+
+```python
+def test_no_unvalidated_blocks() -> None:
+    """Every python code block in docs/ must declare its validation state:
+    either exec="true" (it is executed as a test) or exec="false" with a reason
+    (an explicit, documented opt-out). A bare or mistyped fence (e.g. exec="on")
+    fails here, so a block can never silently opt out of validation — the gap
+    that hid the invalid create_array(mode="w") example in #4016."""
+    offenders: list[str] = []
+    for example in find_examples(str(DOCS_ROOT)):
+        settings = example.prefix_settings()
+        exec_val = settings.get("exec")
+        loc = f"{Path(example.path).relative_to(DOCS_ROOT)}:{example.start_line}"
+        if exec_val == "true":
+            continue
+        if exec_val == "false" and settings.get("reason", "").strip():
+            continue
+        offenders.append(f"{loc} (exec={exec_val!r}, reason={settings.get('reason')!r})")
+
+    assert not offenders, (
+        "Docs python blocks must be exec=\"true\" or exec=\"false\" with a reason:\n"
+        + "\n".join(offenders)
+    )
+```
+
+(`find_examples` from pytest-examples only yields fenced code blocks for languages it recognizes as runnable, which includes python; confirm in Step 2 that the count matches the audit. If it also yields non-python fences, filter on the fence language taken from `example.prefix`. A path filter such as `if not str(example.path).endswith(".md"): continue` is unnecessary, since DOCS_ROOT is all markdown.)
+
+- [ ] **Step 2: Run the guard — it must PASS now that all blocks are remediated**
+
+Run: `hatch run doctest:test tests/test_docs.py::test_no_unvalidated_blocks -v`
+Expected: PASS (zero offenders). **If it lists offenders**, they are blocks missed by Tasks 4-9 — fix each (turn on or opt out) until the list is empty.
+
+- [ ] **Step 3: Negative check — confirm the guard actually catches a bare block**
+
+Temporarily add a bare block to any docs file:
+
+````markdown
+```python
+1 / 0
+```
+````
+
+Run: `hatch run doctest:test tests/test_docs.py::test_no_unvalidated_blocks -v`
+Expected: FAIL, listing the new bare block's location.
+
+Then remove the temporary block and re-run:
+Run: `hatch run doctest:test tests/test_docs.py::test_no_unvalidated_blocks -v`
+Expected: PASS
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add tests/test_docs.py
+git commit -m "test: guard that every docs python block is executed or opted out (#4017)"
+```
+
+---
+
+## Task 11: Full suite + news fragment
+
+**Files:**
+- Create: `changes/4016.bugfix.md`
+
+- [ ] **Step 1: Run the entire docs test suite**
+
+Run: `hatch run doctest:test -v`
+Expected: PASS — all `exec="true"` sessions run (S3 against moto; config/cli/dask as applicable), the GPU session reports SKIPPED, and the guard passes.
+
+- [ ] **Step 2: Add the towncrier news fragment**
+
+Create `changes/4016.bugfix.md`:
+
+```markdown
+Fixed an invalid ``zarr.create_array`` example in the quick-start docs (it passed an unsupported ``mode`` argument) and made the cloud-storage example execute against a mock S3 backend in CI. Added a test ensuring every python code block in the docs is either executed or explicitly opted out with a documented reason.
+```
+
+- [ ] **Step 3: Run the full prek/lint pass**
+
+Run: `prek run --all-files`
+Expected: PASS (ruff, mypy, towncrier-check, etc. all green).
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add changes/4016.bugfix.md
+git commit -m "docs: add news fragment for docs-block validation (#4016, #4017)"
+```
+
+---
+
+## Self-review notes (resolved during planning)
+
+- **Spec coverage:** Part A (remediate 12 blocks) → Tasks 4-9; Part B (guard) → Task 10. Marker-bound execution (s3 + gpu) → Tasks 2, 3, 4, 7. Spike #1 → Task 1. `pyproject.toml` s3 marker → Task 2. All three spec spikes are addressed: #1 in Task 1; #2 (markdown-exec tolerance of `markers=`) is verified by a `hatch run docs:build` check (**add a build check**; see the "Spike #2 verification" bullet below); #3 (moto teardown) is handled by the fixture's `finally` block in Task 3 Step 5.
+- **Spike #2 verification:** `markers=` and `reason=`/`exec="false"` are unknown attributes to markdown-exec; it ignores unrecognized prefix settings and only acts on `exec="true"`. Confirm by running `hatch run docs:build` once after Task 11 and checking it succeeds and that the gpu/s3 blocks render as static source. If the build errors on unknown attributes, fall back to the per-session marker map (spec fallback for spike #2).
+- **The 12 blocks, accounted for:** quick-start S3 (T4), perf×2 config (T5), arrays config (T5), cli (T6), gpu (T7), contributing exec=on typo + pseudocode (T8), data_types transcript (T8), custom_dtype include (T8), v3_migration wrong-import (T8), perf dask (T9). = 12. ✓
+- **Naming consistency:** `_session_params`, `_markers_for`, `docs_s3_backend`, `test_no_unvalidated_blocks`, `S3_BUCKET="example-bucket"` (matches the URL in the T4 block) used consistently across tasks.

From 460385d207f4a1cdc421b30836c3fafd5a8bb7cf Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 10:54:57 +0200
Subject: [PATCH 05/25] test: spike s3 default-endpoint mechanism for docs (no
 storage_options)

Result: A-env-var

A bare zarr.create_array("s3://bucket/key", ...) with NO storage_options
reaches a moto server when AWS_ENDPOINT_URL is set process-wide. s3fs/
aiobotocore honor the env var, so the visible docs block can stay clean.
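
Illustrative sketch only, not part of this commit: the process-wide
override can be scoped so the previous value is restored afterwards. The
helper name default_s3_endpoint is hypothetical; the port matches S3_PORT
from the plan. It assumes, per the spike result above, that the S3 client
reads AWS_ENDPOINT_URL when it is created.

```python
import os
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def default_s3_endpoint(url: str) -> Iterator[None]:
    """Temporarily set AWS_ENDPOINT_URL to `url`, restoring the old value on exit."""
    previous = os.environ.get("AWS_ENDPOINT_URL")
    os.environ["AWS_ENDPOINT_URL"] = url
    try:
        yield
    finally:
        # Restore exactly what was there before, including "unset".
        if previous is None:
            os.environ.pop("AWS_ENDPOINT_URL", None)
        else:
            os.environ["AWS_ENDPOINT_URL"] = previous


# Hypothetical local mock endpoint. Inside the block, code that honors the
# variable would resolve a bare s3:// URL against it.
with default_s3_endpoint("http://127.0.0.1:5556"):
    inside = os.environ["AWS_ENDPOINT_URL"]
```

The sketch only manages the environment variable; starting the moto
server and creating the bucket remain the fixture's job.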

Caveat: moto[s3,server] currently lives only in the 'remote-tests'
dependency group, not 'test'; the doctest hatch env (dependency-groups=
['test']) does NOT have moto installed. The downstream real-fixture task
must add moto[s3,server] (and requests) to the doctest env extras.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

From 2032fee17d9180bc1f2478690f0a777910b852cf Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 10:58:02 +0200
Subject: [PATCH 06/25] test: register s3 pytest marker

---
 pyproject.toml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pyproject.toml b/pyproject.toml
index e342e8305c..65f3336073 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -446,6 +446,7 @@ filterwarnings = [
 markers = [
     "asyncio: mark test as asyncio test",
     "gpu: mark a test as requiring CuPy and GPU",
+    "s3: mark a test as requiring a (mock) S3 backend via moto",
     "slow_hypothesis: slow hypothesis tests",
 ]
 

From 42a8ebd118736615036afccf76ddfdec582b8ca9 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 11:02:54 +0200
Subject: [PATCH 07/25] test: parse markers= on docs blocks and add moto s3
 fixture binding

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 pyproject.toml     |   3 +-
 tests/test_docs.py | 120 +++++++++++++++++++++++++++++++++++----------
 2 files changed, 94 insertions(+), 29 deletions(-)

diff --git a/pyproject.toml b/pyproject.toml
index 65f3336073..bc95bfd61b 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -277,9 +277,8 @@ readthedocs = "rm -rf $READTHEDOCS_OUTPUT/html && cp -r site $READTHEDOCS_OUTPUT
 [tool.hatch.envs.doctest]
 description = "Test environment for validating executable code blocks in documentation"
 features = ['remote']
-dependency-groups = ['test']
+dependency-groups = ['remote-tests']
 extra-dependencies = [
-    "s3fs>=2023.10.0",
     "pytest-examples",
 ]
 
diff --git a/tests/test_docs.py b/tests/test_docs.py
index d467e478e8..e2fac43a32 100644
--- a/tests/test_docs.py
+++ b/tests/test_docs.py
@@ -7,14 +7,19 @@
 
 from __future__ import annotations
 
+import os
 from collections import defaultdict
 from pathlib import Path
+from typing import TYPE_CHECKING, Any
 
 import pytest
 
 pytest.importorskip("pytest_examples")
 from pytest_examples import CodeExample, EvalExample, find_examples
 
+if TYPE_CHECKING:
+    from collections.abc import Generator
+
 # Find all markdown files with executable code blocks
 DOCS_ROOT = Path(__file__).parent.parent / "docs"
 SOURCES_ROOT = Path(__file__).parent.parent / "src" / "zarr"
@@ -36,46 +41,104 @@ def find_markdown_files_with_exec() -> list[Path]:
     return sorted(markdown_files)
 
 
-def group_examples_by_session() -> list[tuple[str, str]]:
-    """
-    Group examples by their session and file, maintaining order.
+def name_example(path: str, session: str) -> str:
+    """Generate a readable name for a test case from file path and session."""
+    file = Path(path)
+    try:
+        file = file.relative_to(DOCS_ROOT)
+    except ValueError:
+        # Path is outside DOCS_ROOT (e.g. a tmp_path fixture in unit tests); use the
+        # bare file name rather than an absolute path for a stable, readable id.
+        file = Path(file.name)
+    return f"{file}:{session}"
 
-    Returns a list of session_key tuples where session_key is
-    (file_path, session_name).
-    """
-    all_examples = list(find_examples(DOCS_ROOT))
 
-    # Group by file and session
-    sessions = defaultdict(list)
+def _markers_for(settings: dict[str, str]) -> list[pytest.MarkDecorator]:
+    """Translate a block's markers="a b" attribute into pytest mark decorators."""
+    raw = settings.get("markers", "")
+    return [getattr(pytest.mark, name) for name in raw.split() if name]
 
-    for example in all_examples:
+
+def _session_params(root: Path) -> list[Any]:
+    """Group exec="true" examples by (file, session) and emit one pytest.param per
+    session, carrying the union of markers declared by that session's blocks."""
+    sessions: defaultdict[tuple[str, str], list[CodeExample]] = defaultdict(list)
+    marks_by_session: defaultdict[tuple[str, str], set[str]] = defaultdict(set)
+
+    for example in find_examples(str(root)):
         settings = example.prefix_settings()
         if settings.get("exec") != "true":
             continue
-
-        # Use file path and session name as key
-        file_path = example.path
         session_name = settings.get("session", "_default")
-        session_key = (str(file_path), session_name)
-
-        sessions[session_key].append(example)
-
-    # Return sorted list of session keys for consistent test ordering
-    return sorted(sessions.keys(), key=lambda x: (x[0], x[1]))
-
-
-def name_example(path: str, session: str) -> str:
-    """Generate a readable name for a test case from file path and session."""
-    return f"{Path(path).relative_to(DOCS_ROOT)}:{session}"
+        key = (str(example.path), session_name)
+        sessions[key].append(example)
+        for mark in _markers_for(settings):
+            marks_by_session[key].add(mark.name)
+
+    params = []
+    for key in sorted(sessions.keys(), key=lambda x: (x[0], x[1])):
+        marks = tuple(getattr(pytest.mark, name) for name in sorted(marks_by_session[key]))
+        params.append(pytest.param(key, marks=marks, id=name_example(key[0], key[1])))
+    return params
+
+
+S3_PORT = 5556
+S3_ENDPOINT = f"http://127.0.0.1:{S3_PORT}/"
+S3_BUCKET = "example-bucket"
+
+
+@pytest.fixture
+def docs_s3_backend() -> Generator[None, None, None]:
+    """Stand up a moto mock-S3 server and set a process-wide default endpoint so docs
+    blocks can use a bare s3:// URL with no storage_options (see spike in plan Task 1)."""
+    moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server")
+    s3fs = pytest.importorskip("s3fs")
+    botocore = pytest.importorskip("botocore")
+    requests = pytest.importorskip("requests")
+
+    prev_endpoint = os.environ.get("AWS_ENDPOINT_URL")
+    server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=S3_PORT)
+    server.start()
+    try:
+        os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo")
+        os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo")
+        os.environ["AWS_ENDPOINT_URL"] = S3_ENDPOINT
+
+        session = botocore.session.Session()
+        client = session.create_client("s3", endpoint_url=S3_ENDPOINT, region_name="us-east-1")
+        client.create_bucket(Bucket=S3_BUCKET)
+        client.close()
+        s3fs.S3FileSystem.clear_instance_cache()
+        yield
+    finally:
+        requests.post(f"{S3_ENDPOINT}moto-api/reset")  # S3_ENDPOINT already ends with "/"
+        if prev_endpoint is None:
+            os.environ.pop("AWS_ENDPOINT_URL", None)
+        else:
+            os.environ["AWS_ENDPOINT_URL"] = prev_endpoint
+        server.stop()
+
+
+def test_markers_attribute_is_parsed(tmp_path: Path) -> None:
+    """A block tagged markers="s3" must surface that marker on its parametrized case,
+    so pytest can gate/bind it (e.g. attach the moto fixture)."""
+    md = tmp_path / "ex.md"
+    md.write_text(
+        '```python exec="true" session="demo" markers="s3"\nimport zarr\n```\n',
+        encoding="utf-8",
+    )
+    params = _session_params(md.parent)
+    assert len(params) == 1
+    marks = params[0].marks
+    assert any(m.name == "s3" for m in marks)
 
 
 # Get all example sessions
-@pytest.mark.parametrize(
-    "session_key", group_examples_by_session(), ids=lambda v: name_example(v[0], v[1])
-)
+@pytest.mark.parametrize("session_key", _session_params(DOCS_ROOT))
 def test_documentation_examples(
     session_key: tuple[str, str],
     eval_example: EvalExample,
+    request: pytest.FixtureRequest,
 ) -> None:
     """
     Test that all exec="true" code examples in documentation execute successfully.
@@ -90,6 +153,9 @@ def test_documentation_examples(
     - Execute them in order within the same context
     - Verify no exceptions are raised
     """
+    if request.node.get_closest_marker("s3") is not None:
+        request.getfixturevalue("docs_s3_backend")
+
     file_path, session_name = session_key
 
     # Get examples for this session

From 7489bea33e2648e3dad43ccabefc6b662eb5e65c Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 11:09:04 +0200
Subject: [PATCH 08/25] docs: fix invalid s3 create_array example and run it
 against moto (#4016)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/quick-start.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/docs/quick-start.md b/docs/quick-start.md
index 27dc8e6045..efde321af6 100644
--- a/docs/quick-start.md
+++ b/docs/quick-start.md
@@ -131,11 +131,13 @@ Zarr integrates seamlessly with cloud object storage such as Amazon S3 and Googl
 using external libraries like [s3fs](https://s3fs.readthedocs.io) or
 [gcsfs](https://gcsfs.readthedocs.io):
 
-```python
-
-import s3fs
+```python exec="true" session="s3demo" markers="s3" source="above"
+import zarr
+import numpy as np
 
-z = zarr.create_array("s3://example-bucket/foo", mode="w", shape=(100, 100), chunks=(10, 10), dtype="f4")
+z = zarr.create_array(
+    "s3://example-bucket/foo", shape=(100, 100), chunks=(10, 10), dtype="f4"
+)
 z[:, :] = np.random.random((100, 100))
 ```
 

From 198eb802396972628fb0585d31e0ecf616d6854a Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 11:11:27 +0200
Subject: [PATCH 09/25] docs: execute config-setting examples in performance.md
 and arrays.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/user-guide/arrays.md      | 2 +-
 docs/user-guide/performance.md | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/user-guide/arrays.md b/docs/user-guide/arrays.md
index 14122003c0..dd1788b7d2 100644
--- a/docs/user-guide/arrays.md
+++ b/docs/user-guide/arrays.md
@@ -619,7 +619,7 @@ Without the `shards` argument, there would be 10,000 chunks stored as individual
     Because the feature is still stabilizing, it is disabled by default and
     must be explicitly enabled:
 
-    ```python
+    ```python exec="true" session="arrays-rectilinear"
     import zarr
     zarr.config.set({"array.rectilinear_chunks": True})
     ```
diff --git a/docs/user-guide/performance.md b/docs/user-guide/performance.md
index fa98e9466e..f22ec00d02 100644
--- a/docs/user-guide/performance.md
+++ b/docs/user-guide/performance.md
@@ -204,7 +204,7 @@ determines the maximum number of concurrent I/O operations.
 The default value is 10, which is a conservative value. You may get improved performance by tuning
 the concurrency limit. You can adjust this value based on your specific needs:
 
-```python
+```python exec="true" session="perf-concurrency"
 import zarr
 
 # Set concurrency for the current session
@@ -234,7 +234,7 @@ By default it is `None`, which lets Python choose the pool size (typically
 
 You can set it explicitly when you want more predictable resource usage:
 
-```python
+```python exec="true" session="perf-workers"
 import zarr
 
 zarr.config.set({'threading.max_workers': 8})

From c8836c35f17cc2f21ba039f78b8a96e7499cc254 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 11:13:29 +0200
Subject: [PATCH 10/25] docs: make cli zarr.open example runnable against a
 local store

---
 docs/user-guide/cli.md | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/docs/user-guide/cli.md b/docs/user-guide/cli.md
index fc812c1a20..743392f679 100644
--- a/docs/user-guide/cli.md
+++ b/docs/user-guide/cli.md
@@ -45,9 +45,13 @@ This will write new `zarr.json` files to `input.zarr`, leaving the existing v2 m
 
 To open the array/group using the new metadata use:
 
-```python
+```python exec="true" session="cli-open" source="above"
 import zarr
-zarr_with_v3_metadata = zarr.open('path/to/input.zarr', zarr_format=3)
+
+# create a small array to open (stands in for the migrated store)
+zarr.create_array("data/cli-demo.zarr", shape=(4, 4), chunks=(2, 2), dtype="i4")
+
+zarr_with_v3_metadata = zarr.open("data/cli-demo.zarr", zarr_format=3)
 ```
 
 Once you are happy with the conversion, you can run the following to remove the old v2 metadata:

From 010d99ace52b5c54fbd98d98327c65245828e2c2 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 11:17:48 +0200
Subject: [PATCH 11/25] docs: execute gpu example under the gpu marker

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/user-guide/gpu.md | 2 +-
 tests/test_docs.py     | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/docs/user-guide/gpu.md b/docs/user-guide/gpu.md
index 6189f39d3d..1dc3ef296b 100644
--- a/docs/user-guide/gpu.md
+++ b/docs/user-guide/gpu.md
@@ -16,7 +16,7 @@ Zarr can use GPUs to accelerate your workload by running `zarr.Config.enable_gpu
 [`zarr.config`][] configures Zarr to use GPU memory for the data
 buffers used internally by Zarr via `enable_gpu()`.
 
-```python
+```python exec="true" session="gpu-demo" markers="gpu" source="above"
 import zarr
 import cupy as cp
 zarr.config.enable_gpu()
diff --git a/tests/test_docs.py b/tests/test_docs.py
index e2fac43a32..383456a30f 100644
--- a/tests/test_docs.py
+++ b/tests/test_docs.py
@@ -153,6 +153,9 @@ def test_documentation_examples(
     - Execute them in order within the same context
     - Verify no exceptions are raised
     """
+    if request.node.get_closest_marker("gpu") is not None:
+        pytest.importorskip("cupy")
+
     if request.node.get_closest_marker("s3") is not None:
         request.getfixturevalue("docs_s3_backend")
 

From 79197c4c39a870146f5e54aad2592beed8ed8992 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 11:23:47 +0200
Subject: [PATCH 12/25] docs: fix exec=on typo and explicitly opt out
 non-runnable blocks

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/contributing.md                     | 6 +++---
 docs/user-guide/data_types.md            | 2 +-
 docs/user-guide/examples/custom_dtype.md | 2 +-
 docs/user-guide/v3_migration.md          | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/contributing.md b/docs/contributing.md
index b9c7aa1aa2..a37768b815 100644
--- a/docs/contributing.md
+++ b/docs/contributing.md
@@ -12,7 +12,7 @@ If you find a bug, please raise a [GitHub issue](https://github.com/zarr-develop
 
 1. A minimal, self-contained snippet of Python code reproducing the problem. You can format the code nicely using markdown, e.g.:
 
-```python
+```python exec="false" reason="illustrative pseudocode with a '# etc.' placeholder, not runnable"
 import zarr
 g = zarr.group()
 # etc.
@@ -225,10 +225,10 @@ hatch --env docs run serve
 
 #### Adding executable code blocks in the documentation
 
-Zarr uses [Markdown Exec](https://pawamoy.github.io/markdown-exec/usage/) to execute code blocks in Markdown files. Add `exec="on"` to a code block header for it to be executed when the docs are built. For example:
+Zarr uses [Markdown Exec](https://pawamoy.github.io/markdown-exec/usage/) to execute code blocks in Markdown files. Add `exec="true"` to a code block header for it to be executed when the docs are built. For example:
 
 ````md
-```python exec="on"
+```python exec="true"
 print("Hello world")
 ```
 ````
diff --git a/docs/user-guide/data_types.md b/docs/user-guide/data_types.md
index 3e10845979..6f6bb05033 100644
--- a/docs/user-guide/data_types.md
+++ b/docs/user-guide/data_types.md
@@ -360,7 +360,7 @@ print(type(a.dtype))
 
 But if we inspect the metadata for the array, we can see the Zarr data type object:
 
-```python
+```python exec="false" reason="REPL output transcript, not executable source"
 type(a.metadata.data_type)
 <class 'zarr.core.dtype.npy.int.Int64'>
 ```
diff --git a/docs/user-guide/examples/custom_dtype.md b/docs/user-guide/examples/custom_dtype.md
index d6736e25dd..391407b822 100644
--- a/docs/user-guide/examples/custom_dtype.md
+++ b/docs/user-guide/examples/custom_dtype.md
@@ -2,6 +2,6 @@
 
 ## Source Code
 
-```python
+```python exec="false" reason="pymdownx snippet include directive, not python source"
 --8<-- "examples/custom_dtype/custom_dtype.py"
 ```
diff --git a/docs/user-guide/v3_migration.md b/docs/user-guide/v3_migration.md
index 21386c1522..1680547d93 100644
--- a/docs/user-guide/v3_migration.md
+++ b/docs/user-guide/v3_migration.md
@@ -39,7 +39,7 @@ the following actions in order:
    - `numcodecs.*` will no longer be available in `zarr.*`. To migrate, import codecs
      directly from `numcodecs`:
 
-     ```python
+     ```python exec="false" reason="intentionally shows the old/incorrect import for contrast"
      from numcodecs import Blosc
      # instead of:
      # from zarr import Blosc

From 9165bd5896a1a79f90ff49e9e00f52c2f1764ead Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 11:28:31 +0200
Subject: [PATCH 13/25] docs: explicitly opt the dask performance example out of
 execution (dask absent from docs test env)

---
 docs/user-guide/performance.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/user-guide/performance.md b/docs/user-guide/performance.md
index f22ec00d02..3357913557 100644
--- a/docs/user-guide/performance.md
+++ b/docs/user-guide/performance.md
@@ -260,7 +260,7 @@ For example, if you're running Dask with 10 threads and Zarr's default concurren
 
 **Recommendation**: When using Dask with many threads, configure Zarr's concurrency settings:
 
-```python
+```python exec="false" reason="requires dask, which is not in the docs test environment"
 import zarr
 import dask.array as da
 

From 19f0238dbcdb538521f0150e72cf782f16c8d00a Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 11:28:52 +0200
Subject: [PATCH 14/25] docs: record plan corrections from execution (spike
 result, gpu marker mechanism)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../plans/2026-05-29-docs-block-validation.md | 31 +++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/docs/superpowers/plans/2026-05-29-docs-block-validation.md b/docs/superpowers/plans/2026-05-29-docs-block-validation.md
index 1c34652012..63ab0b696d 100644
--- a/docs/superpowers/plans/2026-05-29-docs-block-validation.md
+++ b/docs/superpowers/plans/2026-05-29-docs-block-validation.md
@@ -129,6 +129,20 @@ git commit -m "test: spike s3 default-endpoint mechanism for docs (no storage_op
 Result: <env-var | fsspec-config | fallback-to-storage_options>"
 ```
 
+### RESULT (completed 2026-05-29, commit 460385d)
+
+- **Mechanism A won:** `os.environ["AWS_ENDPOINT_URL"] = ENDPOINT` (+ dummy
+  `AWS_SECRET_ACCESS_KEY`/`AWS_ACCESS_KEY_ID`) makes a bare
+  `create_array("s3://...")` reach moto with no `storage_options`. `clear_instance_cache()`
+  alone sufficed; the `set_session()`/`skip_instance_cache` dance from `test_fsspec.py`
+  was not needed for a single fixture. No event-loop or teardown warnings observed.
+- **PLAN CORRECTION (important):** the `doctest` hatch env does **NOT** install moto.
+  `moto[s3,server]` is only in the `remote-tests` dependency group; the `doctest` env
+  (`pyproject.toml` ~line 277-284) has only `s3fs` + `pytest-examples` as extras. **Task 3
+  MUST add `moto[s3,server]` and `requests` to `[tool.hatch.envs.doctest] extra-dependencies`**,
+  or any moto-backed doctest will silently `importorskip`-skip — defeating the purpose.
+  Task 4's verification must assert the S3 case **runs**, not skips.
+
 ---
 
 ## Task 2: Register the `s3` pytest marker
@@ -484,10 +498,23 @@ Change the fence from ```` ```python ```` to:
 
 (Body unchanged: `import cupy as cp`, `zarr.config.enable_gpu()`, `create_array("memory://gpu-demo", ...)`, etc.)
 
+> **PLAN CORRECTION (found during execution, commit 010d99a):** a registered marker
+> does NOT auto-skip a test under plain `pytest` — markers only *filter* when you pass
+> `-m`. Without a guard, the gpu block runs and FAILS with `ModuleNotFoundError: cupy`
+> (cupy is darwin-excluded). The repo's real convention is `pytest.importorskip("cupy")`
+> in the test body (cf. `tests/conftest.py:183`). So Task 7 also adds to
+> `test_documentation_examples` (mirroring the `s3` binding):
+> ```python
+>     if request.node.get_closest_marker("gpu") is not None:
+>         pytest.importorskip("cupy")
+> ```
+> This converts the missing-cupy hard error into a proper SKIP in the default env, while
+> `-m gpu` in the `gputest` env still collects+runs it on real hardware.
+
 - [ ] **Step 2: Confirm it is SKIPPED in the default doctest env (no GPU)**
 
-Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[gpu.md:gpu-demo]" -v`
-Expected: SKIPPED (the `gpu` marker is deselected without `-m gpu`), **not** an error, **not** absent.
+Run: `hatch -e doctest run pytest "tests/test_docs.py::test_documentation_examples[user-guide/gpu.md:gpu-demo]" -v`
+Expected: SKIPPED (via `importorskip("cupy")`), **not** an error, **not** absent.
 
 - [ ] **Step 3: Confirm it is COLLECTED for the gpu selection**
 

From f90c2b012ef5780d30d1a40f58afb76db8f41432 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 11:32:09 +0200
Subject: [PATCH 15/25] test: guard that every docs python block is executed or
 opted out (#4017)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 tests/test_docs.py | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/tests/test_docs.py b/tests/test_docs.py
index 383456a30f..21d9b3f702 100644
--- a/tests/test_docs.py
+++ b/tests/test_docs.py
@@ -133,6 +133,35 @@ def test_markers_attribute_is_parsed(tmp_path: Path) -> None:
     assert any(m.name == "s3" for m in marks)
 
 
+def test_no_unvalidated_blocks() -> None:
+    """Every python code block in docs/ must declare its validation state:
+    either exec="true" (it is executed as a test) or exec="false" with a reason
+    (an explicit, documented opt-out). A bare or mistyped fence (e.g. exec="on")
+    fails here, so a block can never silently opt out of validation -- the gap
+    that hid the invalid create_array(mode="w") example in #4016."""
+    offenders: list[str] = []
+    for example in find_examples(str(DOCS_ROOT)):
+        rel = Path(example.path).relative_to(DOCS_ROOT)
+        # docs/superpowers/ holds design-doc caches (plans/specs), not published
+        # documentation -- it is not in the mkdocs nav -- so its illustrative
+        # fences are not subject to the execution guard.
+        if rel.parts and rel.parts[0] == "superpowers":
+            continue
+        settings = example.prefix_settings()
+        exec_val = settings.get("exec")
+        loc = f"{rel}:{example.start_line}"
+        if exec_val == "true":
+            continue
+        if exec_val == "false" and settings.get("reason", "").strip():
+            continue
+        offenders.append(f"{loc} (exec={exec_val!r}, reason={settings.get('reason')!r})")
+
+    assert not offenders, (
+        'Docs python blocks must be exec="true" or exec="false" with a reason:\n'
+        + "\n".join(offenders)
+    )
+
+
 # Get all example sessions
 @pytest.mark.parametrize("session_key", _session_params(DOCS_ROOT))
 def test_documentation_examples(

From cfa566542a95f181540d9ea250275db865bad037 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 12:13:39 +0200
Subject: [PATCH 16/25] docs: separate `test` flag from `exec` so infra-bound
 examples don't break the build

markdown-exec's `exec="true"` means "run at build to render output"; build runners
have no GPU/cupy and no moto server, so tagging the GPU/S3 examples exec="true" made
`mkdocs build --strict` abort. Introduce a separate `test="true"` flag that our
tests/test_docs.py harness keys on (markdown-exec ignores it): a block is validated if
exec="true" OR test="true". The GPU and S3 examples become test="true" (+markers) and
are no longer run at build.

Also: a test="true"-only python fence placed before an exec="true" block of the same
page disrupts markdown-exec's build execution of the later block (the quickstart
ZipStore example failed with FileNotFoundError). Move the S3 example to the end of
quick-start.md so no shared-session exec block follows it; document the constraint in
the guard docstring and the design spec.

Verified: full docs test suite green (57 passed, 2 skipped), `mkdocs build --strict`
exits 0, prek (ruff/mypy/...) clean.
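
For reference, the flag rule described above can be sketched as a small
classifier. This is an illustration only: `classify_block` and its return labels
are hypothetical names, not part of tests/test_docs.py. The settings dict stands
in for the fence attributes the harness reads.

```python
def classify_block(settings: dict[str, str]) -> str:
    """Classify one fence's attributes (hypothetical helper, for illustration)."""
    # exec="true": markdown-exec runs the block at build time to render output.
    run_at_build = settings.get("exec") == "true"
    # The harness collects a block if exec="true" OR test="true".
    tested = run_at_build or settings.get("test") == "true"
    if tested:
        return "build+test" if run_at_build else "test-only"
    # Explicit opt-out: exec="false" plus a non-empty reason.
    if settings.get("exec") == "false" and settings.get("reason", "").strip():
        return "opted-out"
    # Bare fences and typos such as exec="on" land here and fail the guard.
    return "illegal"


print(classify_block({"exec": "true", "session": "quickstart"}))  # build+test
print(classify_block({"test": "true", "markers": "gpu"}))  # test-only
print(classify_block({"exec": "false", "reason": "REPL transcript"}))  # opted-out
print(classify_block({"exec": "on"}))  # illegal
```

Only the "test-only" case carries the placement constraint noted above, since it
is the one markdown-exec skips at build.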

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/quick-start.md                           | 28 +++---
 ...2026-05-29-docs-block-validation-design.md | 71 ++++++++++++----
 docs/user-guide/gpu.md                        |  2 +-
 tests/test_docs.py                            | 85 +++++++++++--------
 4 files changed, 119 insertions(+), 67 deletions(-)

diff --git a/docs/quick-start.md b/docs/quick-start.md
index efde321af6..0bad4f2e34 100644
--- a/docs/quick-start.md
+++ b/docs/quick-start.md
@@ -127,20 +127,6 @@ be done in a separate step.
 Zarr supports persistent storage to disk or cloud-compatible backends. While examples above
 utilized a [`zarr.storage.LocalStore`][], a number of other storage options are available.
 
-Zarr integrates seamlessly with cloud object storage such as Amazon S3 and Google Cloud Storage
-using external libraries like [s3fs](https://s3fs.readthedocs.io) or
-[gcsfs](https://gcsfs.readthedocs.io):
-
-```python exec="true" session="s3demo" markers="s3" source="above"
-import zarr
-import numpy as np
-
-z = zarr.create_array(
-    "s3://example-bucket/foo", shape=(100, 100), chunks=(10, 10), dtype="f4"
-)
-z[:, :] = np.random.random((100, 100))
-```
-
 A single-file store can also be created using the [`zarr.storage.ZipStore`][]:
 
 ```python exec="true" session="quickstart" source="above"
@@ -175,4 +161,18 @@ z = zarr.open_array(store, mode='r')
 print(z[:])
 ```
 
+Zarr also integrates seamlessly with cloud object storage such as Amazon S3 and Google
+Cloud Storage using external libraries like [s3fs](https://s3fs.readthedocs.io) or
+[gcsfs](https://gcsfs.readthedocs.io):
+
+```python test="true" session="s3demo" markers="s3" source="above"
+import zarr
+import numpy as np
+
+z = zarr.create_array(
+    "s3://example-bucket/foo", shape=(100, 100), chunks=(10, 10), dtype="f4"
+)
+z[:, :] = np.random.random((100, 100))
+```
+
 Read more about Zarr's storage options in the [User Guide](user-guide/index.md).
diff --git a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
index 4e8232846f..643c454741 100644
--- a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
+++ b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
@@ -100,13 +100,54 @@ marker-bound execution look hard:
 - **`tests/test_docs.py` at test time** — the validation. This is pytest, and this is
   where markers live, where infra fixtures bind, and where env-gating happens.
 
+## Two flags: `exec` (render output) vs `test` (validate)
+
+A code block can be *run* for two unrelated reasons, and conflating them breaks the
+build. They are separate fence attributes:
+
+- **`exec="true"`** — markdown-exec executes the block **at docs-build time to render its
+  output** into the published page. This is markdown-exec's own attribute (it hard-codes
+  the name `exec`, see `markdown_exec/_internal/main.py`), so we cannot rename it. Read it
+  as *"execute to render output."*
+- **`test="true"`** — **our** `tests/test_docs.py` harness executes the block **as a
+  validation test**. markdown-exec does not recognize `test=` and ignores it.
+
+Why two: a block that needs special infra to run (GPU/cupy, or S3) must be **validated in
+tests** but must **not run at build** — build runners have no GPU and no moto server, so
+an `exec="true"` GPU block makes `mkdocs build --strict` abort (`ModuleNotFoundError:
+cupy`). Separating the flags lets such a block be `test="true"` (tested) without
+`exec="true"` (so it renders as static source at build, never executed there).
+
+**Harness rule:** a block is collected as a test if `exec="true"` **OR** `test="true"`.
+So existing `exec="true"` example blocks stay tested as before (backward-compatible), and
+test-only blocks add `test="true"` without `exec`.
+
+**The combinations:**
+
+| Block | `exec` | `test` | Effect |
+|---|---|---|---|
+| Tutorial examples (quickstart, config, …) | `true` | — | Run at build (render output); also tested. |
+| GPU / S3 examples | — | `true` | Tested (under markers); rendered static at build. |
+| Non-runnable (transcript, include, wrong-import) | `false`+`reason` | — | Neither; explicit reasoned opt-out. |
+
+**Placement constraint (markdown-exec quirk).** markdown-exec's SuperFences validator
+*rejects* a `python` fence that lacks `exec="true"` (returns `False`, so it is not run at
+build). A rejected fence positioned **before** an `exec="true"` block of the same page
+disrupts markdown-exec's build-time execution of that later block — observed concretely: a
+`test="true"` S3 block placed above the quickstart `ZipStore` example made the ZipStore
+block fail at build (`FileNotFoundError`, the zip was never written) and `mkdocs build
+--strict` aborted. Fix: **a `test="true"`-only block must come last on its page** (or be
+the only python block on the page, as on `gpu.md`). The S3 example is therefore placed at
+the end of `quick-start.md`. The guard test docstring records this.
+
 ## Marker-bound execution
 
-A block declares the pytest marker it needs via a **fence attribute**, e.g.:
+A block declares the pytest marker it needs via a **fence attribute**. Marker-bound
+blocks are `test="true"` (validated) but **not** `exec="true"` (not build-run), e.g.:
 
 ````
-```python exec="true" markers="gpu"
-```python exec="true" markers="s3"
+```python test="true" markers="gpu" source="above"
+```python test="true" markers="s3" source="above"
 ````
 
 `group_examples_by_session()` parses `markers=` and emits
@@ -114,11 +155,12 @@ A block declares the pytest marker it needs via a **fence attribute**, e.g.:
 whatever that marker means** — and the two markers mean different things, which is the
 point of unifying the model rather than special-casing each:
 
-- **`gpu` — env-gate.** Default `doctest` env runs `pytest` → the gpu-marked param is
-  **skipped/deselected**, exactly like every other `gpu`-marked test in `tests/`. The
-  `gputest` env runs `pytest -m gpu` → the param **executes** against real cupy. Reuses
-  the existing `gpu` marker (`pyproject.toml` `markers` table) and `pytest -m gpu`
-  selection — no new harness concept.
+- **`gpu` — env-gate.** A registered marker does **not** auto-skip under plain `pytest`
+  (markers only *filter* when you pass `-m`). The repo's convention is
+  `pytest.importorskip("cupy")` in the test body (cf. `tests/conftest.py`), so the harness
+  calls `importorskip("cupy")` for gpu-marked docs cases: in the default `doctest` env the
+  case is **skipped** (no cupy), and `pytest -m gpu` in the `gputest` env runs it on real
+  cupy. The block is `test="true"` (not `exec="true"`), so it is never run at build.
 
 - **`s3` — infra-binding.** A new `s3` marker (must be registered in the `markers`
   table). An autouse-style fixture keyed on the marker stands up the `moto` server and
@@ -135,16 +177,15 @@ hardware env, s3 → fixture), not in the declaration mechanism.
 ## Components & data flow
 
 **`docs/` markdown** — source of truth. Each python block is in one of three declared
-states:
+states (see the two-flags table above):
 
-1. `exec="true"` (optionally `+ markers="<m>"`) — executed as a test.
-2. explicit opt-out marker **with a reason** — deliberately not executed.
+1. `exec="true"` and/or `test="true"` (optionally `+ markers="<m>"`) — validated, by
+   build-render and/or by the test harness.
+2. `exec="false"` with a `reason="..."` — explicit, documented opt-out.
 3. anything else (bare, `exec="on"`, …) — **illegal**, fails the guard.
 
-The exact spelling of the opt-out marker (e.g. `exec="false"` plus a `reason="..."`
-attribute, versus a dedicated sentinel attribute) is an implementation-plan decision.
-Requirement: it must be explicit, greppable, carry a human-readable reason, and be a
-form `markdown-exec` will not execute at build time.
+The opt-out form is `exec="false" reason="..."`: explicit, greppable, carries a
+human-readable reason, and is not executed by markdown-exec at build time.
 
 **`tests/test_docs.py`** — already-parametrized pytest harness. Changes:
 
diff --git a/docs/user-guide/gpu.md b/docs/user-guide/gpu.md
index 1dc3ef296b..6c26c3e564 100644
--- a/docs/user-guide/gpu.md
+++ b/docs/user-guide/gpu.md
@@ -16,7 +16,7 @@ Zarr can use GPUs to accelerate your workload by running `zarr.Config.enable_gpu
 [`zarr.config`][] configures Zarr to use GPU memory for the data
 buffers used internally by Zarr via `enable_gpu()`.
 
-```python exec="true" session="gpu-demo" markers="gpu" source="above"
+```python test="true" session="gpu-demo" markers="gpu" source="above"
 import zarr
 import cupy as cp
 zarr.config.enable_gpu()
diff --git a/tests/test_docs.py b/tests/test_docs.py
index 21d9b3f702..28bf2b9f6c 100644
--- a/tests/test_docs.py
+++ b/tests/test_docs.py
@@ -1,8 +1,12 @@
 """
 Tests for executable code blocks in markdown documentation.
 
-This module uses pytest-examples to validate that all Python code examples
-with exec="true" in the documentation execute successfully.
+This module uses pytest-examples to validate Python code examples in the docs. A block is
+validated if it renders output at build (exec="true") or is explicitly marked for testing
+(test="true"); see the two-flags discussion in
+docs/superpowers/specs/2026-05-29-docs-block-validation-design.md. The test_no_unvalidated_blocks
+guard ensures every python block declares one of those, or an explicit exec="false" opt-out
+with a reason, so a block can never silently skip validation.
 """
 
 from __future__ import annotations
@@ -20,27 +24,10 @@
 if TYPE_CHECKING:
     from collections.abc import Generator
 
-# Find all markdown files with executable code blocks
 DOCS_ROOT = Path(__file__).parent.parent / "docs"
 SOURCES_ROOT = Path(__file__).parent.parent / "src" / "zarr"
 
 
-def find_markdown_files_with_exec() -> list[Path]:
-    """Find all markdown files containing exec="true" code blocks."""
-    markdown_files = []
-
-    for md_file in DOCS_ROOT.rglob("*.md"):
-        try:
-            content = md_file.read_text(encoding="utf-8")
-            if 'exec="true"' in content:
-                markdown_files.append(md_file)
-        except Exception:
-            # Skip files that can't be read
-            continue
-
-    return sorted(markdown_files)
-
-
 def name_example(path: str, session: str) -> str:
     """Generate a readable name for a test case from file path and session."""
     file = Path(path)
@@ -59,15 +46,25 @@ def _markers_for(settings: dict[str, str]) -> list[pytest.MarkDecorator]:
     return [getattr(pytest.mark, name) for name in raw.split() if name]
 
 
+def _is_tested(settings: dict[str, str]) -> bool:
+    """A block is validated by our pytest harness if it is run at build to render output
+    (exec="true") OR explicitly marked for testing (test="true"). The two flags are
+    separate on purpose: exec= drives markdown-exec's build-time rendering, while test=
+    lets a block be validated without being run at build (e.g. gpu/s3 blocks, which the
+    build environment cannot run)."""
+    return settings.get("exec") == "true" or settings.get("test") == "true"
+
+
 def _session_params(root: Path) -> list[Any]:
-    """Group exec="true" examples by (file, session) and emit one pytest.param per
-    session, carrying the union of markers declared by that session's blocks."""
+    """Group tested examples (exec="true" or test="true") by (file, session) and emit one
+    pytest.param per session, carrying the union of markers declared by that session's
+    blocks."""
     sessions: defaultdict[tuple[str, str], list[CodeExample]] = defaultdict(list)
     marks_by_session: defaultdict[tuple[str, str], set[str]] = defaultdict(set)
 
     for example in find_examples(str(root)):
         settings = example.prefix_settings()
-        if settings.get("exec") != "true":
+        if not _is_tested(settings):
             continue
         session_name = settings.get("session", "_default")
         key = (str(example.path), session_name)
@@ -120,11 +117,13 @@ def docs_s3_backend() -> Generator[None, None, None]:
 
 
 def test_markers_attribute_is_parsed(tmp_path: Path) -> None:
-    """A block tagged markers="s3" must surface that marker on its parametrized case,
-    so pytest can gate/bind it (e.g. attach the moto fixture)."""
+    """A test="true" block tagged markers="s3" must surface that marker on its
+    parametrized case, so pytest can gate/bind it (e.g. attach the moto fixture).
+    Uses test="true" (not exec="true") because marker-bound blocks are validated by the
+    harness without being run at build time."""
     md = tmp_path / "ex.md"
     md.write_text(
-        '```python exec="true" session="demo" markers="s3"\nimport zarr\n```\n',
+        '```python test="true" session="demo" markers="s3"\nimport zarr\n```\n',
         encoding="utf-8",
     )
     params = _session_params(md.parent)
@@ -134,11 +133,16 @@ def test_markers_attribute_is_parsed(tmp_path: Path) -> None:
 
 
 def test_no_unvalidated_blocks() -> None:
-    """Every python code block in docs/ must declare its validation state:
-    either exec="true" (it is executed as a test) or exec="false" with a reason
-    (an explicit, documented opt-out). A bare or mistyped fence (e.g. exec="on")
-    fails here, so a block can never silently opt out of validation -- the gap
-    that hid the invalid create_array(mode="w") example in #4016."""
+    """Every python code block in docs/ must declare its validation state: exec="true"
+    (run at build to render output), test="true" (validated by this harness without being
+    run at build), or exec="false" with a reason (explicit, documented opt-out). A bare or
+    mistyped fence (e.g. exec="on") fails here, so a block can never silently opt out of
+    validation -- the gap that hid the invalid create_array(mode="w") example in #4016.
+
+    Note on placement: a test="true"-only block (which markdown-exec does not execute)
+    must not sit *before* an exec="true" block of the same page's session, or it disrupts
+    markdown-exec's build-time execution of the later block. Keep test-only blocks last on
+    the page (or on a page where they are the only python block, like gpu.md)."""
     offenders: list[str] = []
     for example in find_examples(str(DOCS_ROOT)):
         rel = Path(example.path).relative_to(DOCS_ROOT)
@@ -150,15 +154,21 @@ def test_no_unvalidated_blocks() -> None:
         settings = example.prefix_settings()
         exec_val = settings.get("exec")
         loc = f"{rel}:{example.start_line}"
-        if exec_val == "true":
+        # Validated either by build-render (exec="true") or by the test harness
+        # (test="true").
+        if _is_tested(settings):
             continue
+        # Explicit, documented opt-out from execution.
         if exec_val == "false" and settings.get("reason", "").strip():
             continue
-        offenders.append(f"{loc} (exec={exec_val!r}, reason={settings.get('reason')!r})")
+        offenders.append(
+            f"{loc} (exec={exec_val!r}, test={settings.get('test')!r}, "
+            f"reason={settings.get('reason')!r})"
+        )
 
     assert not offenders, (
-        'Docs python blocks must be exec="true" or exec="false" with a reason:\n'
-        + "\n".join(offenders)
+        'Docs python blocks must be exec="true", test="true", or exec="false" with a '
+        "reason:\n" + "\n".join(offenders)
     )
 
 
@@ -170,14 +180,15 @@ def test_documentation_examples(
     request: pytest.FixtureRequest,
 ) -> None:
     """
-    Test that all exec="true" code examples in documentation execute successfully.
+    Test that all validated code examples (exec="true" or test="true") in documentation
+    execute successfully.
 
     This test groups examples by session (file + session name) and runs them
     sequentially in the same execution context, allowing code to build on
     previous examples.
 
     This test uses pytest-examples to:
-    - Find all code examples with exec="true" in markdown files
+    - Find all code examples marked exec="true" or test="true" in markdown files
     - Group them by session
     - Execute them in order within the same context
     - Verify no exceptions are raised
@@ -195,7 +206,7 @@ def test_documentation_examples(
     examples = []
     for example in all_examples:
         settings = example.prefix_settings()
-        if settings.get("exec") != "true":
+        if not _is_tested(settings):
             continue
         if str(example.path) == file_path and settings.get("session", "_default") == session_name:
             examples.append(example)

From 9c74830f1107730a034aa5cc8e84f303d1c59209 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 12:13:59 +0200
Subject: [PATCH 17/25] docs: add news fragment for docs-block validation
 (#4016, #4017)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 changes/4016.bugfix.md | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 changes/4016.bugfix.md

diff --git a/changes/4016.bugfix.md b/changes/4016.bugfix.md
new file mode 100644
index 0000000000..01984110f7
--- /dev/null
+++ b/changes/4016.bugfix.md
@@ -0,0 +1 @@
+Fixed an invalid `zarr.create_array` example in the quick-start documentation (it passed an unsupported `mode` argument) and made the cloud-storage example execute against a mock S3 backend in CI. Added a test ensuring every Python code block in the documentation is either executed or explicitly opted out with a documented reason, so an invalid example can no longer go untested.

From 2ad7474130957df8a68d69847caedd3b400b6ae2 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 12:25:05 +0200
Subject: [PATCH 18/25] test: harden docs_s3_backend teardown and make cli
 example idempotent

Address roborev branch-review findings (job 186):
- Medium: moto-api reset POST ran before server.stop()/env restore in the
  finally block; if it raised, the fixed-port server thread leaked and
  AWS_ENDPOINT_URL was left dangling. Nest the reset in its own try/finally so
  server.stop() and env restoration always run.
- Low: f"{S3_ENDPOINT}/moto-api/reset" double-slashed (constant ends in /);
  drop the extra slash.
- Low: AWS_SECRET_ACCESS_KEY/AWS_ACCESS_KEY_ID were setdefault'd but never
  restored; save/restore all three mutated env vars uniformly.
- Low: cli.md create_array lacked overwrite=True, non-idempotent across local
  runs; add it.
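
The teardown ordering from the Medium finding can be sketched in isolation.
This is a minimal illustration, not the fixture itself: run_with_cleanup,
reset_backend, stop_server and DEMO_KEYS are hypothetical stand-ins for the
moto-api reset POST, server.stop() and the mutated AWS variables.

```python
import os
from collections.abc import Callable

# Hypothetical stand-ins for the env vars the real fixture mutates.
DEMO_KEYS = ("DEMO_ENDPOINT_URL", "DEMO_ACCESS_KEY_ID")


def run_with_cleanup(
    body: Callable[[], None],
    reset_backend: Callable[[], None],
    stop_server: Callable[[], None],
) -> None:
    """Run body with temporary env vars; always restore them and stop the server."""
    # Snapshot every variable we are about to mutate (None means "was unset").
    prev_env = {key: os.environ.get(key) for key in DEMO_KEYS}
    try:
        for key in DEMO_KEYS:
            os.environ[key] = "demo"
        body()
    finally:
        try:
            reset_backend()  # may raise, e.g. if the server is unreachable
        finally:
            # Runs even when reset_backend() raised.
            for key, value in prev_env.items():
                if value is None:
                    os.environ.pop(key, None)
                else:
                    os.environ[key] = value
            stop_server()
```

If reset_backend() raises, the exception still propagates to the caller, but
only after the environment is restored and stop_server() has run.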

Verified: full docs suite green (57 passed, 2 skipped), s3+cli pass on repeated
runs, mkdocs build --strict exits 0, prek clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/user-guide/cli.md |  2 +-
 tests/test_docs.py     | 22 +++++++++++++++-------
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/docs/user-guide/cli.md b/docs/user-guide/cli.md
index 743392f679..13fcb6f1b6 100644
--- a/docs/user-guide/cli.md
+++ b/docs/user-guide/cli.md
@@ -49,7 +49,7 @@ To open the array/group using the new metadata use:
 import zarr
 
 # create a small array to open (stands in for the migrated store)
-zarr.create_array("data/cli-demo.zarr", shape=(4, 4), chunks=(2, 2), dtype="i4")
+zarr.create_array("data/cli-demo.zarr", shape=(4, 4), chunks=(2, 2), dtype="i4", overwrite=True)
 
 zarr_with_v3_metadata = zarr.open("data/cli-demo.zarr", zarr_format=3)
 ```
diff --git a/tests/test_docs.py b/tests/test_docs.py
index 28bf2b9f6c..11ba8836ec 100644
--- a/tests/test_docs.py
+++ b/tests/test_docs.py
@@ -93,7 +93,9 @@ def docs_s3_backend() -> Generator[None, None, None]:
     botocore = pytest.importorskip("botocore")
     requests = pytest.importorskip("requests")
 
-    prev_endpoint = os.environ.get("AWS_ENDPOINT_URL")
+    # Save every env var we mutate so teardown can restore the prior process state.
+    env_keys = ("AWS_ENDPOINT_URL", "AWS_SECRET_ACCESS_KEY", "AWS_ACCESS_KEY_ID")
+    prev_env = {key: os.environ.get(key) for key in env_keys}
     server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=S3_PORT)
     server.start()
     try:
@@ -108,12 +110,18 @@ def docs_s3_backend() -> Generator[None, None, None]:
         s3fs.S3FileSystem.clear_instance_cache()
         yield
     finally:
-        requests.post(f"{S3_ENDPOINT}/moto-api/reset")
-        if prev_endpoint is None:
-            os.environ.pop("AWS_ENDPOINT_URL", None)
-        else:
-            os.environ["AWS_ENDPOINT_URL"] = prev_endpoint
-        server.stop()
+        # Cleanup must always run, even if the moto-api reset POST fails: stopping the
+        # server frees the fixed port and restoring the env avoids leaking state (and a
+        # stale AWS_ENDPOINT_URL) into the rest of the session.
+        try:
+            requests.post(f"{S3_ENDPOINT}moto-api/reset")
+        finally:
+            for key, value in prev_env.items():
+                if value is None:
+                    os.environ.pop(key, None)
+                else:
+                    os.environ[key] = value
+            server.stop()
 
 
 def test_markers_attribute_is_parsed(tmp_path: Path) -> None:

From 31fa4d7b7c1aaa204b516b7aa6d2098e88ace158 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 12:36:06 +0200
Subject: [PATCH 19/25] test: actually run the gpu docs example on GPU; align
 collector/guard scope

Address roborev branch-review findings (job 188):
- Medium: the gpu docs example ran in NO environment. The gputest env lacks
  pytest-examples, so test_docs.py's module-level importorskip("pytest_examples")
  skipped the whole module under `pytest -m gpu` -- the gpu case was never
  collected even on GPU hardware, yet the guard reported it "validated" via
  test="true". Add pytest-examples to the gputest env; confirmed gpu-demo now
  collects under `hatch -e gputest run pytest -m gpu --co`.
- Low: _session_params (collection) didn't exclude docs/superpowers/ while the
  guard did -- an asymmetry that could run a design-doc cache block as a real
  test without the guard flagging it. Extract a shared _is_published_docs()
  helper used by both, so collection and guard agree on scope.

Verified: doctest suite green (57 passed, 2 skipped), gpu-demo collectable in
gputest, mkdocs build --strict exits 0, prek clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 pyproject.toml     |  4 ++++
 tests/test_docs.py | 22 +++++++++++++++++-----
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/pyproject.toml b/pyproject.toml
index bc95bfd61b..92312d2630 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -204,6 +204,10 @@ list-env = "pip list"
 template = "test"
 extra-dependencies = [
     "universal_pathlib",
+    # Needed so tests/test_docs.py is collectable under `pytest -m gpu`; otherwise its
+    # module-level importorskip("pytest_examples") skips the whole module and the gpu
+    # docs example is never executed on GPU hardware.
+    "pytest-examples",
 ]
 features = ["gpu"]
 
diff --git a/tests/test_docs.py b/tests/test_docs.py
index 11ba8836ec..07d7db94e2 100644
--- a/tests/test_docs.py
+++ b/tests/test_docs.py
@@ -55,6 +55,19 @@ def _is_tested(settings: dict[str, str]) -> bool:
     return settings.get("exec") == "true" or settings.get("test") == "true"
 
 
+def _is_published_docs(path: str) -> bool:
+    """Whether a code example belongs to the published documentation. docs/superpowers/
+    holds design-doc caches (plans/specs) that are not in the mkdocs nav; both the test
+    collector and the guard exclude it so they agree on what counts as real docs."""
+    try:
+        rel = Path(path).relative_to(DOCS_ROOT)
+    except ValueError:
+        # Path is outside DOCS_ROOT (e.g. a tmp_path fixture in unit tests); treat it as
+        # in-scope so such tests exercise the normal path.
+        return True
+    return not (rel.parts and rel.parts[0] == "superpowers")
+
+
 def _session_params(root: Path) -> list[Any]:
     """Group tested examples (exec="true" or test="true") by (file, session) and emit one
     pytest.param per session, carrying the union of markers declared by that session's
@@ -63,6 +76,8 @@ def _session_params(root: Path) -> list[Any]:
     marks_by_session: defaultdict[tuple[str, str], set[str]] = defaultdict(set)
 
     for example in find_examples(str(root)):
+        if not _is_published_docs(str(example.path)):
+            continue
         settings = example.prefix_settings()
         if not _is_tested(settings):
             continue
@@ -153,12 +168,9 @@ def test_no_unvalidated_blocks() -> None:
     the page (or on a page where they are the only python block, like gpu.md)."""
     offenders: list[str] = []
     for example in find_examples(str(DOCS_ROOT)):
-        rel = Path(example.path).relative_to(DOCS_ROOT)
-        # docs/superpowers/ holds design-doc caches (plans/specs), not published
-        # documentation -- it is not in the mkdocs nav -- so its illustrative
-        # fences are not subject to the execution guard.
-        if rel.parts and rel.parts[0] == "superpowers":
+        if not _is_published_docs(str(example.path)):
             continue
+        rel = Path(example.path).relative_to(DOCS_ROOT)
         settings = example.prefix_settings()
         exec_val = settings.get("exec")
         loc = f"{rel}:{example.start_line}"

From 9f837f883a0e2386f15d29097b9b50408a853b9e Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 13:42:58 +0200
Subject: [PATCH 20/25] test: enforce test-only block placement; drop redundant
 marker round-trip

Address roborev branch-review findings (job 190):
- Low: the "test=true-only block must not precede an exec=true block on the same
  page" constraint was documented but unenforced (only caught by mkdocs --strict).
  Add test_test_only_blocks_come_last, which fails fast/locally when a test-only
  block has a smaller start_line than a later same-file exec=true block. Negative-
  checked: it catches an exec block added after the gpu test-only block.
- Low: _markers_for built pytest.MarkDecorator objects that _session_params
  immediately reduced to .name and rebuilt; replace with _marker_names returning
  raw strings, building decorators once at param time.
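
The placement rule from the first finding can be sketched as a pure function.
This is a simplified illustration under assumptions: the Block tuple shape
and the name misplaced_test_only_blocks are hypothetical, and the real test
reads its blocks from pytest-examples rather than from a list.

```python
from collections import defaultdict

# Each block is (file, start_line, settings); a hypothetical, simplified shape.
Block = tuple[str, int, dict[str, str]]


def misplaced_test_only_blocks(blocks: list[Block]) -> list[str]:
    """Return "file:line" for test-only blocks that precede an exec block."""
    test_only: defaultdict[str, list[int]] = defaultdict(list)
    exec_lines: defaultdict[str, list[int]] = defaultdict(list)
    for path, line, settings in blocks:
        if settings.get("exec") == "true":
            exec_lines[path].append(line)
        elif settings.get("test") == "true":
            test_only[path].append(line)

    offenders: list[str] = []
    for path, lines in test_only.items():
        # A file with no exec blocks gets 0, so nothing in it can offend.
        last_exec = max(exec_lines.get(path, [0]))
        offenders.extend(f"{path}:{line}" for line in lines if line < last_exec)
    return offenders
```

A test-only block is reported only when a later exec="true" block exists in
the same file; a file containing only test-only blocks never fails the check.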

Verified: doctest suite green (58 passed, 2 skipped), placement test passes and
fails on a planted violation, prek clean, mkdocs build --strict exits 0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 tests/test_docs.py | 53 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 43 insertions(+), 10 deletions(-)

diff --git a/tests/test_docs.py b/tests/test_docs.py
index 07d7db94e2..31886386da 100644
--- a/tests/test_docs.py
+++ b/tests/test_docs.py
@@ -40,10 +40,9 @@ def name_example(path: str, session: str) -> str:
     return f"{file}:{session}"
 
 
-def _markers_for(settings: dict[str, str]) -> list[pytest.MarkDecorator]:
-    """Translate a block's markers="a b" attribute into pytest mark decorators."""
-    raw = settings.get("markers", "")
-    return [getattr(pytest.mark, name) for name in raw.split() if name]
+def _marker_names(settings: dict[str, str]) -> list[str]:
+    """Parse a block's markers="a b" attribute into a list of marker names."""
+    return [name for name in settings.get("markers", "").split() if name]
 
 
 def _is_tested(settings: dict[str, str]) -> bool:
@@ -84,8 +83,7 @@ def _session_params(root: Path) -> list[Any]:
         session_name = settings.get("session", "_default")
         key = (str(example.path), session_name)
         sessions[key].append(example)
-        for mark in _markers_for(settings):
-            marks_by_session[key].add(mark.name)
+        marks_by_session[key].update(_marker_names(settings))
 
     params = []
     for key in sorted(sessions.keys(), key=lambda x: (x[0], x[1])):
@@ -162,10 +160,7 @@ def test_no_unvalidated_blocks() -> None:
     mistyped fence (e.g. exec="on") fails here, so a block can never silently opt out of
     validation -- the gap that hid the invalid create_array(mode="w") example in #4016.
 
-    Note on placement: a test="true"-only block (which markdown-exec does not execute)
-    must not sit *before* an exec="true" block of the same page's session, or it disrupts
-    markdown-exec's build-time execution of the later block. Keep test-only blocks last on
-    the page (or on a page where they are the only python block, like gpu.md)."""
+    A separate placement constraint is enforced by test_test_only_blocks_come_last."""
     offenders: list[str] = []
     for example in find_examples(str(DOCS_ROOT)):
         if not _is_published_docs(str(example.path)):
@@ -192,6 +187,44 @@ def test_no_unvalidated_blocks() -> None:
     )
 
 
+def test_test_only_blocks_come_last() -> None:
+    """A test="true"-only block (one markdown-exec does not execute, because it lacks
+    exec="true") must not precede an exec="true" block in the same file. markdown-exec's
+    SuperFences validator rejects the unexecuted python fence, which disrupts its
+    build-time execution of any later exec="true" block on the page (observed: the
+    quickstart ZipStore example failed with FileNotFoundError, aborting `mkdocs build
+    --strict`). Enforcing the ordering here turns that build-only failure into a fast,
+    local unit failure."""
+    # Collect, per published-docs file, the start lines of test-only and exec blocks.
+    test_only: defaultdict[str, list[int]] = defaultdict(list)
+    exec_lines: defaultdict[str, list[int]] = defaultdict(list)
+    for example in find_examples(str(DOCS_ROOT)):
+        if not _is_published_docs(str(example.path)):
+            continue
+        settings = example.prefix_settings()
+        path = str(example.path)
+        if settings.get("exec") == "true":
+            exec_lines[path].append(example.start_line)
+        elif settings.get("test") == "true":
+            test_only[path].append(example.start_line)
+
+    offenders: list[str] = []
+    for path, only_lines in test_only.items():
+        rel = Path(path).relative_to(DOCS_ROOT)
+        last_exec = max(exec_lines.get(path, [0]))
+        offenders.extend(
+            f'{rel}:{line} (test="true" block precedes an exec="true" block at line {last_exec})'
+            for line in only_lines
+            if line < last_exec
+        )
+
+    assert not offenders, (
+        'A test="true"-only block must come after every exec="true" block in the same '
+        "file (markdown-exec executes the later block at build time and a preceding "
+        "unexecuted python fence breaks it):\n" + "\n".join(offenders)
+    )
+
+
 # Get all example sessions
 @pytest.mark.parametrize("session_key", _session_params(DOCS_ROOT))
 def test_documentation_examples(

From f936fba0d6c8674b4f5562ff172ea5a09ae9463e Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 15:12:13 +0200
Subject: [PATCH 21/25] ci: make docs build strict; correct placement-hazard
 scope

Address roborev branch-review finding (job 192): the placement guard's docstring
claimed any non-exec="true" python fence before an exec="true" block breaks the
build, but the guard only checked test="true" blocks, leaving exec="false"-before-
exec="true" arrangements (data_types.md, performance.md) unguarded.

Investigation (experiments + markdown-exec's SuperFences integration) established the
real mechanism: a non-executed python fence (test="true" OR exec="false") before an
exec="true" block disrupts build-time execution of a later *state-dependent* block
(needs a cross-block dependency to surface, which is why the standalone exec="true"
blocks under those exec="false" opt-outs build fine). CI ran non-strict `mkdocs build`,
so such a failure would have merged as a silent warning.

- ci/docs.yml: `mkdocs build` -> `mkdocs build --strict` so any build-time exec
  failure (including the exec="false" case) fails CI authoritatively. Verified: a
  clean build currently emits zero warnings, so --strict passes today.
- Narrow the placement guard's docstring and the spec to the real (state-dependent)
  mechanism, and frame test_test_only_blocks_come_last as a conservative fast-feedback
  convention with --strict as the authoritative check -- no longer over-claiming.

Verified: docs suite green (58 passed, 2 skipped), `mkdocs build --strict` exits 0,
prek clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .github/workflows/docs.yml                    |  5 ++-
 ...2026-05-29-docs-block-validation-design.md | 28 +++++++++++------
 tests/test_docs.py                            | 31 +++++++++++++------
 3 files changed, 45 insertions(+), 19 deletions(-)

diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
index 6515a2c4c2..fb5487ada4 100644
--- a/.github/workflows/docs.yml
+++ b/.github/workflows/docs.yml
@@ -24,7 +24,10 @@ jobs:
           persist-credentials: false
       - uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0
       - run: uv sync --group docs
-      - run: uv run mkdocs build
+      # --strict turns warnings into errors, so a docs code block that fails to execute
+      # at build time (e.g. a non-exec python fence disrupting a later exec="true" block)
+      # fails CI instead of merging as a silent warning.
+      - run: uv run mkdocs build --strict
         env:
           DISABLE_MKDOCS_2_WARNING: "true"
           NO_MKDOCS_2_WARNING: "true"
diff --git a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
index 643c454741..cbbe689994 100644
--- a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
+++ b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
@@ -130,15 +130,25 @@ test-only blocks add `test="true"` without `exec`.
 | GPU / S3 examples | — | `true` | Tested (under markers); rendered static at build. |
 | Non-runnable (transcript, include, wrong-import) | `false`+`reason` | — | Neither; explicit reasoned opt-out. |
 
-**Placement constraint (markdown-exec quirk).** markdown-exec's SuperFences validator
-*rejects* a `python` fence that lacks `exec="true"` (returns `False`, so it is not run at
-build). A rejected fence positioned **before** an `exec="true"` block of the same page
-disrupts markdown-exec's build-time execution of that later block — observed concretely: a
-`test="true"` S3 block placed above the quickstart `ZipStore` example made the ZipStore
-block fail at build (`FileNotFoundError`, the zip was never written) and `mkdocs build
---strict` aborted. Fix: **a `test="true"`-only block must come last on its page** (or be
-the only python block on the page, as on `gpu.md`). The S3 example is therefore placed at
-the end of `quick-start.md`. The guard test docstring records this.
+**Placement constraint (markdown-exec quirk).** markdown-exec registers a SuperFences
+custom fence for `python`; its validator *rejects* any fence lacking `exec="true"`
+(`exec="false"` and `test="true"` alike — both are "not executed at build"). Established by
+experiment: a rejected python fence positioned **before** an `exec="true"` block disrupts
+markdown-exec's build-time execution of a **later, state-dependent** block (regardless of
+session). Observed concretely: any non-exec python fence inserted before the quickstart
+`ZipStore` write/read pair made the read block fail (`FileNotFoundError` — the write never
+took effect) and `mkdocs build --strict` aborted. The effect only surfaces with a
+cross-block dependency, so it does **not** affect the standalone `exec="true"` blocks in
+`data_types.md`/`performance.md` that already carry `exec="false"` opt-out blocks above
+them.
+
+Because we cannot statically tell which later blocks are state-dependent, the response is
+twofold: (1) **a `test="true"`-only block must come last on its page** (or be the only
+python block, as on `gpu.md`) — a conservative convention enforced by
+`test_test_only_blocks_come_last` for the blocks we author this way; and (2) the
+**authoritative** build-hazard check is `mkdocs build --strict` (the `docs:check` CI job),
+which catches the `exec="false"` case too. The S3 example is placed at the end of
+`quick-start.md` accordingly.
 
 ## Marker-bound execution
 
diff --git a/tests/test_docs.py b/tests/test_docs.py
index 31886386da..9561d0ad05 100644
--- a/tests/test_docs.py
+++ b/tests/test_docs.py
@@ -188,13 +188,25 @@ def test_no_unvalidated_blocks() -> None:
 
 
 def test_test_only_blocks_come_last() -> None:
-    """A test="true"-only block (one markdown-exec does not execute, because it lacks
-    exec="true") must not precede an exec="true" block in the same file. markdown-exec's
-    SuperFences validator rejects the unexecuted python fence, which disrupts its
-    build-time execution of any later exec="true" block on the page (observed: the
-    quickstart ZipStore example failed with FileNotFoundError, aborting `mkdocs build
-    --strict`). Enforcing the ordering here turns that build-only failure into a fast,
-    local unit failure."""
+    """A conservative placement convention: a test="true"-only block must come after every
+    exec="true" block in the same file.
+
+    Mechanism (established by experiment + markdown-exec's SuperFences integration): a
+    python fence that markdown-exec does not execute -- i.e. one lacking exec="true",
+    whether test="true" or exec="false" -- placed before an exec="true" block disrupts
+    markdown-exec's build-time execution of a *later, state-dependent* block. Observed: a
+    non-exec python fence inserted before the quickstart ZipStore write/read pair made the
+    read block fail with FileNotFoundError (the write never took effect), aborting
+    `mkdocs build --strict`. The effect needs a cross-block dependency to surface, so it
+    does not affect the standalone exec="true" blocks in e.g. data_types.md/performance.md
+    that already have exec="false" opt-out blocks above them.
+
+    Because we cannot statically tell which later blocks are state-dependent, this guard
+    enforces the simple, safe convention only for the blocks we author this way
+    (test="true" marker-bound examples like s3/gpu). It is NOT a complete build-hazard
+    check -- the authoritative check is `mkdocs build --strict` (the docs:check CI job),
+    which catches the exec="false" case too. This guard just turns the common test-only
+    case into a fast, local failure."""
     # Collect, per published-docs file, the start lines of test-only and exec blocks.
     test_only: defaultdict[str, list[int]] = defaultdict(list)
     exec_lines: defaultdict[str, list[int]] = defaultdict(list)
@@ -220,8 +232,9 @@ def test_test_only_blocks_come_last() -> None:
 
     assert not offenders, (
         'A test="true"-only block must come after every exec="true" block in the same '
-        "file (markdown-exec executes the later block at build time and a preceding "
-        "unexecuted python fence breaks it):\n" + "\n".join(offenders)
+        'file: a non-executed python fence before an exec="true" block can disrupt '
+        "markdown-exec's build-time execution of a later state-dependent block (see this "
+        "test's docstring):\n" + "\n".join(offenders)
     )
 
 

From ee82f5e34b1a72ebf3bf4d6e6a4833a7ff124fed Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 15:27:17 +0200
Subject: [PATCH 22/25] docs: remove design spec/plan caches from version
 control

The spec and implementation plan under docs/superpowers/ were working artifacts,
not published documentation (they were never in the mkdocs nav). The spec is
preserved in a public gist; the plan is a local execution record. Remove both from
the repo and drop the now-stale spec path from the test_docs.py module docstring.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../plans/2026-05-29-docs-block-validation.md | 749 ------------------
 ...2026-05-29-docs-block-validation-design.md | 260 ------
 tests/test_docs.py                            |   5 +-
 3 files changed, 3 insertions(+), 1011 deletions(-)
 delete mode 100644 docs/superpowers/plans/2026-05-29-docs-block-validation.md
 delete mode 100644 docs/superpowers/specs/2026-05-29-docs-block-validation-design.md

diff --git a/docs/superpowers/plans/2026-05-29-docs-block-validation.md b/docs/superpowers/plans/2026-05-29-docs-block-validation.md
deleted file mode 100644
index 63ab0b696d..0000000000
--- a/docs/superpowers/plans/2026-05-29-docs-block-validation.md
+++ /dev/null
@@ -1,749 +0,0 @@
-# Docs Block Validation Implementation Plan
-
-> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
-
-**Goal:** Make every python code block in `docs/` either execute (and thus get validated) or explicitly opt out with a documented reason, and add a guard test so a block can never again silently opt out of validation.
-
-**Architecture:** The doctests in `tests/test_docs.py` are already parametrized pytest tests. We (1) teach the parametrizer to read a `markers="..."` fence attribute and attach the matching pytest marker to each session's `pytest.param`, (2) add an `s3` marker bound to a `moto` mock-S3 fixture so the S3 example runs in the default doctest env, (3) reuse the existing `gpu` marker for the GPU block, (4) remediate the 12 currently-unexecuted blocks per-case, and (5) add a guard test asserting every docs python block is `exec="true"` or explicitly opted out with a reason.
-
-**Tech Stack:** pytest, pytest-examples, markdown-exec (mkdocs), moto[s3,server], s3fs, hatch envs (`doctest`, `gputest`).
-
-**Upstream:** Fixes [#4016](https://github.com/zarr-developers/zarr-python/issues/4016); implements the guard from [#4017](https://github.com/zarr-developers/zarr-python/issues/4017). Design spec: `docs/superpowers/specs/2026-05-29-docs-block-validation-design.md`.
-
----
-
-## File Structure
-
-- `tests/test_docs.py` — **modify.** Add `markers=` parsing in `group_examples_by_session()`, an `s3` fixture + marker-binding, and the new `test_no_unvalidated_blocks` guard test.
-- `pyproject.toml` — **modify.** Register the `s3` marker in `[tool.pytest.ini_options] markers`.
-- `docs/quick-start.md` — **modify.** S3 block: fix `mode="w"`, add `markers="s3"`, make it executable.
-- `docs/user-guide/performance.md` — **modify.** Turn on the two config-only blocks; opt out (or fix) the dask block.
-- `docs/user-guide/arrays.md` — **modify.** Turn on the config block.
-- `docs/user-guide/cli.md` — **modify.** Make the `zarr.open` block runnable or opt it out.
-- `docs/user-guide/gpu.md` — **modify.** Add `exec="true" markers="gpu"`.
-- `docs/contributing.md` — **modify.** Fix `exec="on"` typo; opt out the pseudocode block.
-- `docs/user-guide/data_types.md` — **modify.** Opt out the REPL-transcript block.
-- `docs/user-guide/examples/custom_dtype.md` — **modify.** Opt out the `--8<--` include block.
-- `docs/user-guide/v3_migration.md` — **modify.** Opt out the intentionally-wrong-import block.
-- `changes/4016.bugfix.md` — **create.** Towncrier news fragment.
-
-### Opt-out convention (decided here, used throughout)
-
-A block that must not execute is tagged:
-
-````
-```python exec="false" reason="<human-readable reason>"
-````
-
-- `exec="false"` is an explicit, greppable opt-out that `markdown-exec` will **not** execute (only `exec="true"` triggers execution).
-- `reason="..."` documents *why*. The guard test requires it on any non-`exec="true"` block.
-
----
-
-## Task 1: Spike — can the `s3` fixture provide a default endpoint with no `storage_options`?
-
-This is the load-bearing unknown. The existing S3 tests always pass `endpoint_url` explicitly via `client_kwargs`/`storage_options` (`tests/test_store/test_fsspec.py:109-116, 131`). The docs block must read clean — `zarr.create_array("s3://...")` with **no** `storage_options`. We must confirm a process-wide default endpoint works before writing the real fixture.
-
-**Files:**
-- Test (scratch): `tests/test_docs_s3_spike.py` (deleted at end of task)
-
-- [ ] **Step 1: Write a scratch test that starts moto, sets a default endpoint via env, and creates an array with a bare `s3://` URL**
-
-```python
-# tests/test_docs_s3_spike.py
-import os
-
-import pytest
-
-moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server")
-pytest.importorskip("s3fs")
-botocore = pytest.importorskip("botocore")
-requests = pytest.importorskip("requests")
-
-PORT = 5556  # different from test_fsspec.py's 5555 to avoid collisions
-ENDPOINT = f"http://127.0.0.1:{PORT}/"
-
-
-def test_bare_s3_url_with_default_endpoint() -> None:
-    """A create_array('s3://...') call with no storage_options should reach a
-    moto server when the endpoint is configured process-wide (env var)."""
-    server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=PORT)
-    server.start()
-    try:
-        os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo")
-        os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo")
-        # Candidate mechanism A: aiobotocore/botocore honors AWS_ENDPOINT_URL
-        os.environ["AWS_ENDPOINT_URL"] = ENDPOINT
-
-        # create the bucket via boto3 sync client
-        session = botocore.session.Session()
-        client = session.create_client("s3", endpoint_url=ENDPOINT, region_name="us-east-1")
-        client.create_bucket(Bucket="docs-bucket")
-        client.close()
-
-        import s3fs
-
-        import zarr
-
-        s3fs.S3FileSystem.clear_instance_cache()
-        z = zarr.create_array(
-            "s3://docs-bucket/foo", shape=(8, 8), chunks=(4, 4), dtype="f4"
-        )
-        z[:, :] = 1.0
-        assert z[0, 0] == 1.0
-    finally:
-        requests.post(f"{ENDPOINT}/moto-api/reset")
-        server.stop()
-```
-
-- [ ] **Step 2: Run the spike**
-
-Run: `hatch run doctest:test tests/test_docs_s3_spike.py -v` (or `uv run pytest tests/test_docs_s3_spike.py -v` inside the doctest env)
-Expected: **One of two outcomes** — record which:
-- **PASS** → `AWS_ENDPOINT_URL` works as a process-wide default. Use env-var mechanism in Task 3.
-- **FAIL** (connection refused / NoCredentials / hits real AWS) → env var insufficient. Try candidate B below.
-
-- [ ] **Step 3: If Step 2 failed, try fsspec default config**
-
-Replace the `AWS_ENDPOINT_URL` line with:
-
-```python
-        import fsspec
-
-        fsspec.config.conf["s3"] = {"client_kwargs": {"endpoint_url": ENDPOINT}, "anon": False}
-```
-
-Run: `hatch run doctest:test tests/test_docs_s3_spike.py -v`
-Expected: PASS → use `fsspec.config.conf` mechanism in Task 3.
-
-- [ ] **Step 4: If both failed, record the fallback decision**
-
-If neither bare-URL mechanism works, the visible block will show `storage_options={"endpoint_url": ...}` honestly (spec fallback for spike #1). Note which mechanism (env var, fsspec config, or fallback) won, in the commit message — Task 3 depends on it.
-
-- [ ] **Step 5: Delete the scratch test and commit the finding**
-
-```bash
-git rm tests/test_docs_s3_spike.py
-git commit -m "test: spike s3 default-endpoint mechanism for docs (no storage_options)
-
-Result: <env-var | fsspec-config | fallback-to-storage_options>"
-```
-
-### RESULT (completed 2026-05-29, commit 460385d)
-
-- **Mechanism A won:** `os.environ["AWS_ENDPOINT_URL"] = ENDPOINT` (+ dummy
-  `AWS_SECRET_ACCESS_KEY`/`AWS_ACCESS_KEY_ID`) makes a bare
-  `create_array("s3://...")` reach moto with no `storage_options`. `clear_instance_cache()`
-  alone sufficed; the `set_session()`/`skip_instance_cache` dance from `test_fsspec.py`
-  was not needed for a single fixture. No event-loop or teardown warnings observed.
-- **PLAN CORRECTION (important):** the `doctest` hatch env does **NOT** install moto.
-  `moto[s3,server]` is only in the `remote-tests` dependency group; the `doctest` env
-  (`pyproject.toml` ~line 277-284) has only `s3fs` + `pytest-examples` as extras. **Task 3
-  MUST add `moto[s3,server]` and `requests` to `[tool.hatch.envs.doctest] extra-dependencies`**,
-  or any moto-backed doctest will silently `importorskip`-skip — defeating the purpose.
-  Task 4's verification must assert the S3 case **runs**, not skips.
-
----
-
-## Task 2: Register the `s3` pytest marker
-
-**Files:**
-- Modify: `pyproject.toml` (the `[tool.pytest.ini_options]` `markers` list, currently at lines 446-450)
-
-- [ ] **Step 1: Add the `s3` marker**
-
-In `pyproject.toml`, change the `markers` list from:
-
-```toml
-markers = [
-    "asyncio: mark test as asyncio test",
-    "gpu: mark a test as requiring CuPy and GPU",
-    "slow_hypothesis: slow hypothesis tests",
-]
-```
-
-to:
-
-```toml
-markers = [
-    "asyncio: mark test as asyncio test",
-    "gpu: mark a test as requiring CuPy and GPU",
-    "s3: mark a test as requiring a (mock) S3 backend via moto",
-    "slow_hypothesis: slow hypothesis tests",
-]
-```
-
-- [ ] **Step 2: Verify pytest accepts the marker (no unknown-marker warning)**
-
-Run: `hatch run doctest:test --markers | grep s3`
-Expected: shows `@pytest.mark.s3: mark a test as requiring a (mock) S3 backend via moto`
-
-- [ ] **Step 3: Commit**
-
-```bash
-git add pyproject.toml
-git commit -m "test: register s3 pytest marker"
-```
-
----
-
-## Task 3: Teach `test_docs.py` to parse `markers=` and bind the `s3` fixture
-
-This task adds (a) `markers=` parsing so a session carries the right pytest marker, and (b) the moto-backed `s3` fixture using the mechanism chosen in Task 1.
-
-**Files:**
-- Modify: `tests/test_docs.py`
-
-- [ ] **Step 1: Write a failing test that a markered session carries its marker**
-
-Add to `tests/test_docs.py`:
-
-```python
-def test_markers_attribute_is_parsed(tmp_path: Path) -> None:
-    """A block tagged markers="s3" must surface that marker on its parametrized case,
-    so pytest can gate/bind it (e.g. attach the moto fixture)."""
-    md = tmp_path / "ex.md"
-    md.write_text(
-        '```python exec="true" session="demo" markers="s3"\n'
-        "import zarr\n"
-        "```\n",
-        encoding="utf-8",
-    )
-    params = _session_params(md.parent)
-    assert len(params) == 1
-    marks = params[0].marks
-    assert any(m.name == "s3" for m in marks)
-```
-
-(This references a new helper `_session_params(root)` that returns a list of `pytest.param(...)`; we extract the grouping logic into it in Step 3.)
-
-- [ ] **Step 2: Run it to confirm it fails**
-
-Run: `hatch run doctest:test tests/test_docs.py::test_markers_attribute_is_parsed -v`
-Expected: FAIL with `AttributeError: module ... has no attribute '_session_params'` (or `NameError`).
-
-- [ ] **Step 3: Refactor grouping into `_session_params` that emits markers**
-
-Replace `group_examples_by_session()` (currently `tests/test_docs.py:39-64`) and the parametrize decorator (`tests/test_docs.py:72-75`) with a version that returns `pytest.param` objects carrying marks. Add near the top of the file:
-
-```python
-def _markers_for(settings: dict[str, str]) -> list[pytest.MarkDecorator]:
-    """Translate a block's markers="a b" attribute into pytest mark decorators."""
-    raw = settings.get("markers", "")
-    return [getattr(pytest.mark, name) for name in raw.split() if name]
-
-
-def _session_params(root: Path) -> list[pytest.param]:
-    """Group exec="true" examples by (file, session) and emit one pytest.param per
-    session, carrying the union of markers declared by that session's blocks."""
-    sessions: defaultdict[tuple[str, str], list[CodeExample]] = defaultdict(list)
-    marks_by_session: defaultdict[tuple[str, str], set[str]] = defaultdict(set)
-
-    for example in find_examples(str(root)):
-        settings = example.prefix_settings()
-        if settings.get("exec") != "true":
-            continue
-        session_name = settings.get("session", "_default")
-        key = (str(example.path), session_name)
-        sessions[key].append(example)
-        for mark in _markers_for(settings):
-            marks_by_session[key].add(mark.name)
-
-    params = []
-    for key in sorted(sessions.keys(), key=lambda x: (x[0], x[1])):
-        marks = tuple(getattr(pytest.mark, name) for name in sorted(marks_by_session[key]))
-        params.append(pytest.param(key, marks=marks, id=name_example(key[0], key[1])))
-    return params
-```
-
-Keep `name_example()` as-is. Add `CodeExample` to the existing pytest-examples import if not already imported (it is: `from pytest_examples import CodeExample, EvalExample, find_examples`).
-
-- [ ] **Step 4: Update the parametrized test to use `_session_params` and request the fixtures**
-
-Replace the decorator + signature of `test_documentation_examples` (`tests/test_docs.py:72-79`) with:
-
-```python
-@pytest.mark.parametrize("session_key", _session_params(DOCS_ROOT))
-def test_documentation_examples(
-    session_key: tuple[str, str],
-    eval_example: EvalExample,
-    request: pytest.FixtureRequest,
-) -> None:
-```
-
-Inside the body, before running examples, activate the `s3` fixture when the case is s3-marked:
-
-```python
-    if request.node.get_closest_marker("s3") is not None:
-        request.getfixturevalue("docs_s3_backend")
-```
-
-(Leave the rest of the body — the `find_examples` loop and `eval_example.run(...)` — unchanged.)
-
-- [ ] **Step 5: Add the `docs_s3_backend` fixture**
-
-Add to `tests/test_docs.py` (using the mechanism Task 1 selected — shown here for the `AWS_ENDPOINT_URL` variant; swap to `fsspec.config` or the `storage_options` fallback per Task 1's result):
-
-```python
-S3_PORT = 5556
-S3_ENDPOINT = f"http://127.0.0.1:{S3_PORT}/"
-S3_BUCKET = "example-bucket"
-
-
-@pytest.fixture
-def docs_s3_backend() -> Generator[None, None, None]:
-    """Stand up a moto mock-S3 server and configure a process-wide default endpoint
-    so docs blocks can use a bare s3:// URL with no storage_options."""
-    moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server")
-    s3fs = pytest.importorskip("s3fs")
-    botocore = pytest.importorskip("botocore")
-    requests = pytest.importorskip("requests")
-
-    server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=S3_PORT)
-    server.start()
-    prev_endpoint = os.environ.get("AWS_ENDPOINT_URL")
-    os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo")
-    os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo")
-    os.environ["AWS_ENDPOINT_URL"] = S3_ENDPOINT
-
-    session = botocore.session.Session()
-    client = session.create_client("s3", endpoint_url=S3_ENDPOINT, region_name="us-east-1")
-    client.create_bucket(Bucket=S3_BUCKET)
-    client.close()
-    s3fs.S3FileSystem.clear_instance_cache()
-    try:
-        yield
-    finally:
-        requests.post(f"{S3_ENDPOINT}/moto-api/reset")
-        if prev_endpoint is None:
-            os.environ.pop("AWS_ENDPOINT_URL", None)
-        else:
-            os.environ["AWS_ENDPOINT_URL"] = prev_endpoint
-        server.stop()
-```
-
-Add the required imports at the top of `tests/test_docs.py`:
-
-```python
-import os
-from collections.abc import Generator
-```
-
-- [ ] **Step 6: Run the marker-parsing test — it should now pass**
-
-Run: `hatch run doctest:test tests/test_docs.py::test_markers_attribute_is_parsed -v`
-Expected: PASS
-
-- [ ] **Step 7: Run the full docs test to confirm no regression in existing sessions**
-
-Run: `hatch run doctest:test -v`
-Expected: PASS for all existing `quickstart` etc. sessions (the S3 block isn't markered yet — that's Task 4).
-
-- [ ] **Step 8: Commit**
-
-```bash
-git add tests/test_docs.py
-git commit -m "test: parse markers= on docs blocks and add moto s3 fixture binding"
-```
-
----
-
-## Task 4: Fix and enable the S3 example (#4016)
-
-**Files:**
-- Modify: `docs/quick-start.md:134-140`
-
-- [ ] **Step 1: Replace the bare, invalid S3 block**
-
-Replace lines 134-140 (the ```` ```python `` … ```` block containing `mode="w"`) with:
-
-````markdown
-```python exec="true" session="s3demo" markers="s3" source="above"
-import zarr
-import numpy as np
-
-z = zarr.create_array(
-    "s3://example-bucket/foo", shape=(100, 100), chunks=(10, 10), dtype="f4"
-)
-z[:, :] = np.random.random((100, 100))
-```
-````
-
-Notes:
-- `mode="w"` removed (the #4016 bug; `create_array` has no `mode` parameter — see `src/zarr/api/synchronous.py:799`).
-- Unused `import s3fs` removed.
-- `import numpy as np` added — this is a fresh `s3demo` session, so `np` is not in scope from the `quickstart` session.
-- New session `s3demo` keeps the moto fixture scoped to just this block (the `quickstart` session must NOT become s3-marked).
-- The displayed URL stays `s3://example-bucket/foo`; the moto endpoint is supplied by the `docs_s3_backend` fixture (bucket name `example-bucket` matches `S3_BUCKET` in Task 3).
-- **If Task 1 chose the `storage_options` fallback:** add `storage_options={"endpoint_url": "..."}` to the visible call instead, and adjust the prose to explain it.
-
-- [ ] **Step 2: Run the S3 docs example against moto**
-
-Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[quick-start.md:s3demo]" -v`
-Expected: PASS (executes against moto; no real-cloud contact).
-
-- [ ] **Step 3: Commit**
-
-```bash
-git add docs/quick-start.md
-git commit -m "docs: fix invalid s3 create_array example and run it against moto (#4016)"
-```
-
----
-
-## Task 5: Enable the config-only blocks
-
-These are plain `zarr.config.set(...)` calls that run as-is. Each gets its own self-contained session, which keeps the test ids distinct. Note that config is process-global and is not auto-restored between sessions (resetting it is out of scope), so separate sessions do not by themselves isolate config mutations; that is acceptable for these read-only-style demos.
-
-**Files:**
-- Modify: `docs/user-guide/performance.md:207`, `docs/user-guide/performance.md:237`
-- Modify: `docs/user-guide/arrays.md:622`
-
-- [ ] **Step 1: Enable `performance.md:207` (concurrency config)**
-
-Change the fence from ```` ```python ```` to:
-
-````markdown
-```python exec="true" session="perf-concurrency"
-````
-
-(Body unchanged — `import zarr` + `zarr.config.set({'async.concurrency': 128})` + the commented env-var line, which is inert.)
-
-- [ ] **Step 2: Enable `performance.md:237` (max_workers config)**
-
-Change the fence to:
-
-````markdown
-```python exec="true" session="perf-workers"
-````
-
-- [ ] **Step 3: Enable `arrays.md:622` (rectilinear_chunks config)**
-
-Change the fence to:
-
-````markdown
-```python exec="true" session="arrays-rectilinear"
-````
-
-- [ ] **Step 4: Run the three sessions**
-
-Run:
-```bash
-hatch run doctest:test \
-  "tests/test_docs.py::test_documentation_examples[performance.md:perf-concurrency]" \
-  "tests/test_docs.py::test_documentation_examples[performance.md:perf-workers]" \
-  "tests/test_docs.py::test_documentation_examples[arrays.md:arrays-rectilinear]" -v
-```
-Expected: PASS (3 passed).
-
-- [ ] **Step 5: Commit**
-
-```bash
-git add docs/user-guide/performance.md docs/user-guide/arrays.md
-git commit -m "docs: execute config-setting examples in performance.md and arrays.md"
-```
-
----
-
-## Task 6: Make the CLI `zarr.open` block runnable
-
-`docs/user-guide/cli.md:48` opens `'path/to/input.zarr'` which doesn't exist. Rewrite it to create then open a real local array so it executes and still illustrates `zarr_format=3`.
-
-**Files:**
-- Modify: `docs/user-guide/cli.md:46-51`
-
-- [ ] **Step 1: Replace the block**
-
-Replace the bare block with:
-
-````markdown
-```python exec="true" session="cli-open" source="above"
-import zarr
-
-# create a small array to open (stands in for the migrated store)
-zarr.create_array("data/cli-demo.zarr", shape=(4, 4), chunks=(2, 2), dtype="i4")
-
-zarr_with_v3_metadata = zarr.open("data/cli-demo.zarr", zarr_format=3)
-```
-````
-
-(Keep the surrounding prose; the example now demonstrates `open(..., zarr_format=3)` on a real store. The illustrative `'path/to/input.zarr'` filename was the only reason it couldn't run.)
-
-- [ ] **Step 2: Run it**
-
-Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[cli.md:cli-open]" -v`
-Expected: PASS
-
-- [ ] **Step 3: Commit**
-
-```bash
-git add docs/user-guide/cli.md
-git commit -m "docs: make cli zarr.open example runnable against a local store"
-```
-
----
-
-## Task 7: Enable the GPU block (env-gated via `gpu` marker)
-
-**Files:**
-- Modify: `docs/user-guide/gpu.md:19-28`
-
-- [ ] **Step 1: Tag the GPU block**
-
-Change the fence from ```` ```python ```` to:
-
-````markdown
-```python exec="true" session="gpu-demo" markers="gpu" source="above"
-````
-
-(Body unchanged: `import cupy as cp`, `zarr.config.enable_gpu()`, `create_array("memory://gpu-demo", ...)`, etc.)
-
-> **PLAN CORRECTION (found during execution, commit 010d99a):** a registered marker
-> does NOT auto-skip a test under plain `pytest` — markers only *filter* when you pass
-> `-m`. Without a guard, the gpu block runs and FAILS with `ModuleNotFoundError: cupy`
-> (cupy is darwin-excluded). The repo's real convention is `pytest.importorskip("cupy")`
-> in the test body (cf. `tests/conftest.py:183`). So Task 7 also adds to
-> `test_documentation_examples` (mirroring the `s3` binding):
-> ```python
->     if request.node.get_closest_marker("gpu") is not None:
->         pytest.importorskip("cupy")
-> ```
-> This converts the missing-cupy hard error into a proper SKIP in the default env, while
-> `-m gpu` in the `gputest` env still collects+runs it on real hardware.
-
-- [ ] **Step 2: Confirm it is SKIPPED in the default doctest env (no GPU)**
-
-Run: `hatch -e doctest run pytest "tests/test_docs.py::test_documentation_examples[user-guide/gpu.md:gpu-demo]" -v`
-Expected: SKIPPED (via `importorskip("cupy")`), **not** an error, **not** absent.
-
-- [ ] **Step 3: Confirm it is COLLECTED for the gpu selection**
-
-Run: `hatch run doctest:test -m gpu --co -q | grep gpu-demo`
-Expected: the `gpu.md:gpu-demo` case is collected (it will actually execute only on real GPU hardware in the `gputest` env, which we can't run here).
-
-- [ ] **Step 4: Commit**
-
-```bash
-git add docs/user-guide/gpu.md
-git commit -m "docs: execute gpu example under the gpu marker"
-```
-
----
-
-## Task 8: Fix the `exec="on"` typo and opt out the genuinely-non-executable blocks
-
-**Files:**
-- Modify: `docs/contributing.md:15` (pseudocode) and `docs/contributing.md:231` (`exec="on"` typo)
-- Modify: `docs/user-guide/data_types.md:363` (REPL transcript)
-- Modify: `docs/user-guide/examples/custom_dtype.md:5` (`--8<--` include)
-- Modify: `docs/user-guide/v3_migration.md:42` (intentionally-wrong import)
-
-- [ ] **Step 1: Fix the `exec="on"` typo in `contributing.md:231`**
-
-Change the fence attribute `exec="on"` to `exec="true"`. Then run that block to confirm it actually executes cleanly:
-
-Run: `hatch run doctest:test -v -k contributing`
-Expected: the formerly-`exec="on"` block now runs. **If it fails** (the code was broken too, having never run), fix the code in the block minimally so it passes, or — if it's not meant to run — convert it to `exec="false" reason="..."`. Record which in the commit.
-
-- [ ] **Step 2: Opt out `contributing.md:15` (pseudocode)**
-
-Change ```` ```python ```` to:
-
-````markdown
-```python exec="false" reason="illustrative pseudocode with a '# etc.' placeholder, not runnable"
-````
-
-- [ ] **Step 3: Opt out `data_types.md:363` (REPL transcript)**
-
-Change ```` ```python ```` to:
-
-````markdown
-```python exec="false" reason="REPL output transcript, not executable source"
-````
-
-- [ ] **Step 4: Opt out `custom_dtype.md:5` (`--8<--` include)**
-
-Change ```` ```python ```` to:
-
-````markdown
-```python exec="false" reason="pymdownx snippet include directive, not python source"
-````
-
-- [ ] **Step 5: Opt out `v3_migration.md:42` (intentionally-wrong import)**
-
-Change ```` ```python ```` to:
-
-````markdown
-```python exec="false" reason="intentionally shows the old/incorrect import for contrast"
-````
-
-- [ ] **Step 6: Commit**
-
-```bash
-git add docs/contributing.md docs/user-guide/data_types.md docs/user-guide/examples/custom_dtype.md docs/user-guide/v3_migration.md
-git commit -m "docs: fix exec=on typo and explicitly opt out non-runnable blocks"
-```
-
----
-
-## Task 9: Handle the dask block in performance.md
-
-`docs/user-guide/performance.md:263` uses `dask.array` and opens `'data/large_array.zarr'` (nonexistent). Two viable dispositions — pick based on whether `dask` is in the doctest env.
-
-**Files:**
-- Modify: `docs/user-guide/performance.md:263-280`
-
-- [ ] **Step 1: Check whether dask is available in the doctest env**
-
-Run: `hatch run doctest:list-env | grep -i dask`
-Expected: either shows a `dask` line (available) or nothing (not available).
-
-- [ ] **Step 2a: If dask IS available — make it runnable**
-
-Replace the `'data/large_array.zarr'` open with a created array, keeping the dask demonstration:
-
-````markdown
-```python exec="true" session="perf-dask" source="above"
-import zarr
-import dask.array as da
-
-zarr.config.set({
-    'async.concurrency': 4,
-    'threading.max_workers': 4,
-})
-
-# create a small array to read with Dask
-zarr.create_array("data/perf-dask-demo.zarr", shape=(16, 16), chunks=(8, 8), dtype="f4")
-z = zarr.open_array("data/perf-dask-demo.zarr", mode="r")
-
-arr = da.from_array(z, chunks=z.chunks)
-result = arr.mean(axis=0).compute()
-```
-````
-
-Run: `hatch run doctest:test "tests/test_docs.py::test_documentation_examples[performance.md:perf-dask]" -v`
-Expected: PASS
-
-- [ ] **Step 2b: If dask is NOT available — opt out with a reason**
-
-Change ```` ```python ```` to:
-
-````markdown
-```python exec="false" reason="requires dask, which is not in the docs test environment"
-````
-
-- [ ] **Step 3: Commit**
-
-```bash
-git add docs/user-guide/performance.md
-git commit -m "docs: make dask performance example runnable (or opt out if dask absent)"
-```
-
----
-
-## Task 10: Add the guard test
-
-The guard asserts every python block in `docs/` is either `exec="true"` or `exec="false"` with a non-empty `reason`. Anything else (bare, `exec="on"`, missing reason) fails.
-
-**Files:**
-- Modify: `tests/test_docs.py`
-
-- [ ] **Step 1: Write the guard test**
-
-Add to `tests/test_docs.py`:
-
-```python
-def test_no_unvalidated_blocks() -> None:
-    """Every python code block in docs/ must declare its validation state:
-    either exec="true" (it is executed as a test) or exec="false" with a reason
-    (an explicit, documented opt-out). A bare or mistyped fence (e.g. exec="on")
-    fails here, so a block can never silently opt out of validation — the gap
-    that hid the invalid create_array(mode="w") example in #4016."""
-    offenders: list[str] = []
-    for example in find_examples(str(DOCS_ROOT)):
-        settings = example.prefix_settings()
-        exec_val = settings.get("exec")
-        loc = f"{Path(example.path).relative_to(DOCS_ROOT)}:{example.start_line}"
-        if exec_val == "true":
-            continue
-        if exec_val == "false" and settings.get("reason", "").strip():
-            continue
-        offenders.append(f"{loc} (exec={exec_val!r}, reason={settings.get('reason')!r})")
-
-    assert not offenders, (
-        "Docs python blocks must be exec=\"true\" or exec=\"false\" with a reason:\n"
-        + "\n".join(offenders)
-    )
-```
-
-(`find_examples` from pytest-examples only yields fenced code blocks for languages it recognizes as runnable, which includes python; confirm in Step 2 that the count matches the audit. If it also yields non-python fences, filter on `example.prefix` / language. No `.md` path filter is needed, since DOCS_ROOT is all markdown.)
-
-- [ ] **Step 2: Run the guard — it must PASS now that all blocks are remediated**
-
-Run: `hatch run doctest:test tests/test_docs.py::test_no_unvalidated_blocks -v`
-Expected: PASS (zero offenders). **If it lists offenders**, they are blocks missed by Tasks 4-9 — fix each (turn on or opt out) until the list is empty.
-
-- [ ] **Step 3: Negative check — confirm the guard actually catches a bare block**
-
-Temporarily add a bare block to any docs file:
-
-````markdown
-```python
-1 / 0
-```
-````
-
-Run: `hatch run doctest:test tests/test_docs.py::test_no_unvalidated_blocks -v`
-Expected: FAIL, listing the new bare block's location.
-
-Then remove the temporary block and re-run:
-Run: `hatch run doctest:test tests/test_docs.py::test_no_unvalidated_blocks -v`
-Expected: PASS
-
-- [ ] **Step 4: Commit**
-
-```bash
-git add tests/test_docs.py
-git commit -m "test: guard that every docs python block is executed or opted out (#4017)"
-```
-
----
-
-## Task 11: Full suite + news fragment
-
-**Files:**
-- Create: `changes/4016.bugfix.md`
-
-- [ ] **Step 1: Run the entire docs test suite**
-
-Run: `hatch run doctest:test -v`
-Expected: PASS — all `exec="true"` sessions run (S3 against moto; config/cli/dask as applicable), the GPU session reports SKIPPED, and the guard passes.
-
-- [ ] **Step 2: Add the towncrier news fragment**
-
-Create `changes/4016.bugfix.md`:
-
-```markdown
-Fixed an invalid ``zarr.create_array`` example in the quick-start docs (it passed an unsupported ``mode`` argument) and made the cloud-storage example execute against a mock S3 backend in CI. Added a test ensuring every python code block in the docs is either executed or explicitly opted out with a documented reason.
-```
-
-- [ ] **Step 3: Run the full prek/lint pass**
-
-Run: `prek run --all-files`
-Expected: PASS (ruff, mypy, towncrier-check, etc. all green).
-
-- [ ] **Step 4: Commit**
-
-```bash
-git add changes/4016.bugfix.md
-git commit -m "docs: add news fragment for docs-block validation (#4016, #4017)"
-```
-
----
-
-## Self-review notes (resolved during planning)
-
-- **Spec coverage:** Part A (remediate 12 blocks) → Tasks 4-9; Part B (guard) → Task 10. Marker-bound execution (s3 + gpu) → Tasks 2, 3, 4, 7. Spike #1 → Task 1. `pyproject.toml` s3 marker → Task 2. All three spec spikes are addressed: #1 in Task 1; #2 (markdown-exec tolerance of `markers=`) is implicitly verified by `hatch run docs:build` (**add a build check**, as described in the spike #2 note below); #3 (moto teardown) is handled by the fixture's `finally` block in Task 3 Step 5.
-- **Spike #2 verification:** `markers=` and `reason=`/`exec="false"` are unknown attributes to markdown-exec; it ignores unrecognized prefix settings and only acts on `exec="true"`. Confirm by running `hatch run docs:build` once after Task 11 and checking it succeeds and that the gpu/s3 blocks render as static source. If the build errors on unknown attributes, fall back to the per-session marker map (spec fallback for spike #2).
-- **The 12 blocks, accounted for:** quick-start S3 (T4), perf×2 config (T5), arrays config (T5), cli (T6), gpu (T7), contributing exec=on typo + pseudocode (T8), data_types transcript (T8), custom_dtype include (T8), v3_migration wrong-import (T8), perf dask (T9). = 12. ✓
-- **Naming consistency:** `_session_params`, `_markers_for`, `docs_s3_backend`, `test_no_unvalidated_blocks`, `S3_BUCKET="example-bucket"` (matches the URL in the T4 block) used consistently across tasks.
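The guard rule from Task 10 (every python fence is either `exec="true"` or `exec="false"` with a non-empty `reason`) can be sketched without pytest-examples. This is a minimal illustration only: `fence_settings` and `find_offenders` are hypothetical names, the fence parsing uses `re` and `shlex` from the standard library as a stand-in for `find_examples()` / `prefix_settings()`, and the closing-fence check is simplified relative to real markdown parsing.

```python
import re
import shlex

# Opening fence: a run of 3+ backticks, an optional language, then attributes.
FENCE_RE = re.compile(r"^(`{3,})\s*(\w*)\s*(.*)$")


def fence_settings(info: str) -> dict[str, str]:
    """Parse fence attributes such as exec="true" session="demo" into a dict.

    Hypothetical stand-in for pytest-examples' prefix_settings().
    """
    settings: dict[str, str] = {}
    for token in shlex.split(info):
        key, sep, value = token.partition("=")
        if sep:
            settings[key] = value
    return settings


def find_offenders(markdown: str) -> list[str]:
    """Return locations of python fences that do not declare a validation state."""
    offenders: list[str] = []
    open_fence: str | None = None  # backtick run of the fence we are inside
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        if open_fence is not None:
            # Simplification: a fence closes on an identical backtick run.
            if stripped == open_fence:
                open_fence = None
            continue
        match = FENCE_RE.match(stripped)
        if match is None:
            continue
        open_fence = match.group(1)
        if match.group(2) != "python":
            continue  # non-python fences are outside the guard's scope
        settings = fence_settings(match.group(3))
        exec_val = settings.get("exec")
        if exec_val == "true":
            continue  # executed as a test
        if exec_val == "false" and settings.get("reason", "").strip():
            continue  # explicit, documented opt-out
        offenders.append(f"line {lineno} (exec={exec_val!r})")
    return offenders


if __name__ == "__main__":
    fence = "`" * 3
    doc = "\n".join([fence + 'python exec="on"', "x = 1", fence])
    print(find_offenders(doc))
```

A bare fence and a mistyped `exec="on"` fence are both reported, while an `exec="false"` fence is accepted only when it carries a reason, which mirrors the three declared states in the spec.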
diff --git a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md b/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
deleted file mode 100644
index cbbe689994..0000000000
--- a/docs/superpowers/specs/2026-05-29-docs-block-validation-design.md
+++ /dev/null
@@ -1,260 +0,0 @@
-# Design: Close the "silently-unexecuted docs block" gap
-
-**Date:** 2026-05-29
-**Issue:** [zarr-developers/zarr-python#4016](https://github.com/zarr-developers/zarr-python/issues/4016)
-
-## Problem & root cause
-
-Issue #4016 reports invalid code in the docs:
-
-```python
-z = zarr.create_array("s3://example-bucket/foo", mode="w", shape=(100, 100), chunks=(10, 10), dtype="f4")
-```
-
-`create_array` has no `mode` parameter, so this raises `TypeError: unexpected keyword
-argument 'mode'`. The code was wrong because **nothing validated it**: it is a bare
-` ```python ` block, and both the renderer (`markdown-exec`) and the test suite
-(`tests/test_docs.py`, which filters on `settings.get("exec") != "true"`) only act on
-blocks tagged `exec="true"`. Omitting that attribute is a *silent* opt-out from all
-validation.
-
-This is not a one-off. An audit of all docs found **12 of 180** python blocks
-unexecuted, including a second instance of the same failure mode:
-`docs/contributing.md:231` is tagged `exec="on"` (a typo for `"true"`), so a block
-that was meant to run is silently skipped.
-
-**Root cause:** validation is opt-in via an easily-mistyped, easily-omitted attribute,
-with no signal when a block opts out.
-
-### Audit of the 12 bare blocks
-
-| Block | Why bare | Disposition |
-|---|---|---|
-| `docs/quick-start.md:134` (S3) | hits real S3 | **Execute, `markers="s3"`** (moto infra, default doctest env) |
-| `docs/user-guide/gpu.md:19` | needs cupy + GPU | **Execute, `markers="gpu"`** (runs in `gputest` env) |
-| `docs/user-guide/performance.md:207` | left bare | **`exec="true"`** (plain `zarr.config.set`) |
-| `docs/user-guide/performance.md:237` | left bare | **`exec="true"`** |
-| `docs/user-guide/performance.md:263` | left bare | **`exec="true"`** (uses dask + a local array path) |
-| `docs/user-guide/arrays.md:622` | left bare | **`exec="true"`** (`zarr.config.set`) |
-| `docs/user-guide/cli.md:48` | left bare | **`exec="true"`** (`zarr.open`; needs a runnable path/store) |
-| `docs/contributing.md:231` | **`exec="on"` typo** | **Fix typo** → `exec="true"` |
-| `docs/contributing.md:15` | pseudocode (`# etc.`) | **Explicit opt-out** + reason |
-| `docs/user-guide/data_types.md:363` | REPL transcript (`<class ...>`) | **Explicit opt-out** + reason |
-| `docs/user-guide/examples/custom_dtype.md:5` | `--8<--` file include | **Explicit opt-out** + reason |
-| `docs/user-guide/v3_migration.md:42` | intentionally-wrong import | **Explicit opt-out** + reason |
-
-(`performance.md:263` and `cli.md:48` need a small adjustment — a memory store or a
-real local path — to be runnable; confirm during implementation.)
-
-## Approach
-
-Two complementary parts.
-
-### Part A — Per-case remediation of the 12 bare blocks
-
-Not one mechanism — a triage. Each block gets the treatment that fits *why* it is not
-executing:
-
-- **Make executable against fakes** — the S3 example, via `markers="s3"`. The marker
-  binds the block to the repo's existing `moto` mock-S3 infra (pattern from
-  `tests/test_store/test_fsspec.py`) so it runs for real in CI with no real-cloud
-  contact. Execution validates the whole write path, not just the signature; `mode="w"`
-  dies by construction. See "Marker-bound execution".
-- **Just turn on** — the config/open blocks (`performance.md` ×3, `arrays.md:622`,
-  `cli.md:48`) are plain runnable API calls; flip them to `exec="true"`.
-- **Fix the typo** — `contributing.md:231` `exec="on"` → `exec="true"`.
-- **Execute, env-gated** — the GPU block, via `markers="gpu"`. It *can* run, but only in
-  the `gputest` env (cupy + GPU hardware), not the default `doctest` env. See
-  "Marker-bound execution".
-- **Explicit opt-out** — blocks that genuinely cannot run anywhere and are not
-  executable Python: REPL transcript, `--8<--` include, intentionally-wrong import,
-  pseudocode. These get a *documented, greppable* opt-out marker carrying a reason.
-
-### Part B — A guard test
-
-So the gap cannot silently reopen: every python block in `docs/` must either be
-`exec="true"` *or* carry the explicit opt-out marker with a reason. A bare or
-mistyped block fails the guard. This would have caught both `mode="w"` and the
-`exec="on"` typo.
-
-### Dropped from scope
-
-The type-checking / markdown-extractor machinery considered earlier. Execution-against-
-fakes strictly dominates type-checking for the cloud case (and the untyped `s3fs`/`cupy`
-imports make strict type-checking least clean exactly where it was wanted most), and the
-guard handles everything else. Proportionate to ~7 genuinely-affected blocks.
-
-## Key insight: doctests are already pytest tests
-
-`tests/test_docs.py::test_documentation_examples` is an ordinary `@pytest.mark.parametrize`d
-pytest test — one case per `(file, session)`. It is not a separate doctest mechanism.
-Therefore everything pytest already provides for gating tests (markers, `-m` selection,
-skips) is available; the design uses it rather than inventing harness concepts.
-
-There are two distinct executors of docs blocks, and conflating them is what made
-marker-bound execution look hard:
-
-- **`markdown-exec` at docs-build time** — runs blocks to render output into the
-  published site. Build runners have no cupy (and the S3 setup is test infra), so a
-  marker-bound block must render as static source here (no build-time execution).
-- **`tests/test_docs.py` at test time** — the validation. This is pytest, and this is
-  where markers live, where infra fixtures bind, and where env-gating happens.
-
-## Two flags: `exec` (render output) vs `test` (validate)
-
-A code block can be *run* for two unrelated reasons, and conflating them breaks the
-build. They are separate fence attributes:
-
-- **`exec="true"`** — markdown-exec executes the block **at docs-build time to render its
-  output** into the published page. This is markdown-exec's own attribute (it hard-codes
-  the name `exec`, see `markdown_exec/_internal/main.py`), so we cannot rename it. Read it
-  as *"execute to render output."*
-- **`test="true"`** — **our** `tests/test_docs.py` harness executes the block **as a
-  validation test**. markdown-exec does not recognize `test=` and ignores it.
-
-Why two: a block that needs special infra to run (GPU/cupy, or S3) must be **validated in
-tests** but must **not run at build** — build runners have no GPU and no moto server, so
-an `exec="true"` GPU block makes `mkdocs build --strict` abort (`ModuleNotFoundError:
-cupy`). Separating the flags lets such a block be `test="true"` (tested) without
-`exec="true"` (so it renders as static source at build, never executed there).
-
-**Harness rule:** a block is collected as a test if `exec="true"` **OR** `test="true"`.
-So existing `exec="true"` example blocks stay tested as before (backward-compatible), and
-test-only blocks add `test="true"` without `exec`.
-
-**The combinations:**
-
-| Block | `exec` | `test` | Effect |
-|---|---|---|---|
-| Tutorial examples (quickstart, config, …) | `true` | — | Run at build (render output); also tested. |
-| GPU / S3 examples | — | `true` | Tested (under markers); rendered static at build. |
-| Non-runnable (transcript, include, wrong-import) | `false`+`reason` | — | Neither; explicit reasoned opt-out. |
-
-**Placement constraint (markdown-exec quirk).** markdown-exec registers a SuperFences
-custom fence for `python`; its validator *rejects* any fence lacking `exec="true"`
-(`exec="false"` and `test="true"` alike — both are "not executed at build"). Established by
-experiment: a rejected python fence positioned **before** an `exec="true"` block disrupts
-markdown-exec's build-time execution of a **later, state-dependent** block (regardless of
-session). Observed concretely: any non-exec python fence inserted before the quickstart
-`ZipStore` write/read pair made the read block fail (`FileNotFoundError` — the write never
-took effect) and `mkdocs build --strict` aborted. The effect only surfaces with a
-cross-block dependency, so it does **not** affect the standalone `exec="true"` blocks in
-`data_types.md`/`performance.md` that already carry `exec="false"` opt-out blocks above
-them.
-
-Because we cannot statically tell which later blocks are state-dependent, the response is
-twofold: (1) **a `test="true"`-only block must come last on its page** (or be the only
-python block, as on `gpu.md`) — a conservative convention enforced by
-`test_test_only_blocks_come_last` for the blocks we author this way; and (2) the
-**authoritative** build-hazard check is `mkdocs build --strict` (the `docs:check` CI job),
-which catches the `exec="false"` case too. The S3 example is placed at the end of
-`quick-start.md` accordingly.
-
-## Marker-bound execution
-
-A block declares the pytest marker it needs via a **fence attribute**. Marker-bound
-blocks are `test="true"` (validated) but **not** `exec="true"` (not build-run), e.g.:
-
-````
-```python test="true" markers="gpu" source="above"
-```python test="true" markers="s3" source="above"
-````
-
-`group_examples_by_session()` parses `markers=` and emits
-`pytest.param(session_key, marks=pytest.mark.<m>)`. The marker then **binds the case to
-whatever that marker means** — and the two markers mean different things, which is the
-point of unifying the model rather than special-casing each:
-
-- **`gpu` — env-gate.** A registered marker does **not** auto-skip under plain `pytest`
-  (markers only *filter* when you pass `-m`). The repo's convention is
-  `pytest.importorskip("cupy")` in the test body (cf. `tests/conftest.py`), so the harness
-  calls `importorskip("cupy")` for gpu-marked docs cases: in the default `doctest` env the
-  case is **skipped** (no cupy), and `pytest -m gpu` in the `gputest` env runs it on real
-  cupy. The block is `test="true"` (not `exec="true"`), so it is never run at build.
-
-- **`s3` — infra-binding.** A new `s3` marker (must be registered in the `markers`
-  table). An autouse-style fixture keyed on the marker stands up the `moto` server and
-  registers a default endpoint, so an `s3`-marked docs case runs against the fake S3
-  with no real-cloud contact. Because the infra is just pip deps already present in the
-  `doctest` env (`s3fs`, `moto[s3,server]`), the case **runs in the default doctest
-  run** — the marker binds infra, it does not gate the case out. The moto/endpoint
-  plumbing lives in named pytest fixtures, not a hidden markdown setup block.
-
-Both blocks therefore follow one rule: *declare the marker; the harness binds the marker
-to the infra/env it needs.* The asymmetry is in what each marker resolves to (gpu →
-hardware env, s3 → fixture), not in the declaration mechanism.
-
-## Components & data flow
-
-**`docs/` markdown** — source of truth. Each python block is in one of three declared
-states (see the two-flags table above):
-
-1. `exec="true"` and/or `test="true"` (optionally `+ markers="<m>"`) — validated, by
-   build-render and/or by the test harness.
-2. `exec="false"` with a `reason="..."` — explicit, documented opt-out.
-3. anything else (bare, `exec="on"`, …) — **illegal**, fails the guard.
-
-The opt-out form is `exec="false" reason="..."`: explicit, greppable, carries a
-human-readable reason, and is not executed by markdown-exec at build time.
-
-**`tests/test_docs.py`** — already-parametrized pytest harness. Changes:
-
-- `group_examples_by_session()` parses the `markers=` attribute and emits
-  `pytest.param(..., marks=pytest.mark.<m>)` so marker-binding rides existing marker
-  machinery.
-- A marker-keyed fixture for `s3` that stands up the `moto` server and registers a
-  default endpoint (pattern lifted from `tests/test_store/test_fsspec.py`), applied to
-  `s3`-marked docs cases.
-- New guard test `test_no_unvalidated_blocks` — walks every python block in `docs/`,
-  asserts each is `exec="true"` or carries the explicit opt-out marker. Fails on
-  bare/typo'd blocks.
-
-**`pyproject.toml`** — register the new `s3` marker in the `markers` table (alongside
-`gpu`).
-
-**`docs/quick-start.md` S3 block** — gains `markers="s3"`. The visible code stays a clean
-`create_array("s3://...")`; the moto server and default-endpoint registration are
-supplied by the `s3` fixture, not by an in-markdown setup block.
-
-## Risks & spikes (resolve during implementation; do not guess)
-
-1. **Default S3 endpoint without `storage_options`.** Existing tests always pass
-   `endpoint_url` explicitly (`test_fsspec.py:131`). Confirm the `s3` fixture can register
-   a *process-wide* default endpoint (via `fsspec.config` or `AWS_ENDPOINT_URL`) so the
-   visible `create_array("s3://...")` works clean with no `storage_options`. **Fallback:**
-   show the honest `storage_options={"endpoint_url": ...}` form in the visible block.
-
-2. **`markdown-exec` + unknown `markers=` attribute.** Confirm the build-time renderer
-   ignores `markers=` (or is told to), and that marker-bound blocks render as static
-   source in the published site (render source only, no build-time execution — the build
-   has neither cupy nor the moto fixture). **Fallback:** a per-session marker map in
-   `test_docs.py`, keeping markdown untouched.
-
-3. **moto teardown / loop affinity in the docs session.** `s3fs`/`aiobotocore` finalizers
-   are noisy at teardown and s3fs instances bind to the event loop they were created on
-   (see the filterwarnings note in `pyproject.toml` and the loop comments in
-   `test_fsspec.py`). Ensure the docs `s3` fixture starts/stops moto cleanly and does not
-   leak across sessions/tests.
-
-## Testing the change
-
-- Guard test is self-validating: after remediation, the full docs suite passes with zero
-  bare/typo'd blocks.
-- Negative check: temporarily introduce a bare block, confirm the guard fails, remove it.
-- S3 block: `hatch run doctest:test` runs it green against moto in the default doctest
-  env (the `s3` marker binds the fixture; it is not gated out).
-- GPU block: `pytest -m gpu` in `gputest` executes it; the default `doctest` run reports
-  it **skipped**, not absent.
-
-## Out of scope
-
-- Type-checking machinery / markdown extractor.
-- The 168 already-executing blocks.
-- Broad docs rewrites beyond the 12 bare blocks.
-
-## Upstream
-
-[zarr-developers/zarr-python#4017](https://github.com/zarr-developers/zarr-python/issues/4017)
-captures the root-cause framing (silent opt-out hides bugs; `mode="w"` and `exec="on"` as
-two instances) and the Part B guard proposal for community discussion, independent of the
-immediate fix in #4016.
diff --git a/tests/test_docs.py b/tests/test_docs.py
index 9561d0ad05..0638694b15 100644
--- a/tests/test_docs.py
+++ b/tests/test_docs.py
@@ -3,8 +3,9 @@
 
 This module uses pytest-examples to validate Python code examples in the docs. A block is
 validated if it renders output at build (exec="true") or is explicitly marked for testing
-(test="true"); see the two-flags discussion in
-docs/superpowers/specs/2026-05-29-docs-block-validation-design.md. The test_no_unvalidated_blocks
+(test="true"). The two flags are separate on purpose: exec= drives markdown-exec's
+build-time rendering, while test= lets a block be validated without being run at build
+(e.g. gpu/s3 examples the build environment cannot run). The test_no_unvalidated_blocks
 guard ensures every python block declares one of those, or an explicit exec="false" opt-out
 with a reason, so a block can never silently skip validation.
 """

From 84ad5a213640db0457479d5a260686a07a9c00aa Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 15:31:46 +0200
Subject: [PATCH 23/25] test: drop superpowers-docs exclusion now that those
 files are gone
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The _is_published_docs helper existed only to skip docs/superpowers/ design-doc
caches; those files are no longer in the repo, so the helper and its three call
sites were dead code referencing a nonexistent directory. Remove them — every
find_examples(DOCS_ROOT) result is now published docs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 tests/test_docs.py | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/tests/test_docs.py b/tests/test_docs.py
index 0638694b15..803d925b4a 100644
--- a/tests/test_docs.py
+++ b/tests/test_docs.py
@@ -55,19 +55,6 @@ def _is_tested(settings: dict[str, str]) -> bool:
     return settings.get("exec") == "true" or settings.get("test") == "true"
 
 
-def _is_published_docs(path: str) -> bool:
-    """Whether a code example belongs to the published documentation. docs/superpowers/
-    holds design-doc caches (plans/specs) that are not in the mkdocs nav; both the test
-    collector and the guard exclude it so they agree on what counts as real docs."""
-    try:
-        rel = Path(path).relative_to(DOCS_ROOT)
-    except ValueError:
-        # Path is outside DOCS_ROOT (e.g. a tmp_path fixture in unit tests); treat it as
-        # in-scope so such tests exercise the normal path.
-        return True
-    return not (rel.parts and rel.parts[0] == "superpowers")
-
-
 def _session_params(root: Path) -> list[Any]:
     """Group tested examples (exec="true" or test="true") by (file, session) and emit one
     pytest.param per session, carrying the union of markers declared by that session's
@@ -76,8 +63,6 @@ def _session_params(root: Path) -> list[Any]:
     marks_by_session: defaultdict[tuple[str, str], set[str]] = defaultdict(set)
 
     for example in find_examples(str(root)):
-        if not _is_published_docs(str(example.path)):
-            continue
         settings = example.prefix_settings()
         if not _is_tested(settings):
             continue
@@ -164,8 +149,6 @@ def test_no_unvalidated_blocks() -> None:
     A separate placement constraint is enforced by test_test_only_blocks_come_last."""
     offenders: list[str] = []
     for example in find_examples(str(DOCS_ROOT)):
-        if not _is_published_docs(str(example.path)):
-            continue
         rel = Path(example.path).relative_to(DOCS_ROOT)
         settings = example.prefix_settings()
         exec_val = settings.get("exec")
@@ -212,8 +195,6 @@ def test_test_only_blocks_come_last() -> None:
     test_only: defaultdict[str, list[int]] = defaultdict(list)
     exec_lines: defaultdict[str, list[int]] = defaultdict(list)
     for example in find_examples(str(DOCS_ROOT)):
-        if not _is_published_docs(str(example.path)):
-            continue
         settings = example.prefix_settings()
         path = str(example.path)
         if settings.get("exec") == "true":

From cfa792fc1a4b7a2698ad363d519e5792f44b3b5f Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 15:46:34 +0200
Subject: [PATCH 24/25] test: share one moto S3 backend across fsspec and docs
 tests

Both test_store/test_fsspec.py and test_docs.py stood up their own moto
ThreadedMotoServer (ports 5555 and 5556). Extract a single session-scoped
`moto_server` fixture + MOTO_ENDPOINT_URL constant into tests/conftest.py and have
both consumers reuse it:

- test_fsspec.py: s3_base now returns the shared moto_server; its per-test `s3`
  fixture (bucket "test", explicit endpoint, event-loop cleanup) is unchanged.
- test_docs.py: docs_s3_backend depends on moto_server and adds only its
  docs-specific layer (process-wide AWS_ENDPOINT_URL + "example-bucket"); it no
  longer owns the server lifecycle.

One server now serves the whole session; each consumer creates and moto-api-resets
its own bucket. Verified: test_fsspec.py (96 passed), the docs suite (58 passed),
both together in one session (154 passed), and the full standard suite in the
optional env (5956 passed); prek clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 tests/conftest.py               | 28 ++++++++++++++++++
 tests/test_docs.py              | 51 ++++++++++++++-------------------
 tests/test_store/test_fsspec.py | 30 +++++++------------
 3 files changed, 60 insertions(+), 49 deletions(-)
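
Reviewer note (below the `---`, so not part of the commit message): the
save-and-restore that `docs_s3_backend` applies to `AWS_ENDPOINT_URL` can be
sketched in isolation as follows. This is a minimal illustration, not the fixture
itself: `override_env` is a hypothetical helper name that does not exist in the
repo, and the sketch leaves out the moto bucket creation and reset.

```python
import os
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def override_env(key: str, value: str) -> Iterator[None]:
    """Set os.environ[key] for the duration of the block, then restore the prior state.

    Hypothetical helper for illustration only; the fixture inlines this logic.
    """
    previous = os.environ.get(key)  # None means the variable was unset beforehand
    os.environ[key] = value
    try:
        yield
    finally:
        if previous is None:
            # The variable did not exist before, so remove it rather than leave a stale value.
            os.environ.pop(key, None)
        else:
            os.environ[key] = previous
```

Distinguishing "unset" from "set to a value" is what keeps a stale endpoint from
leaking into later tests that expect no `AWS_ENDPOINT_URL` at all.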

diff --git a/tests/conftest.py b/tests/conftest.py
index 3515acace0..3402eb7063 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -531,3 +531,31 @@ def deep_nan_equal(a: object, b: object) -> bool:
     if isinstance(a, Sequence) and isinstance(b, Sequence):
         return all(deep_nan_equal(a[i], b[i]) for i in range(len(a)))
     return nan_equal(a, b)
+
+
+# Shared mock-S3 (moto) backend. A single server is reused across the whole test session by
+# every test that needs S3 -- both the fsspec store tests and the documentation examples --
+# instead of each module standing up its own. Consumers create their own buckets and choose
+# how the endpoint reaches the client (explicit storage_options vs. the AWS_ENDPOINT_URL
+# env var) on top of this fixture.
+MOTO_SERVER_PORT = 5555
+MOTO_ENDPOINT_URL = f"http://127.0.0.1:{MOTO_SERVER_PORT}/"
+
+
+@pytest.fixture(scope="session")
+def moto_server() -> Generator[str, None, None]:
+    """Start a session-scoped moto S3 server and yield its endpoint URL.
+
+    importorskip lives inside the fixture so moto is only required when a test actually
+    requests an S3 backend, not for the whole test session."""
+    moto_server_mod = pytest.importorskip("moto.moto_server.threaded_moto_server")
+
+    server = moto_server_mod.ThreadedMotoServer(ip_address="127.0.0.1", port=MOTO_SERVER_PORT)
+    server.start()
+    # moto needs *some* credentials present; use throwaway values if the environment has none.
+    os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo")
+    os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo")
+    try:
+        yield MOTO_ENDPOINT_URL
+    finally:
+        server.stop()
diff --git a/tests/test_docs.py b/tests/test_docs.py
index 803d925b4a..02dca225b0 100644
--- a/tests/test_docs.py
+++ b/tests/test_docs.py
@@ -78,49 +78,40 @@ def _session_params(root: Path) -> list[Any]:
     return params
 
 
-S3_PORT = 5556
-S3_ENDPOINT = f"http://127.0.0.1:{S3_PORT}/"
 S3_BUCKET = "example-bucket"
 
 
 @pytest.fixture
-def docs_s3_backend() -> Generator[None, None, None]:
-    """Stand up a moto mock-S3 server and set a process-wide default endpoint so docs
-    blocks can use a bare s3:// URL with no storage_options (see spike in plan Task 1)."""
-    moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server")
+def docs_s3_backend(moto_server: str) -> Generator[None, None, None]:
+    """Point docs S3 examples at the shared moto server (tests/conftest.py) via a
+    process-wide AWS_ENDPOINT_URL, so a block can use a bare s3:// URL with no
+    storage_options (see spike in the design notes). The server lifecycle belongs to the
+    session-scoped `moto_server` fixture; this fixture only adds the docs-specific
+    endpoint env var and a fresh bucket, and restores both on teardown."""
     s3fs = pytest.importorskip("s3fs")
     botocore = pytest.importorskip("botocore")
     requests = pytest.importorskip("requests")
 
-    # Save every env var we mutate so teardown can restore the prior process state.
-    env_keys = ("AWS_ENDPOINT_URL", "AWS_SECRET_ACCESS_KEY", "AWS_ACCESS_KEY_ID")
-    prev_env = {key: os.environ.get(key) for key in env_keys}
-    server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=S3_PORT)
-    server.start()
+    prev_endpoint = os.environ.get("AWS_ENDPOINT_URL")
+    os.environ["AWS_ENDPOINT_URL"] = moto_server
+
+    session = botocore.session.Session()
+    client = session.create_client("s3", endpoint_url=moto_server, region_name="us-east-1")
+    client.create_bucket(Bucket=S3_BUCKET)
+    client.close()
+    s3fs.S3FileSystem.clear_instance_cache()
     try:
-        os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "foo")
-        os.environ.setdefault("AWS_ACCESS_KEY_ID", "foo")
-        os.environ["AWS_ENDPOINT_URL"] = S3_ENDPOINT
-
-        session = botocore.session.Session()
-        client = session.create_client("s3", endpoint_url=S3_ENDPOINT, region_name="us-east-1")
-        client.create_bucket(Bucket=S3_BUCKET)
-        client.close()
-        s3fs.S3FileSystem.clear_instance_cache()
         yield
     finally:
-        # Cleanup must always run, even if the moto-api reset POST fails: stopping the
-        # server frees the fixed port and restoring the env avoids leaking state (and a
-        # stale AWS_ENDPOINT_URL) into the rest of the session.
+        # Reset moto state and restore AWS_ENDPOINT_URL; the shared server keeps running
+        # (the moto_server fixture stops it at session end).
         try:
-            requests.post(f"{S3_ENDPOINT}moto-api/reset")
+            requests.post(f"{moto_server}moto-api/reset")
         finally:
-            for key, value in prev_env.items():
-                if value is None:
-                    os.environ.pop(key, None)
-                else:
-                    os.environ[key] = value
-            server.stop()
+            if prev_endpoint is None:
+                os.environ.pop("AWS_ENDPOINT_URL", None)
+            else:
+                os.environ["AWS_ENDPOINT_URL"] = prev_endpoint
 
 
 def test_markers_attribute_is_parsed(tmp_path: Path) -> None:
diff --git a/tests/test_store/test_fsspec.py b/tests/test_store/test_fsspec.py
index 142cb3b00d..afda534e49 100644
--- a/tests/test_store/test_fsspec.py
+++ b/tests/test_store/test_fsspec.py
@@ -1,7 +1,6 @@
 from __future__ import annotations
 
 import json
-import os
 import re
 from typing import TYPE_CHECKING, Any
 
@@ -10,6 +9,7 @@
 from packaging.version import parse as parse_version
 
 import zarr.api.asynchronous
+from tests.conftest import MOTO_ENDPOINT_URL
 from zarr import Array
 from zarr.abc.store import OffsetByteRequest
 from zarr.core.buffer import Buffer, cpu, default_buffer_prototype
@@ -50,31 +50,23 @@
 fsspec = pytest.importorskip("fsspec")
 s3fs = pytest.importorskip("s3fs")
 requests = pytest.importorskip("requests")
-moto_server = pytest.importorskip("moto.moto_server.threaded_moto_server")
-moto = pytest.importorskip("moto")
+# Skip this module entirely when moto is absent; the server itself comes from the shared
+# `moto_server` fixture in tests/conftest.py.
+pytest.importorskip("moto")
 botocore = pytest.importorskip("botocore")
 
 # ### amended from s3fs ### #
 test_bucket_name = "test"
 secure_bucket_name = "test-secure"
-port = 5555
-endpoint_url = f"http://127.0.0.1:{port}/"
+# The moto server itself is the session-scoped `moto_server` fixture in tests/conftest.py;
+# this module reuses its endpoint rather than standing up its own server.
+endpoint_url = MOTO_ENDPOINT_URL
 
 
-@pytest.fixture(scope="module")
-def s3_base() -> Generator[None, None, None]:
-    # writable local S3 system
-
-    # This fixture is module-scoped, meaning that we can reuse the MotoServer across all tests
-    server = moto_server.ThreadedMotoServer(ip_address="127.0.0.1", port=port)
-    server.start()
-    if "AWS_SECRET_ACCESS_KEY" not in os.environ:
-        os.environ["AWS_SECRET_ACCESS_KEY"] = "foo"
-    if "AWS_ACCESS_KEY_ID" not in os.environ:
-        os.environ["AWS_ACCESS_KEY_ID"] = "foo"
-
-    yield
-    server.stop()
+@pytest.fixture
+def s3_base(moto_server: str) -> str:
+    """Reuse the shared session-scoped moto server (see tests/conftest.py)."""
+    return moto_server
 
 
 def get_boto3_client() -> botocore.client.BaseClient:

From f6c587a701d4572fdf03b6588a0f81307c7c8af7 Mon Sep 17 00:00:00 2001
From: Davis Vann Bennett <davis.v.bennett@gmail.com>
Date: Fri, 29 May 2026 15:58:25 +0200
Subject: [PATCH 25/25] docs: document the exec vs test code-block distinction
 for contributors

The contributing guide explained exec="true" but said nothing about test="true",
the exec="false"+reason opt-out, the guard that requires one of them, or the
placement constraint -- so a contributor could write a bare block and hit the guard
with no explanation. Add a "Validating code blocks: exec vs test" section covering:

- exec="true" (build-render) vs test="true" (validate-only) and when to use each
- the exec="false" reason="..." opt-out and the test_no_unvalidated_blocks guard
- markers="gpu"/"s3" for infra-bound blocks
- the placement rule (test-only blocks come last) + --strict CI

Attribute examples are shown as inline code rather than nested ```python fences, so
pytest-examples' find_examples never mistakes the teaching examples for real blocks
(verified: only the two genuine blocks in contributing.md are collected).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/contributing.md | 58 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)
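
Reviewer note (below the `---`, so not part of the commit message): the three
declared states the new section describes can be sketched as a single
classification over a fence's attributes. This is a hedged illustration only:
`classify_block` is a hypothetical name, and the real checks in
`tests/test_docs.py` (`_is_tested`, `test_no_unvalidated_blocks`) may differ in
detail, for example in how they report offenders.

```python
def classify_block(settings: dict[str, str]) -> str:
    """Return 'validated', 'opted-out', or 'illegal' for one fence's attributes.

    Hypothetical sketch of the rule in the contributing guide, not the repo's guard.
    """
    # exec="true" (build-rendered) or test="true" (test-suite only) both count as validated.
    if settings.get("exec") == "true" or settings.get("test") == "true":
        return "validated"
    # The opt-out must be explicit and carry a non-empty reason.
    if settings.get("exec") == "false" and settings.get("reason", "").strip():
        return "opted-out"
    # Bare fences and typos such as exec="on" fall through to here.
    return "illegal"
```

Under this sketch a bare fence (`{}`) and `{"exec": "on"}` are both "illegal",
which matches the two failure cases the guide calls out.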

diff --git a/docs/contributing.md b/docs/contributing.md
index a37768b815..e4906f6db5 100644
--- a/docs/contributing.md
+++ b/docs/contributing.md
@@ -253,6 +253,64 @@ renders as:
 print("Hello world")
 ```
 
+#### Validating code blocks: `exec` vs `test`
+
+Every Python code block in the documentation is checked by a test
+(`tests/test_docs.py`) so that examples cannot quietly rot — the bug that motivated
+this was an example calling `zarr.create_array(..., mode="w")`, an argument that does
+not exist, which went unnoticed because nothing ran it. A block declares *how* it is
+validated using one of two independent attributes:
+
+  - **`exec="true"`** — Markdown Exec runs the block **at docs-build time to render its
+    output** into the page. This is the attribute described above; it is also what the
+    test suite executes. Use it for ordinary examples whose output should appear in the
+    docs.
+  - **`test="true"`** — the block is **run by the test suite only**, *not* at build time.
+    Use this for an example that should be validated but cannot run in the docs-build
+    environment — for example one that needs a GPU or a cloud backend. Markdown Exec
+    leaves a `test="true"` block as a static, syntax-highlighted snippet (it never
+    executes it), while the test suite still runs it (see the marker note below).
+
+A block may carry both (`exec="true" test="true"`), though in practice `exec="true"`
+already implies it is tested, so you rarely need `test="true"` alongside it.
+
+The two attributes are kept separate on purpose: `exec=` controls *build-time rendering*
+and `test=` controls *test-time validation*. Tagging a GPU/cloud example `exec="true"`
+would make `mkdocs build` try to run it on a machine without that infrastructure and fail
+the build; `test="true"` lets it be validated without being built.
+
+##### Opting a block out of validation
+
+A handful of blocks cannot run because they are not executable Python: a REPL
+transcript, a deliberately incorrect "before" snippet, a `--8<--` file include. Mark
+these explicitly by opening the fence with
+`exec="false" reason="REPL output transcript, not executable source"` (supply a reason
+that fits the block).
+
+`exec="false"` with a non-empty `reason` is an explicit, greppable opt-out. A test
+(`test_no_unvalidated_blocks`) requires **every** Python block to be one of `exec="true"`,
+`test="true"`, or `exec="false"` with a reason, so a block can never silently skip
+validation. A bare ` ```python ` fence, or a typo like `exec="on"`, fails that test.
+
+##### Marker-bound blocks (GPU, S3)
+
+A `test="true"` block that needs special infrastructure declares a pytest marker with
+`markers="..."`, which binds it to that infrastructure in the test suite:
+
+  - `markers="gpu"` — runs where `cupy` is installed (the GPU CI environment, selected
+    with `pytest -m gpu`); skipped elsewhere via `pytest.importorskip("cupy")`.
+  - `markers="s3"` — run against a mock S3 (moto) backend supplied by a test fixture, so
+    the example can use a bare `s3://…` URL with no test-only connection details on show.
+
+##### Placement of `test="true"` blocks
+
+Because Markdown Exec does not execute a `test="true"` (or `exec="false"`) block, a later
+`exec="true"` block on the same page that depends on state from it can fail at build
+time. Put `test="true"` blocks **after** all `exec="true"` blocks on the
+page (or on a page where they are the only Python block). The `test_test_only_blocks_come_last`
+test enforces this, and the CI docs build runs with `--strict` so any such breakage fails
+the build rather than passing as a warning.
+
 #### Building documentation without executing code blocks
 
 Sometimes, you may want the documentation to build quicker. You can disable code block execution by commenting out the [markdown-exec plugin](https://github.com/zarr-developers/zarr-python/blob/884a8c91afcc3efe28b3da952be3b85125c453cb/mkdocs.yml#L132) in the mkdocs configuration file. This will make code blocks and cross references render incorrectly (i.e., expect build warnings), but also reduces build time by ~3x. Be sure to undo the commenting out before opening your pull request.