feat(policy): add review substrate and reviewer-agent approval path #1434

## Problem Statement

The agent-driven policy loop in #1062 supports agent-authored policy proposals and human approval through CLI/TUI reviewer surfaces. For the MVP to represent a state-of-the-art agent policy architecture, it also needs a trusted control-plane review path that can evaluate submitted OpenShell policy chunks and either recommend or record approve/deny decisions with guidance.

The product goal is agent-speed policy evolution for changes that are small, well-scoped, and mechanically/contextually justified. Some policy actions should be eligible for automatic approval when deterministic checks, AI review, and configured approval policy all agree. Larger, ambiguous, conflicting, or high-risk changes should continue to route to a human.

This is not sandbox self-approval. The proposing in-sandbox agent can submit proposals and read outcomes, but it cannot act as its own reviewer, directly approve or deny chunks, bypass configured approval policy, or mutate live policy. The reviewer agent runs in the trusted OpenShell control-plane path, alongside gateway-side prover validation, and writes decisions only through the existing draft chunk approval/rejection APIs with explicit provenance.

This should preserve the existing reviewer UX:

- Approve
- Deny
- Deny with guidance

The reviewer agent should not introduce a parallel action such as “approve recommendation.” It should either annotate the existing review surface or, in an explicitly enabled automatic mode, take the same approve/deny actions a human reviewer can take.

## MVP Scope

This issue is part of the #1062 locked MVP scope. It should carry the smallest useful review primitives and the first control-plane AI reviewer implementation unless a future implementation split becomes obviously necessary.

The MVP needs both kinds of review signal:

- Deterministic control-plane checks: schema validation, static safety checks, prover findings, scope classification.
- AI/control-plane reviewer judgment: whether the requested capability matches the observed denial, whether the proposal suspiciously broadens host/path/binary scope, whether the agent rationale is coherent, and what guidance should be returned on denial.

These signals must be displayed and audited separately. An AI reviewer recommendation is not a proof result; a prover pass is not an intent judgment.

## Simplicity Constraints

Keep this simple enough to understand in one sitting and modify while we experiment.

Prefer mechanisms that compute from general context rather than hand-coded policy advice:

- current policy + proposed merge operations;
- structured deny body and recent shorthand/OCSF audit lines;
- prover/scope-check output;
- concise reviewer prompt/context;
- concise reviewer decision + reason + guidance.

Avoid building a heavy evidence ontology, approval DSL, normalized review database, or service-specific intent taxonomy for the MVP. Reuse existing `PolicyChunk`/draft chunk state, existing `validation_result` and `rejection_reason` style strings, and existing approval/rejection paths where possible. Add new proto or database fields only when the simple path cannot support a real experiment.

The design should feel more like the shorthand OCSF log path than a full-fidelity event-feed parser: enough structure for software to route decisions, enough plain text for humans and AI reviewers to reason over, and no more.
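As a rough illustration of "enough structure for software, enough plain text for reviewers," the sketch below assembles a small context object and renders it as text. Everything here is hypothetical: `ReviewContext`, its field names, and `render_context` are illustrative stand-ins, not existing OpenShell types, and the inputs are assumed to come from existing chunk, policy, prover, and log state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReviewContext:
    """Hypothetical container; field names are illustrative, not OpenShell's."""
    sandbox: str
    proposed_ops: List[str]
    deny_summary: str
    prover_result: str
    audit_lines: List[str] = field(default_factory=list)
    rationale: str = ""


def render_context(ctx: ReviewContext, max_audit_lines: int = 5) -> str:
    """Render a concise plain-text view for a human or AI reviewer."""
    # Keep only the most recent audit lines so the context stays short.
    recent = ctx.audit_lines[-max_audit_lines:] if max_audit_lines > 0 else []
    lines = [
        f"sandbox: {ctx.sandbox}",
        f"denied: {ctx.deny_summary}",
        f"prover: {ctx.prover_result}",
        f"rationale: {ctx.rationale or '(none given)'}",
        "proposed operations:",
        *[f"  - {op}" for op in ctx.proposed_ops],
        "recent audit:",
        *[f"  {line}" for line in recent],
    ]
    return "\n".join(lines)
```

The structured fields stay available for routing, while the rendered text is what a reviewer reads. Assembling it on demand avoids persisting a separate evidence record.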

## Minimal Review Primitive

The primitive to build is:

> A pending capability request is reviewed by small control-plane checks and reviewers that each return a concise verdict, and OpenShell applies a simple configured approval rule to decide approve, deny, or needs-human.

This unlocks realistic experimentation with what can be automated through intent-based policy, formal methods, and AI review. It should support:

- **Proposal context:** assembled on demand from existing chunk fields, proposed operations, current effective policy, denied-request evidence, recent audit context, and claimed intent.
- **Deterministic verdict:** concise result text from schema validation, prover result, and simple scope checks.
- **Reviewer verdict:** small JSON shape with stable fields only: `decision`, `reason`, optional `guidance`, optional `evidence_refs`.
- **Approval rule:** readable code or simple config, not a new policy language. Examples: auto-deny high/critical prover failures; needs-human for L4 or wildcard scope; auto-approve only when exact observed L7 request match + prover pass + AI approve + mode permits.
- **Provenance:** existing approval/rejection/audit paths identify whether the decision came from a human or reviewer agent.
- **Replay fixtures:** canned proposals that can be run through prover + AI review + approval rule so we can learn which approvals are safe to automate.
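The reviewer verdict shape above could be parsed along these lines. This is a minimal sketch, not the implementation: the `ReviewerVerdict` and `parse_verdict` names are hypothetical, the decision values are taken from the reviewer outcomes listed under Proposed Design, and falling back to `needs_human` on malformed output is an assumption chosen so that incomplete AI signals route to a human.

```python
import json
from dataclasses import dataclass
from typing import Optional, Tuple

# Decision values as listed in the proposed flow.
ALLOWED_DECISIONS = {"approve", "deny", "deny_with_guidance", "needs_human"}


@dataclass(frozen=True)
class ReviewerVerdict:
    decision: str
    reason: str
    guidance: Optional[str] = None
    evidence_refs: Tuple[str, ...] = ()


def parse_verdict(raw: str) -> ReviewerVerdict:
    """Parse reviewer output; anything malformed becomes needs_human (assumed policy)."""
    fallback = ReviewerVerdict("needs_human", "reviewer output was not a valid verdict")
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return fallback
    if not isinstance(data, dict):
        return fallback

    decision = data.get("decision")
    reason = data.get("reason")
    if not isinstance(decision, str) or decision not in ALLOWED_DECISIONS:
        return fallback
    if not isinstance(reason, str) or not reason.strip():
        return fallback

    guidance = data.get("guidance")
    if guidance is not None and not isinstance(guidance, str):
        return fallback
    # A deny_with_guidance verdict without guidance is treated as incomplete.
    if decision == "deny_with_guidance" and not guidance:
        return fallback

    refs = data.get("evidence_refs") or []
    if not isinstance(refs, list) or not all(isinstance(r, str) for r in refs):
        return fallback
    return ReviewerVerdict(decision, reason, guidance, tuple(refs))
```

Unknown extra keys are ignored rather than rejected, which keeps the shape easy to extend during experiments without breaking older fixtures.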

## Implementation Guidance

Build this immediately after #1097 wires in prover validation and before larger reviewer-inbox polish. The order should be:

1. Produce deterministic evidence for each proposal (#1097).
2. Assemble minimal review context from existing chunk/policy/log state in this issue.
3. Add the small reviewer verdict shape and simple approval rule.
4. Plug in a control-plane AI reviewer using that context/verdict shape.
5. Expose advisory recommendations first, then automatic approve/deny only when explicitly enabled and permitted by the approval rule.
6. Render the richer evidence in #1098.

Avoid creating additional GitHub issues unless a tiny persistence/proto addition naturally becomes an independently reviewable PR. Do not create a separate planning issue for an abstract substrate.

## Proposed Design

Add an OpenShell-native reviewer-agent path for pending policy chunks.

Suggested flow:

1. Main agent hits `policy_denied`.
2. Main agent submits a policy proposal to `policy.local`.
3. Gateway stores the proposal as a pending draft chunk.
4. Gateway assembles a concise review context from existing proposal, policy, prover, and recent log state.
5. Gateway/control-plane reviewer receives structured context:
   - sandbox name
   - proposed host, port, protocol, method/path, binary
   - agent rationale and claimed intent
   - validation/prover result
   - recent deny/proposal audit snippets
   - prior rejection guidance for related chunks
6. Reviewer agent returns one of:
   - `approve`
   - `deny`
   - `deny_with_guidance`
   - `needs_human`
7. A simple approval rule combines deterministic evidence, reviewer verdict, and configured mode into one of:
   - advisory recommendation only
   - automatic approve
   - automatic deny with guidance
   - needs human
8. If automatic reviewer mode is explicitly enabled and the approval rule permits it, OpenShell records the approve/deny decision with reviewer-agent provenance through the existing draft chunk paths.
9. If advisory reviewer mode is enabled, the TUI/CLI shows the reviewer recommendation, but the human still chooses Approve, Deny, or Deny with guidance.
10. Main agent’s `/wait` wakes and continues or redrafts.
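Step 7 can be sketched as a single readable function. The names (`Mode`, `Outcome`, `Evidence`, `apply_rule`) and the severity scale are hypothetical. The first three rules mirror the examples given under Minimal Review Primitive; treating a reviewer denial as an automatic deny in automatic mode, and routing every other combination to a human, are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    ADVISORY = "advisory"
    AUTOMATIC = "automatic"


class Outcome(Enum):
    ADVISORY_ONLY = "advisory_only"
    AUTO_APPROVE = "auto_approve"
    AUTO_DENY = "auto_deny_with_guidance"
    NEEDS_HUMAN = "needs_human"


@dataclass(frozen=True)
class Evidence:
    prover_passed: bool
    prover_severity: str      # assumed scale: "none", "low", "medium", "high", "critical"
    exact_l7_match: bool      # proposal matches the exact observed L7 request
    is_l4_or_wildcard: bool   # L4-only or wildcard scope


def apply_rule(evidence: Evidence, reviewer_decision: str, mode: Mode) -> Outcome:
    # Advisory mode never records a decision; the human still chooses.
    if mode is Mode.ADVISORY:
        return Outcome.ADVISORY_ONLY
    # Auto-deny high/critical prover failures.
    if evidence.prover_severity in ("high", "critical"):
        return Outcome.AUTO_DENY
    # L4 or wildcard scope always goes to a human.
    if evidence.is_l4_or_wildcard:
        return Outcome.NEEDS_HUMAN
    # Assumption: a reviewer denial is recorded automatically in automatic mode.
    if reviewer_decision in ("deny", "deny_with_guidance"):
        return Outcome.AUTO_DENY
    # Auto-approve only when every signal agrees.
    if evidence.exact_l7_match and evidence.prover_passed and reviewer_decision == "approve":
        return Outcome.AUTO_APPROVE
    # Conflicting or incomplete signals preserve the needs-human outcome.
    return Outcome.NEEDS_HUMAN
```

Keeping the rule as ordinary code with a conservative final branch means a new or unexpected signal combination defaults to human review instead of an automatic decision.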

## UX Direction

The TUI should keep the same primary actions:

```text
[a] Approve
[d] Deny
[g] Deny with guidance
```

Reviewer-agent output appears as context, not as a new button:

```text
Reviewer agent: recommends deny with guidance
Reason: proposed path is broader than the denied request. Scope to /repos/org/repo/contents/demo-runs/<file>.md only.
```

In automatic mode, the event should read as a real decision:

```text
Denied by reviewer agent
Guidance: scope to the exact GitHub Contents API path only.
```

## Security Requirements

- Reviewer agent runs in the trusted OpenShell control-plane path, not inside the proposing sandbox.
- The proposing in-sandbox agent cannot act as its own reviewer, directly approve or deny chunks, bypass approval policy, or mutate live policy.
- Reviewer agent must base approvals on structured scope, not on natural-language rationale alone.
- Broad host/path/binary proposals should be denied or escalated to a human.
- Reviewer decisions must be logged with provenance.
- Reviewer agent must not receive raw secrets.
- Denial guidance must round-trip to the main agent through `/wait`.
- Reviewer-agent denials should instruct the main agent not to pursue the same access via workaround or policy circumvention.
- Add loop protection/backoff for repeated denials on the same proposal shape.
- Human review remains available.
- Automatic reviewer mode must be explicitly enabled and constrained by approval policy; advisory mode remains available.
- Approval policy must preserve a needs-human outcome when deterministic and AI signals conflict or are incomplete.
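The loop protection requirement could look roughly like the sketch below: repeated denials of the same proposal shape push the next allowed submission further out. The shape fields, the `shape_key` and `DenialBackoff` names, and the exponential schedule with a cap are all assumptions for illustration; the issue does not prescribe a backoff algorithm.

```python
import hashlib


def shape_key(host: str, port: int, method: str, path: str, binary: str) -> str:
    """Stable key for a proposal shape (hypothetical choice of fields)."""
    raw = "\x1f".join([host.lower(), str(port), method.upper(), path, binary])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class DenialBackoff:
    """In-memory exponential backoff per proposal shape."""

    def __init__(self, base_seconds: float = 5.0, cap_seconds: float = 300.0):
        self.base = base_seconds
        self.cap = cap_seconds
        self._denials = {}  # key -> (denial count, earliest next attempt)

    def record_denial(self, key: str, now: float) -> float:
        """Record a denial and return the delay before the next attempt."""
        count, _ = self._denials.get(key, (0, 0.0))
        count += 1
        delay = min(self.base * 2 ** (count - 1), self.cap)
        self._denials[key] = (count, now + delay)
        return delay

    def retry_allowed(self, key: str, now: float) -> bool:
        """True if this shape has no pending backoff window."""
        entry = self._denials.get(key)
        return entry is None or now >= entry[1]
```

Passing `now` explicitly instead of reading the clock inside the class keeps the behavior deterministic, which suits the replay fixtures described in this issue.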

## Relationship To Existing Work

Parent: #1062

Related:

- #1097 — persist and validate agent policy proposal operations
- #1098 — render policy proposal inbox metadata
- #1099 — suggest L7 scoping for known REST hosts
- #1435 — define agent-oriented policy modes and safety levels

This issue is about the reviewer path and minimal review primitives for submitted policy chunks. It is not an LLM PolicyAdvisor replacement and should not generate proposals on behalf of the main agent.

## Definition of Done

- [ ] Assemble reviewer context from existing chunk, policy, prover, denial, and log state.
- [ ] Keep reviewer context concise and inspectable; avoid a new evidence ontology.
- [ ] Define reviewer-agent decision schema with only stable fields: decision, reason, optional guidance, optional evidence refs.
- [ ] Define simple approval-rule outcomes for advisory-only, automatic approve, automatic deny, and needs-human.
- [ ] Define advisory vs automatic reviewer modes.
- [ ] Add reviewer-agent provenance for approvals and denials using existing draft chunk paths where possible.
- [ ] Ensure reviewer decisions can only run through trusted control-plane paths, never from the proposing sandbox.
- [ ] TUI/CLI can show reviewer recommendation without adding a new primary action.
- [ ] Automatic mode can record approve/deny decisions through the existing draft chunk paths when approval policy permits it.
- [ ] Rejections include guidance that round-trips to the main agent through `/wait`.
- [ ] OCSF events distinguish human approval from reviewer-agent approval/denial.
- [ ] Add replay fixtures for canned proposals through prover + AI review + approval rule.
- [ ] Docs explain reviewer-agent mode, risks, trust boundary, approval policy, and human override semantics.
