feat(init): add Cloudflare DO agent transport with server-controlled split #1180
betegon wants to merge 3 commits into
…split
New reconnecting WebSocket client that runs sentry init against the Durable Object agent, reusing the existing local tools + interactive prompts. Transport is chosen server-side via /route, defaulting to the existing Mastra path, so the canary percentage changes without a CLI release. Older CLIs are unaffected.
Co-authored-by: Cursor <cursoragent@cursor.com>
Codecov Results 📊
❌ Patch coverage is 11.69%. Project has 5392 uncovered lines.
Files with missing lines (3)
Coverage diff
@@ Coverage Diff @@
## main #PR +/-##
==========================================
- Coverage 81.62% 81.05% -0.57%
==========================================
Files 408 409 +1
Lines 28227 28458 +231
Branches 18384 18569 +185
==========================================
+ Hits 23040 23066 +26
- Misses 5187 5392 +205
- Partials 1886 1894 +8
Generated by Codecov Action
const base = (args.agentDoUrl ?? INIT_AGENT_DO_URL).replace(
  TRAILING_SLASHES_RE,
  ""
);
const url = `${base}/agents/sentry-init-agent/${runId}`;
Server-supplied agentDoUrl used as WebSocket target without scheme validation, risking auth-token exposure over cleartext
resolveTransport() returns agentDoUrl from the /route JSON body with no validation, and runViaAgentDO uses it verbatim to build the WebSocket URL (args.agentDoUrl ?? INIT_AGENT_DO_URL). There is no check that the value uses the secure wss:// scheme. connectOnce then sends the Sentry auth token as an Authorization: Bearer header to whatever host/scheme the URL specifies. If the server returns (or is misconfigured to return) a ws:// URL, the bearer token is transmitted in cleartext and can be captured by an on-path network attacker. This is a downgrade/defense-in-depth weakness independent of server trust. Note the escalated 'arbitrary code execution' concern is largely inherent to the intended design: a server that routes a run to the DO transport already drives local tool execution (run-commands via executeTool) over the legitimate connection, so a compromised/malicious /route server is already fully trusted in this flow; the distinct, avoidable risk here is the lack of a secure-scheme (and ideally host allowlist) check before the token is sent.
Evidence
- resolveTransport() (wizard-runner.ts:464-469) casts the /route body as { agentDoUrl?: string } and returns data.agentDoUrl with no scheme or host validation.
- runViaAgentDO (agent-do-runner.ts:627) uses args.agentDoUrl ?? INIT_AGENT_DO_URL directly to form the WS URL; a ws:// value bypasses TLS.
- connectOnce (agent-do-runner.ts:387-390) attaches headers: { Authorization: 'Bearer ${token}' } to the WebSocket regardless of scheme, so a cleartext ws:// target leaks the token.
- The /route request itself is TLS-protected (customFetch over the https-normalized base), so injecting a malicious URL requires a compromised server; that same server can already issue run-commands, so the RCE angle is not a new privilege. The concrete avoidable defect is the missing wss:// enforcement.
Also found at 3 additional locations
- src/lib/init/wizard-runner.ts:466-469
- src/lib/init/wizard-runner.ts:49
- src/lib/init/wizard-runner.ts:934
Identified by Warden find-bugs, security-review · YJR-CDF
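One possible shape for the check this comment asks for is sketched below. It is not the PR's code: the function name assertSecureAgentUrl and the allowlisted host init-agent.example.com are hypothetical, and accepting https: alongside wss: is an assumption about how the base URL is written. The idea is to reject the server-supplied value before any token-bearing connection is attempted.

```typescript
// Hypothetical allowlist; the real set of trusted hosts is not stated in this PR.
const ALLOWED_AGENT_HOSTS: ReadonlySet<string> = new Set(["init-agent.example.com"]);

/**
 * Validates a server-supplied agent URL before it is used as a WebSocket target.
 * Returns the URL with trailing slashes removed, or throws if the scheme is not
 * TLS-protected or the host is not allowlisted.
 */
function assertSecureAgentUrl(
  raw: string,
  allowedHosts: ReadonlySet<string> = ALLOWED_AGENT_HOSTS
): string {
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    throw new Error(`agentDoUrl is not a valid URL: ${raw}`);
  }
  // Assumption: only TLS schemes are acceptable, so ws:// and http:// are rejected.
  if (parsed.protocol !== "wss:" && parsed.protocol !== "https:") {
    throw new Error(`agentDoUrl must use wss:// or https://, got ${parsed.protocol}//`);
  }
  if (!allowedHosts.has(parsed.hostname)) {
    throw new Error(`agentDoUrl host is not allowlisted: ${parsed.hostname}`);
  }
  const path = parsed.pathname.replace(/\/+$/, "");
  return `${parsed.protocol}//${parsed.host}${path}`;
}
```

Called on data.agentDoUrl inside resolveTransport(), a failed check could fall back to the default transport rather than abort the run; which of the two is preferable is a design decision for the PR author.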
If the first WebSocket connect failed before `open` (e.g. a transient failure on the upgrade), the reconnect loop still marked start as sent, so the retry never sent `start` and the run hung. Track whether start was actually sent and emit it on the first socket that opens.
Co-authored-by: Cursor <cursoragent@cursor.com>
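The fix described in this commit can be illustrated with a minimal sketch. It does not reproduce agent-do-runner.ts: the OpenSocket interface, the runWithReconnect name, the { type: "start" } message shape, and the attempt limit are all hypothetical. What it shows is the invariant: the flag flips only after a socket has actually opened and the message has been sent.

```typescript
// Hypothetical stand-in for an already-open WebSocket.
interface OpenSocket {
  send(message: string): void;
  /** Resolves when the socket ends: "done" if the run finished, "dropped" otherwise. */
  ended: Promise<"done" | "dropped">;
}

/** Hypothetical connector: resolves once the socket is open, rejects if the upgrade fails. */
type Connect = () => Promise<OpenSocket>;

async function runWithReconnect(connect: Connect, maxAttempts = 5): Promise<boolean> {
  let startSent = false;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // A connect that fails before open yields undefined and leaves startSent untouched.
    const socket = await connect().catch(() => undefined);
    if (!socket) continue;

    if (!startSent) {
      // First socket that actually opened: send start exactly once.
      socket.send(JSON.stringify({ type: "start" }));
      startSent = true;
    }
    if ((await socket.ended) === "done") return true;
    // Otherwise the socket dropped mid-run: loop and reconnect without resending start.
  }
  return false;
}
```

A real loop would also back off between attempts; that is omitted here to keep the flag handling visible.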
Summary
Adds a second transport for sentry init: a reconnecting WebSocket client that talks to the new Cloudflare Durable Object agent (server side: getsentry/cli-init-api#187). Which transport a run uses is decided server-side via /route, so the canary split changes without shipping a new CLI. The default and fallback is the existing Mastra path, so this is a no-op for current users until the split is turned up.

Older CLIs never call /route, so they are unaffected; they keep using Mastra.

Architecture
This diagram is shared with the server PR.
flowchart TB
  U["Developer runs: sentry init"]
  subgraph CLI["CLI (sentry binary, Node)"]
    PF["preflight: org / project / team + Sentry auth"]
    RT["resolveTransport(): GET /route"]
    ADR["agent-do-runner: reconnecting WebSocket client"]
    LT["local tools + prompts (executeTool, handleInteractive)"]
  end
  subgraph NET["Transport"]
    WS["WebSocket - live and resilient (client pings + reconnect)"]
    LEGACY["HTTP suspend/resume (Mastra, existing)"]
  end
  subgraph SRV["Server: Cloudflare Worker (agents SDK)"]
    RTE["/route: server-controlled canary split (AGENT_DO_PERCENT)"]
    DO["SentryInitAgent: Durable Object"]
    AG["one agent: analyze + gates + implement + verify"]
  end
  subgraph INF["Infra and libraries"]
    SQL["DO embedded SQLite: durable phase + pending (no D1)"]
    GW["Vercel AI Gateway via ai SDK -> LLM"]
    SAPI["Sentry API: ensure project + DSN"]
    MASTRA["Mastra worker + D1 (existing path)"]
  end
  U --> PF --> RT
  RT -->|"decision"| RTE
  RT -->|"agent-do"| ADR
  RT -->|"mastra"| LEGACY
  ADR <-->|"tool-request / prompt / result"| WS
  WS <--> DO
  DO --> AG
  AG -->|"LLM turns"| GW
  AG -->|"ensure project"| SAPI
  DO --> SQL
  ADR -.->|"runs tools locally, returns results"| LT
  LEGACY <--> MASTRA

Changes
- agent-do-runner.ts (new): the WebSocket client. Reuses the existing executeTool registry and handleInteractive prompts, reconnects on drop (deploys / laptop sleep), and sends client-side WebSocket pings for idle keepalive (Cloudflare auto-answers pong without waking the DO).
- wizard-runner.ts: resolveTransport() asks the server /route and obeys; runViaAgentDO() drives the run. The Mastra suspend/resume path is untouched.
- SENTRY_INIT_AGENT_DO=1|0 is a manual override; SENTRY_INIT_AGENT_DO_PERCENT is a local fallback if /route is unreachable.
- ws (bundled into the single-file build).

Test plan
- /verify-sentry confirmed real errors + traces for node-express and nextjs.
- /route says so or is unreachable).

Paired with getsentry/cli-init-api#187 (server side).
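For reviewers, the transport selection described under Changes can be read as a small decision function. The sketch below is an interpretation, not the code in wizard-runner.ts: the chooseTransport name, the TransportInputs shape, the precedence of the manual override over /route, and the use of a 0-100 roll for the local percentage are all assumptions.

```typescript
type Transport = "mastra" | "agent-do";

// Hypothetical input bundle; the real resolveTransport() reads these itself.
interface TransportInputs {
  /** Value of SENTRY_INIT_AGENT_DO, if set ("1" or "0"). */
  override?: string;
  /** Value of SENTRY_INIT_AGENT_DO_PERCENT, if set. */
  fallbackPercent?: string;
  /** Decision returned by /route, or undefined if /route was unreachable. */
  routeDecision?: Transport;
  /** Number in [0, 100), e.g. Math.random() * 100; injected so the choice is testable. */
  roll: number;
}

function chooseTransport(inputs: TransportInputs): Transport {
  // Assumption: the manual override wins over everything else.
  if (inputs.override === "1") return "agent-do";
  if (inputs.override === "0") return "mastra";

  // The server's decision is obeyed whenever /route answered.
  if (inputs.routeDecision) return inputs.routeDecision;

  // /route unreachable: apply the local percentage, if one is configured.
  const percent = Number(inputs.fallbackPercent);
  if (Number.isFinite(percent) && inputs.roll < percent) return "agent-do";

  // Default and fallback is the existing Mastra path.
  return "mastra";
}
```

With no environment variables set and /route unreachable, this returns "mastra", which matches the stated default for current users.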