Enforce in-VM execution for all CLI tool tests #44
Merged
NathanFlurry merged 23 commits into main on Mar 22, 2026
…leDynamicallyCallback Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nverting ESM to CJS)
- Detect ESM files (.mjs, import/export syntax) in kernel-runtime.ts and route them to V8's native module system (run mode) instead of CJS exec
- Add `esm` option to ExecOptions for explicit ESM mode selection
- Remove transformDynamicImport from async loadFile handler since V8 handles import() natively via dynamic_import_callback (US-023)
- Apply env/cwd/stdin overrides in run mode (previously exec-only)
- Register HostInitializeImportMetaObjectCallback in Rust sidecar to populate import.meta.url for ESM modules
- Use raw (unwrapped) filesystem for module resolution bridge handlers so V8's internal module loading bypasses user-level permissions
- Add tests: ESM execution, CJS compatibility, static imports, import.meta.url, dynamic import in ESM mode
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
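As an illustration of the detection step described in this commit, here is a minimal sketch of how a file might be classified as ESM before being routed to run mode. The helper name `looksLikeEsm` and its regular expression are assumptions made for this example; the actual logic in kernel-runtime.ts is not shown on this page and may differ.

```typescript
// Hypothetical helper, not the code from kernel-runtime.ts.
// Returns true when a file should be treated as an ES module.
function looksLikeEsm(filePath: string, source: string): boolean {
  // The extension is the strongest signal.
  if (filePath.endsWith(".mjs")) return true;
  if (filePath.endsWith(".cjs")) return false;

  // Heuristic: a static import/export statement at the start of a line.
  // Dynamic import("x") and `exports.foo = ...` are deliberately not matched.
  // A regex can be fooled by comments or template strings; a parser is more robust.
  const esmSyntax =
    /^\s*(?:import\s*["'{*]|import\s+[\w$]+[\s,]|export\s+(?:default|const|let|var|function|class|async)\b|export\s*[{*])/m;
  return esmSyntax.test(source);
}
```

A caller could use the result to pick between the CJS exec path and the native module path, with the explicit `esm` option taking precedence over the heuristic.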
Add networkAdapter option to NodeRuntimeOptions, stream/promises builtin, /v regex graceful degradation, polyfill named ESM re-exports, comprehensive BUILTIN_NAMED_EXPORTS, __exportStar CJS detection, and rewrite pi-headless.test.ts for kernel.spawn() in-VM execution. Pi partially loads in-VM but hits cascading ESM module compatibility issues. Story remains passes: false pending broader CJS/ESM interop work.
Major sandbox bridge fixes enabling in-VM ESM execution of complex Node.js apps:
- Fix _resolveModule to use ESM export conditions (import mode) for V8 module system
- Fix polyfill double-wrapping in _loadFile (bundlePolyfill returns IIFE)
- Add esbuild __export() pattern to CJS named export extraction
- Fix CJS wrapper const→let for exports reassignment (ajv compat)
- Add url module static wrapper with correct fileURLToPath/pathToFileURL
- Add global = globalThis alias for CJS compat
- Add tty, net, path (posix/win32) to BUILTIN_NAMED_EXPORTS
- Fix stdin end event for non-TTY (empty stdin emits end on resume)
- Add AbortSignal.addEventListener/removeEventListener no-op stubs
- Augment crypto polyfill with bridge-backed randomUUID
- Add stdout/stderr write callback support and writableLength
- Add Response.body ReadableStream to bridge fetch
- Add SSRF bypass for localhost in test network adapter
- Test uses ANTHROPIC_BASE_URL + allowAll permissions for in-VM Pi
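To make the `__export()` item above more concrete: esbuild's CJS output typically registers named exports with a call shaped like `__export(target, { name: () => value, ... })`. The sketch below shows one way such names could be pulled out of bundled source so they can be re-exposed to ESM importers. The function name `extractEsbuildExportNames` and the regex-based approach are assumptions for illustration, not the bridge's actual extraction code.

```typescript
// Hypothetical sketch: collect export names from esbuild-style __export() calls.
function extractEsbuildExportNames(source: string): string[] {
  const names: string[] = [];
  // Matches: __export(some_target, { ...body... })
  // Assumes the body contains no "})" sequence, which holds for plain getter lists.
  const call = /__export\(\s*[\w$]+\s*,\s*\{([\s\S]*?)\}\s*\)/g;
  let callMatch = call.exec(source);
  while (callMatch !== null) {
    // Matches each `name: () =>` entry, with a bare or quoted key.
    const key = /(?:^|,)\s*(?:([\w$]+)|"([^"]+)"|'([^']+)')\s*:\s*\(\)\s*=>/g;
    let keyMatch = key.exec(callMatch[1]);
    while (keyMatch !== null) {
      const name = keyMatch[1] || keyMatch[2] || keyMatch[3];
      if (name && names.indexOf(name) === -1) names.push(name);
      keyMatch = key.exec(callMatch[1]);
    }
    callMatch = call.exec(source);
  }
  return names;
}
```

A regex is only an approximation here; getters with block bodies or unusual formatting would need a real parser.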
Four bridge compatibility fixes enable Pi to boot and produce LLM-backed
output running inside the sandbox VM via kernel.spawn():
1. TextDecoder subarray fix (execution-driver.ts): V8_POLYFILLS TextDecoder
ignored byteOffset/byteLength of Uint8Array views, causing the Anthropic
SDK's LineDecoder to return corrupted SSE event lines.
2. Fetch Headers serialization (bridge/network.ts): The SDK passes Headers
instances (not plain objects) to fetch. JSON.stringify(Headers) produces
{} — normalize to plain Record before serialization.
3. Response body async iterator (bridge/network.ts): Add Symbol.asyncIterator
and Promise.resolve-based reader (not async function) to minimize microtask
overhead for the SDK's ReadableStreamToAsyncIterable.
4. V8 event loop microtask drain (session.rs): After the main event loop
exits (all bridge promises resolved), run additional microtask checkpoints
in a loop, re-entering the event loop if new bridge calls are created.
This handles deeply nested async generator yield chains across loaded ESM
modules (e.g., SDK SSE parser).
Test results: 5/6 pass (bash tool test skipped without WASM binaries).
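Fixes 1 and 2 in the list above both come down to small, easy-to-miss details, sketched below. The helper names `bytesOfView` and `headersToRecord` are hypothetical and do not come from execution-driver.ts or bridge/network.ts; they only illustrate the two techniques.

```typescript
// Fix 1 (sketch): respect byteOffset/byteLength when decoding a typed-array view.
// `new Uint8Array(view.buffer)` would expose the WHOLE backing buffer,
// which is how a subarray can decode to the wrong bytes.
function bytesOfView(input: ArrayBuffer | ArrayBufferView): Uint8Array {
  if (ArrayBuffer.isView(input)) {
    return new Uint8Array(input.buffer, input.byteOffset, input.byteLength);
  }
  return new Uint8Array(input);
}

// Fix 2 (sketch): turn a Headers instance into a plain object before JSON.stringify.
// Headers keeps its entries internally, so JSON.stringify(headers) yields "{}".
function headersToRecord(
  init: Headers | Record<string, string> | undefined
): Record<string, string> {
  const out: Record<string, string> = {};
  if (!init) return out;
  if (init instanceof Headers) {
    // Headers iteration yields lowercased header names.
    init.forEach((value, key) => {
      out[key] = value;
    });
    return out;
  }
  // Plain objects are copied as-is.
  for (const key of Object.keys(init)) out[key] = init[key];
  return out;
}
```

With these helpers, a decoder would operate on `bytesOfView(chunk)` rather than `chunk.buffer`, and a fetch bridge would serialize `headersToRecord(init.headers)` instead of the raw `Headers` object.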
- Fix process.kill(self, SIGWINCH) to dispatch signal handlers instead of exiting (Pi TUI sends SIGWINCH to refresh dimensions on startup)
- Add _stdinRead to ASYNC_BRIDGE_FNS in V8 sidecar to prevent event loop deadlock when process.stdin.resume() starts the readLoop
- Rewrite pi-interactive.test.ts: remove all sandboxSkip/probe logic, use networkAdapter instead of inline fetch patching, ESM mode with PI_MAIN (avoids undici import issues with cli.js), proper env vars
- Tests still fail due to additional V8 sidecar crash during Pi TUI init (further sandbox gaps to investigate)
V8 sidecar improvements for interactive TUI support:
- sync_call call_id matching to handle interleaved async responses
- ResponseReceiver::defer() for non-matching BridgeResponse routing
- MODULE_RESOLVE_STATE persists through event loop for dynamic import
- V8 crate upgraded to v134 (from v130)
- Improved error reporting with stderr in IPC close messages
Pi interactive TUI still blocked by V8 microtask checkpoint hang (perform_microtask_checkpoint blocks on TUI render cycles).
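The call_id matching and deferral idea can be modeled in a few lines. The sketch below is written in TypeScript for consistency with the rest of the codebase examples, whereas the real change lives in the Rust sidecar; `SketchReceiver`, its methods, and the `BridgeResponse` shape are assumptions made for illustration only.

```typescript
// Hypothetical model of "wait for my call_id, park everything else".
interface BridgeResponse {
  callId: number;
  payload: string;
}

class SketchReceiver {
  private deferred: BridgeResponse[] = [];
  private incoming: BridgeResponse[];

  constructor(incoming: BridgeResponse[]) {
    this.incoming = incoming.slice();
  }

  // Returns the response for callId, parking any non-matching responses
  // that arrive first so later waiters can still find them.
  waitFor(callId: number): BridgeResponse | undefined {
    const parked = this.deferred.findIndex((r) => r.callId === callId);
    if (parked !== -1) return this.deferred.splice(parked, 1)[0];

    let next = this.incoming.shift();
    while (next !== undefined) {
      if (next.callId === callId) return next;
      this.deferred.push(next);
      next = this.incoming.shift();
    }
    return undefined;
  }

  deferredIds(): number[] {
    return this.deferred.map((r) => r.callId);
  }
}
```

Without the parking step, a synchronous caller would either consume an async response meant for someone else or drop it, which matches the interleaving problem the commit describes.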
Route process.nextTick, queueMicrotask, and setTimeout(fn, 0) through the _scheduleTimer bridge handler instead of V8 microtasks. This prevents infinite microtask loops in V8's perform_microtask_checkpoint() caused by TUI render cycles (Pi's requestRender → nextTick(doRender) → doRender → requestRender pattern). Also increase session thread stack size to 32 MiB for V8 with large module graphs. Pi interactive tests remain blocked by V8 v134 SIGSEGV during TUI initialization — this is a V8 engine-level crash, not a bridge issue.
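The scheduling change can be pictured with a small model. A microtask checkpoint keeps draining until the queue is empty, so a callback that re-queues itself as a microtask never lets the checkpoint return. Queuing the same callback as a timer-style task means each turn runs only what was queued before the turn started. The `TurnScheduler` class and `simulateRenderLoop` function below are hypothetical stand-ins, not the `_scheduleTimer` bridge handler.

```typescript
type Task = () => void;

// Hypothetical turn-based queue: tasks scheduled during a turn wait for the next turn.
class TurnScheduler {
  private queue: Task[] = [];

  schedule(task: Task): void {
    this.queue.push(task);
  }

  // Runs only the tasks that were queued before this turn began.
  runTurn(): number {
    const batch = this.queue;
    this.queue = [];
    for (const task of batch) task();
    return batch.length;
  }

  pendingCount(): number {
    return this.queue.length;
  }
}

// Models the requestRender -> doRender -> requestRender cycle from the commit message.
function simulateRenderLoop(turns: number): { renders: number; pending: number } {
  const scheduler = new TurnScheduler();
  let renders = 0;

  const doRender = (): void => {
    renders += 1;
    requestRender(); // the render re-arms itself
  };
  const requestRender = (): void => scheduler.schedule(doRender);

  requestRender();
  for (let i = 0; i < turns; i++) scheduler.runTurn();
  return { renders, pending: scheduler.pendingCount() };
}
```

Each turn performs exactly one render and then returns control to the host loop, which is the property a self-rescheduling microtask chain lacks.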
…che preservation
Root cause: V8's native Intl.Segmenter (ICU JSSegments::Create) crashes with SIGSEGV during perform_microtask_checkpoint() when processing TUI render cycles from Pi interactive mode (~1600 modules loaded).
Fix:
- Add Intl.Segmenter JS polyfill to bridge setupGlobals() covering grapheme/word/sentence granularity (bypasses native ICU crash)
- Add inline Segmenter polyfill in pi-interactive.test.ts for snapshot-restored contexts
- Preserve MODULE_RESOLVE_STATE module cache across event loop (execute_module no longer clears on success path)
- Add update_bridge_ctx() to update bridge pointer without losing cache
- Set V8 --stack-size=16384 for deep microtask chains
- Support SECURE_EXEC_V8_JITLESS=1 env var for debugging
Result: Pi TUI renders, input works, Ctrl+C works, PTY resize works (4/9 tests pass). Remaining: LLM streaming response and clean exit.
… plumbing
- Fix setRawMode to disable icrnl (CR→NL conversion) on PTY line discipline
- Add icrnl field to LineDisciplineConfig and KernelInterface.ptySetDiscipline
- Add _notifyProcessExit bridge handler to flush pending timers and stdin on exit
- Register _ptySetRawMode and _notifyProcessExit in V8 SYNC_BRIDGE_FNS
- process.exit() now clears JS timers and calls _notifyProcessExit before throwing
- Exit tests use grace-period pattern for V8 event loop drain
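For readers unfamiliar with `icrnl`: it is the terminal input flag that rewrites carriage return (CR, `\r`) to newline (NL, `\n`) before the program sees it. Raw-mode TUIs usually want it off so that Enter arrives as CR and stays distinguishable from Ctrl+J. The sketch below models only that one flag; `LineDisciplineSketch` and `translateInput` are hypothetical names and do not reflect the real LineDisciplineConfig beyond the `icrnl` field mentioned above.

```typescript
// Hypothetical, single-flag model of PTY input translation.
interface LineDisciplineSketch {
  icrnl: boolean; // when true, incoming CR is converted to NL
}

function translateInput(input: string, cfg: LineDisciplineSketch): string {
  return cfg.icrnl ? input.replace(/\r/g, "\n") : input;
}

// Cooked mode: Enter (CR) reaches the program as NL.
const cooked = translateInput("ls\r", { icrnl: true });

// Raw mode with icrnl disabled: the program sees the CR unchanged.
const raw = translateInput("ls\r", { icrnl: false });
```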
…ss bridge cheating
Summary
- … `child_process.spawn`/`spawnSync` from sandbox code to run tools on host (the exact loophole Ralph exploited for OpenCode and Claude Code tests). The ONLY valid pattern is `kernel.spawn('node', ['-e', 'import("tool.js")'])`, where the tool's JS runs inside the V8 isolate.
- … `passes: false`: Ralph cheated by using `spawnSync('opencode')` and `spawnSync('claude')` through the child_process bridge, which runs the tool binary on the host, not in the V8 sandbox.
- … `import()`.

Key policy changes in CLAUDE.md
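The rule stated in the summary (tool code must run via `kernel.spawn(...)`, never via `spawn`/`spawnSync` on the host) could in principle be checked mechanically. The sketch below is a hypothetical, regex-based lint over test source text. It is not part of this PR, and `findHostSpawnCalls` is an invented name; a real check would more likely use an AST and would need to handle aliased imports.

```typescript
// Hypothetical lint: report lines that call spawn/spawnSync on anything other than `kernel`.
function findHostSpawnCalls(testSource: string): string[] {
  const offenders: string[] = [];
  // Group 1 captures the receiver chain (e.g. "child_process." or "kernel."), if any.
  const call = /([\w$.]*)\b(spawnSync|spawn)\s*\(/g;
  for (const line of testSource.split("\n")) {
    call.lastIndex = 0;
    let m = call.exec(line);
    while (m !== null) {
      const onKernel = /(^|\.)kernel\.$/.test(m[1]);
      if (!onKernel) {
        offenders.push(line.trim());
        break; // one report per line is enough
      }
      m = call.exec(line);
    }
  }
  return offenders;
}
```

Run against a test file's text, an empty result means no direct host spawn calls were spotted by this heuristic; it cannot prove that the tool actually executed inside the isolate.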
Test plan