FIX: per-frame watchdog so a nested timed call can't disarm the outer timeout by ursm · Pull Request #423 · rubyjs/mini_racer

ursm · 2026-05-31T07:47:22Z

Summary

The per-Context watchdog state (c->wd — mtx/cv/cancel) was shared by every v8_timedwait frame. When a timed dispatch re-enters Ruby and triggers another timed dispatch on the same Context — e.g. a host function calling eval/call again (and, with the in-flight ES module API, an instantiate resolver calling compile_module) — the inner frame's teardown did:

pthread_mutex_lock(&c->wd.mtx);
c->wd.cancel = 1;
pthread_cond_signal(&c->wd.cv);   // wakes BOTH watchdogs (shared cv)
...
c->wd.cancel = 0;                 // also written outside the mutex (data race)

The shared cancel/cv meant cancelling the inner (usually fast) call's watchdog also woke the outer frame's watchdog, which saw cancel == 1, broke out, and exited — so the outer call then ran with no timeout at all.

This is pre-existing on main; it reproduces with just a timeout and a host fn that re-enters eval:

ctx = MiniRacer::Context.new(timeout: 100)
ctx.attach("quick", proc { ctx.eval("1 + 1") })  # fast nested timed call
ctx.eval("quick(); while (true) {}")             # outer 100ms timeout never fires → spins forever

Fix

Move the watchdog into a stack-allocated struct Watchdog owned by each v8_timedwait frame (its own mtx/cv/cancel/deadline), created when the call arms a timeout and torn down before that frame returns. Cancelling an inner watchdog now touches only the inner state, so it can no longer disarm the outer one. The now-unused c->wd field and its init/teardown are dropped, and the cancel data race is gone (the flag is per-frame and only written under its own mutex). The per-frame condvar uses the same clock as deadline_ms (CLOCK_MONOTONIC off Apple).

Test

test_nested_timed_call_does_not_disarm_outer_timeout runs a fast nested eval from a host fn, then spins in the outer script, and asserts the outer timeout still fires. A safety stop turns a regression into a slow failure instead of a hang: against the shared watchdog it runs to the 5 s safety stop and fails (elapsed ~5s); with this fix it terminates in ~0.1 s.

Known follow-up (out of scope here)

TerminateExecution is isolate-global, so if an outer deadline fires while a nested call is still running, the nested handler's CancelTerminateExecution() can still clear it. Closing that requires nesting-aware termination ownership and is left for a separate change.

🤖 Generated with Claude Code

…imeout The per-Context watchdog state (c->wd: mtx/cv/cancel) was shared by every v8_timedwait frame. When a timed dispatch re-enters Ruby and triggers another timed dispatch on the same Context — a host function (or, with the module API, an instantiate resolver) calling eval/call/compile_module — the inner frame's `cancel = 1; pthread_cond_signal()` also woke the OUTER frame's watchdog, which saw the shared cancel flag, broke out, and exited. The outer call then ran with no timeout at all. (The `cancel = 0` reset was also written outside the mutex, a data race with a still-waiting watchdog.) Move the watchdog state into a stack-allocated `struct Watchdog` owned by each v8_timedwait frame and torn down before it returns, so cancelling an inner watchdog can no longer disarm the outer one. Drops the now-unused c->wd field and its init/teardown. Adds a regression test (host fn does a fast nested eval, then the outer script spins) asserting the outer timeout still fires; it fails against the shared watchdog (the spin runs to the test's safety stop) and passes here in ~0.1s. Note: a separate, narrower subtlety remains — because TerminateExecution is isolate-global, if an outer deadline fires *while* a nested call is running, the nested handler's CancelTerminateExecution can still clear it. That needs nesting-aware termination ownership and is left as a follow-up. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FIX: per-frame watchdog so a nested timed call can't disarm the outer timeout#423

FIX: per-frame watchdog so a nested timed call can't disarm the outer timeout#423
ursm wants to merge 1 commit into
rubyjs:mainfrom
ursm:fix/nested-watchdog-timeout

ursm commented May 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ursm commented May 31, 2026

Summary

Fix

Test

Known follow-up (out of scope here)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant