[libcxxabi] Use InitByteFutex for __cxa_guard in WASM Workers by AndreyRepko · Pull Request #26283 · emscripten-core/emscripten

AndreyRepko · 2026-02-17T11:41:00Z

When building with -sWASM_WORKERS, __cxa_guard_acquire uses the GlobalMutex
implementation (pthread_mutex_lock + pthread_cond_wait), but libc links the
pthread stubs, in which these are all no-ops. This is not just a performance
problem: GlobalMutex does a non-atomic read-then-write on the init byte
under a no-op lock, so two workers can both see UNSET and both become the
initializer (double initialization / undefined behavior).
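
To make the race window concrete, here is a minimal sketch that replays one bad interleaving by hand. It is not the libcxxabi source: `NoopMutex`, `sees_unset`, `claim`, and the state constants are hypothetical names chosen for illustration. The real code performs the check and the write inside one locked region, but with a no-op lock that region excludes nobody, so the sketch models the two steps separately.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical guard states, for illustration only.
constexpr std::uint8_t UNSET = 0;
constexpr std::uint8_t PENDING = 1;

// Stand-in for the stubbed pthread mutex: lock and unlock do nothing.
struct NoopMutex {
  void lock() {}
  void unlock() {}
};

// Step 1: a plain (non-atomic) read of the init byte under the no-op lock.
bool sees_unset(NoopMutex& m, const std::uint8_t& init_byte) {
  m.lock();
  bool unset = (init_byte == UNSET);
  m.unlock();
  return unset;
}

// Step 2: a plain write that claims the initializer role.
void claim(NoopMutex& m, std::uint8_t& init_byte) {
  m.lock();
  init_byte = PENDING;
  m.unlock();
}

// Replays one interleaving of two workers, A and B, and returns how many
// of them believe they are the initializer.
int initializers_in_bad_interleaving() {
  NoopMutex m;
  std::uint8_t init_byte = UNSET;
  bool a = sees_unset(m, init_byte);  // A reads UNSET
  bool b = sees_unset(m, init_byte);  // B reads UNSET before A has written
  int count = 0;
  if (a) { claim(m, init_byte); ++count; }
  if (b) { claim(m, init_byte); ++count; }
  return count;
}
```

In this interleaving both workers pass the check, which is the double-initialization case described above.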

Switch to the InitByteFutex implementation, which uses atomic CAS for
correctness. Wait/wake are no-ops, so losers spin in the CAS retry loop
rather than sleeping. We cannot use real memory.atomic.wait32 because it
traps on the main browser thread and there is no libcxxabi-compatible way
to detect the main thread (emscripten_is_main_browser_thread is JS-only).
In practice, contention on a single guard is rare and the spin is bounded
by the static constructor duration (typically sub-microsecond).
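
The CAS-and-spin idea can be sketched roughly as follows. This is a simplified illustration in portable C++, not the actual InitByteFutex code: `guard_acquire`, `guard_release`, `count_initializations`, and the state constants are hypothetical names, and the real implementation tracks more state than a single three-valued byte.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical guard states, for illustration only.
constexpr std::uint8_t UNSET = 0;
constexpr std::uint8_t PENDING = 1;
constexpr std::uint8_t COMPLETE = 2;

// Returns true if the caller won the race and must run the initializer,
// false if initialization has already completed.
bool guard_acquire(std::atomic<std::uint8_t>& init_byte) {
  for (;;) {
    std::uint8_t expected = UNSET;
    // Atomic check-and-claim: only one thread can move UNSET -> PENDING.
    if (init_byte.compare_exchange_strong(expected, PENDING,
                                          std::memory_order_acq_rel,
                                          std::memory_order_acquire)) {
      return true;
    }
    if (expected == COMPLETE) {
      return false;
    }
    // expected == PENDING: the "wait" is a no-op, so retry the CAS.
  }
}

// Publishes the initialized object. The "wake" is a no-op because the
// losers are spinning rather than sleeping.
void guard_release(std::atomic<std::uint8_t>& init_byte) {
  init_byte.store(COMPLETE, std::memory_order_release);
}

// Runs `workers` threads against one guard and counts how many of them
// ended up running the initializer.
int count_initializations(int workers) {
  std::atomic<std::uint8_t> init_byte{UNSET};
  std::atomic<int> init_count{0};
  std::vector<std::thread> pool;
  for (int i = 0; i < workers; ++i) {
    pool.emplace_back([&] {
      if (guard_acquire(init_byte)) {
        init_count.fetch_add(1, std::memory_order_relaxed);
        guard_release(init_byte);
      }
    });
  }
  for (auto& t : pool) t.join();
  return init_count.load();
}
```

Because the claim is a single atomic step, exactly one worker runs the initializer no matter how the threads interleave; the cost is that losers burn CPU until the winner stores COMPLETE.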

We hit this in production with thread_local std::optional<T> (non-trivial
destructor): the first access triggers __cxa_thread_atexit →
static DtorsManager → __cxa_guard_acquire. When multiple WASM Workers
start executing tasks simultaneously, some workers busy-spin on the global
mutex indefinitely, causing timeouts.
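
For reference, the triggering pattern looks roughly like the sketch below. `Scratch`, `tls_scratch`, and `scratch_for_this_thread` are hypothetical names, not taken from the production code; the only property that matters is that the thread_local object has a non-trivial destructor, so its first use on each thread has to register that destructor.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical payload type. The std::string member gives it a
// non-trivial destructor.
struct Scratch {
  std::string buf;
};

// One instance per thread. Since the destructor is non-trivial, the first
// use on a thread registers it for thread exit, which is the path that
// reaches the guard described above.
thread_local std::optional<Scratch> tls_scratch;

// Lazily constructs and returns the calling thread's instance.
Scratch& scratch_for_this_thread() {
  if (!tls_scratch) {
    tls_scratch.emplace();
  }
  return *tls_scratch;
}
```

If several workers make their first call to `scratch_for_this_thread()` at about the same time, they all reach the same function-local static guard together.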

Fixes #26277


Copilot

Pull request overview

This PR fixes a critical bug in WASM Workers where __cxa_guard_acquire (used for C++ static local variable initialization) can cause double initialization or indefinite busy-spinning when multiple workers trigger the same static initialization simultaneously.

Changes:

- Switch libcxxabi from GlobalMutex to InitByteFutex guard implementation for WASM Workers
- Add no-op futex wait/wake functions for Emscripten shared memory contexts
- Configure build system to use -D_LIBCXXABI_USE_FUTEX flag for WASM Workers

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tools/system_libs.py | Add `-D_LIBCXXABI_USE_FUTEX` compiler flag for WASM Workers to select the futex-based guard implementation |
| system/lib/libcxxabi/src/cxa_guard_impl.h | Implement no-op futex wait/wake functions for Emscripten with shared memory, enabling use of atomic CAS operations for thread-safe initialization |


sbc100 · 2026-02-17T20:29:27Z

system/lib/libcxxabi/src/cxa_guard_impl.h

+// (non-atomic read-then-write allows double initialization).
+// InitByteFutex uses atomic CAS for correct single initialization.
+// Wait/wake are no-ops — losers spin in the CAS retry loop.
+// Cannot use memory.atomic.wait32 (traps on the main browser thread).


This seems like something we should be using emscripten_futex_wait and emscripten_futex_wake for, although I don't think those are currently available in wasm workers.

They probably should be.

It looks like wasm workers does define emscripten_atomic_wait_u32 and emscripten_atomic_wait_u64, but they are just wrappers around the __builtin_wasm_memory_atomic_xx functions, so they cannot be used on the main thread.

@juj @cwoffenden WDYT, should we make the higher level (and safe-to-call) emscripten_futex_wait API available in wasm workers and use it here?

cwoffenden · 2026-02-19

I agree that a locking solution would be useful, especially one that works with the main thread and various workers. I have watched the current state of locks bite developers here who are new to Emscripten.

It's not something I have time to look at right now (or for the next few months).

AndreyRepko mentioned this pull request Feb 17, 2026

__cxa_guard_acquire busy-spins in WASM Workers leads to dead-lock #26277

Open

AndreyRepko wants to merge 1 commit into emscripten-core:main from AndreyRepko:fix/wasm-workers-cxa-guard-futex

cwoffenden Feb 19, 2026
