Skip to content

Fix XPC connection failure when home directory is on external volume#1724

Open
majidfeiz wants to merge 2 commits into
apple:mainfrom
majidfeiz:fix/service-registration-external-volume
Open

Fix XPC connection failure when home directory is on external volume#1724
majidfeiz wants to merge 2 commits into
apple:mainfrom
majidfeiz:fix/service-registration-external-volume

Conversation

@majidfeiz

Copy link
Copy Markdown

Problem

When a user's home directory resides on an external or removable volume
(e.g. /Volumes/ex-data/), running container system start results in an
"XPC connection invalid" error and the daemon never becomes reachable.

Reported in: #1721

Root cause: launchctl bootstrap <domain> <plist-path> silently fails
when the plist file is located on an external or removable volume. macOS
(Gatekeeper / AMFI) does not allow Mach service registration from plists
that reside outside the internal system volume. Because ApplicationRoot
defaults to ~/.local/share/container, any user whose home directory is on
an external volume ends up writing apiserver.plist and plugin plists to
that volume — causing all subsequent XPC connections to fail.

Solution

Introduce ServiceRoot: a new root type whose sole responsibility is to
provide a location for launchd plist files. Its default path is:

<NSTemporaryDirectory()>/com.apple.container/

NSTemporaryDirectory() (i.e. FileManager.default.temporaryDirectory) is
always located on the internal system volume on macOS, regardless of where
the user's home directory lives. This guarantees that launchctl bootstrap
succeeds in all configurations.

ServiceRoot follows the exact same pattern as the existing ApplicationRoot,
InstallRoot, and LogRoot types:

Type Env var Default
ApplicationRoot CONTAINER_APP_ROOT ~/.local/share/container
InstallRoot CONTAINER_INSTALL_ROOT <binary>/../../
LogRoot CONTAINER_LOG_ROOT (macOS log facility)
ServiceRoot CONTAINER_SERVICE_ROOT $TMPDIR/com.apple.container

CONTAINER_SERVICE_ROOT is propagated from SystemStart into the
container-apiserver plist environment, so the daemon's PluginLoader
automatically picks it up for plugin plist registration as well — no
changes are required at any other call site.

Changes

  • Sources/ContainerPlugin/ServiceRoot.swift (new)
    Defines ServiceRoot with environmentName, defaultPath, path, and
    pathname statics, consistent with the other root types.

  • Sources/ContainerCommands/System/SystemStart.swift

    • Adds --service-root CLI option (overrides CONTAINER_SERVICE_ROOT).
    • Propagates CONTAINER_SERVICE_ROOT into the plist environment.
    • Writes apiserver.plist under serviceRoot instead of appRoot.
  • Sources/ContainerPlugin/PluginLoader.swift

    • Separates plistResourceRoot (system volume, for plists) from
      pluginResourceRoot (appRoot, for runtime state).
    • Adds serviceRoot: URL? = nil to init — existing callers are
      unaffected; nil falls back to reading CONTAINER_SERVICE_ROOT.
  • Tests/ContainerPluginTests/RootPathTests.swift
    Adds ServiceRootTests (3 tests): default path is absolute, ends with
    com.apple.container, and is located under the temporary directory.

  • Tests/ContainerPluginTests/PluginLoaderTest.swift
    Updates testRegisterWithLaunchdPlistNotUnderAppRoot to pass an explicit
    serviceRoot distinct from appRoot, directly asserting the separation.

Testing

All existing tests pass:

swift test --filter ContainerPluginTests
# Test run with 48 tests in 9 suites passed

Manual verification path (for reviewers with an external-volume home):

  1. container system start — previously failed with XPC connection invalid.
  2. After this fix, the daemon registers successfully and all XPC calls work.

To override the service root manually:

container system start --service-root /private/tmp/my-container-service
# or via env var:
CONTAINER_SERVICE_ROOT=/private/tmp/my-container-service container system start

launchd cannot register Mach services from plists located on external or
removable volumes. When the user's home directory (and therefore the default
ApplicationRoot) resides on such a volume, `launchctl bootstrap` silently
fails, causing the XPC connection to be reported as invalid.

Introduce ServiceRoot, a new root type whose default path resolves to a
subdirectory of FileManager.temporaryDirectory — which macOS always places
on the internal system volume regardless of where the home directory lives.
ServiceRoot follows the same pattern as ApplicationRoot, InstallRoot, and
LogRoot: it can be overridden via CONTAINER_SERVICE_ROOT or the new
--service-root CLI flag on `container system start`.

- Add Sources/ContainerPlugin/ServiceRoot.swift
- Write apiserver.plist under serviceRoot instead of appRoot in SystemStart
- Propagate CONTAINER_SERVICE_ROOT through the plist environment so the
  daemon's PluginLoader also writes plugin plists to the system volume
- Add serviceRoot parameter to PluginLoader.init (nil = read env var)
- Add ServiceRootTests and update PluginLoaderTest to cover the separation

Fixes apple#1721

@kasc0206 kasc0206 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

审查意见 / Review

This is an excellent contribution. The root cause analysis is thorough, and the solution is clean and well-isolated.

优点 / Strengths

  1. Root cause analysis: Correctly identifies that launchctl bootstrap silently fails on external volumes due to Gatekeeper/AMFI restrictions
  2. Solution design: Introducing ServiceRoot as a new root type follows the existing ApplicationRoot/InstallRoot/LogRoot pattern perfectly — consistency with the codebase is excellent
  3. Minimal changes: Only 5 files changed, with clear separation of concerns
  4. Test coverage: Both unit tests (RootPathTests, PluginLoaderTest) properly validate the behavior
  5. Backward compatibility: serviceRoot: URL? = nil defaults mean existing callers are unaffected
  6. Documentation: The doc comments on ServiceRoot clearly explain why the separation exists

小建议 / Nitpicks

  • Consider adding a fallback or warning when NSTemporaryDirectory() itself returns nil or an unexpected path, though this is extremely rare on macOS
  • The plistDir construction serviceRoot.appending("apiserver") vs the PluginLoader using plugin-state subdirectory is intentional for audit separation — consider noting this in a comment

已验证 / Verified

  • Code compiles with swift build
  • All existing ContainerPluginTests pass ✅
  • No breaking API changes to PluginLoader.init (new parameter is optional with default nil)

Great work!

@kasc0206

Copy link
Copy Markdown

Code Review Summary / 代码审查摘要

Strengths / 优点

  1. Accurate root cause: Correctly identifies that launchctl bootstrap silently fails when the plist resides on external/removable volumes due to macOS Gatekeeper/AMFI restrictions
    准确的根因分析:正确识别出 launchctl bootstrap 在外部卷上静默失败的问题
  2. Elegant design: Introducing ServiceRoot as a new root type follows the existing ApplicationRoot/InstallRoot/LogRoot pattern — consistent with codebase conventions
    优雅的设计:新增 ServiceRoot 类型遵循了已有的 root 类型模式
  3. Minimal footprint: Only 5 files changed, clear separation of concerns
    最小改动:仅修改 5 个文件,职责清晰
  4. Proper test coverage: Unit tests in RootPathTests and PluginLoaderTest validate the behavior
    充分的测试覆盖:单元测试验证了行为正确性
  5. Backward compatible: serviceRoot: URL? = nil default means all existing callers work unchanged
    向后兼容:可选参数默认为 nil,不影响现有调用方

Nitpicks / 小建议

  • Consider adding a fallback or warning when NSTemporaryDirectory() returns nil or an unexpected path (rare on macOS but would cause cryptic failures)
    建议考虑 NSTemporaryDirectory() 返回空值时的降级或警告
  • The directory structure (serviceRoot/apiserver/ for the API server plist vs serviceRoot/plugin-state/ for plugin plists) is intentional — consider adding a brief comment explaining this audit separation
    建议添加注释说明 apiserver/ 和 plugin-state/ 子目录的职责区分

Verification / 验证

  • Code compiles with swift build
  • All existing ContainerPluginTests pass ✅
  • No breaking API changes to PluginLoader.init

Great contribution! / 优秀的贡献!

…comment

- ServiceRoot.defaultPath: convert to a closure-based static let so a
  precondition fires if NSTemporaryDirectory() ever returns an empty path,
  making the safety assumption explicit and catchable at startup rather
  than silently producing a broken path.

- SystemStart: expand the comment around plistDir construction to make the
  intentional appRoot/serviceRoot separation explicit, noting both why the
  split exists (external-volume restriction) and how CONTAINER_SERVICE_ROOT
  propagates to the daemon without additional wiring.
@majidfeiz

majidfeiz commented Jun 15, 2026

Copy link
Copy Markdown
Author

Hi @kasc0206,

Thank you for the thorough review and the kind words — really appreciate you taking the time to go through the changes carefully.

I've addressed both of your suggestions:

1. Defensive check for NSTemporaryDirectory()
Converted ServiceRoot.defaultPath from a plain static let to a closure-based initializer that includes a precondition guard. If NSTemporaryDirectory() ever returns an empty path, the process fails at startup with a clear message rather than silently constructing a broken path.

2. Clearer comment around plistDir construction
Expanded the comment in SystemStart to make the intentional appRoot/serviceRoot separation explicit — noting both why the split exists (external-volume launchd restriction) and how CONTAINER_SERVICE_ROOT propagates to the daemon without requiring changes at other call sites.

Both changes are in the latest commit. Let me know if anything else needs adjustment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants