[chore] Fix race in the launcher #21835

Open
cedric-cordenier wants to merge 1 commit into develop from fix-race-launcher

@cedric-cordenier cedric-cordenier commented Apr 2, 2026

Flagged by @jmank88: there's a potential race in the launcher where we append to subServices and read from it at the same time.

@cedric-cordenier cedric-cordenier requested review from a team as code owners April 2, 2026 16:18
Copilot AI review requested due to automatic review settings April 2, 2026 16:18

github-actions bot commented Apr 2, 2026

👋 cedric-cordenier, thanks for creating this pull request!

To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team.

Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks!

@cedric-cordenier cedric-cordenier requested a review from jmank88 April 2, 2026 16:21

github-actions bot commented Apr 2, 2026

I see you updated files related to core. Please run make gocs in the root directory to add a changeset, and include at least one of the following tags in the changeset text:

  • #added For any new functionality added.
  • #breaking_change For any functionality that requires manual action for the node to boot.
  • #bugfix For bug fixes.
  • #changed For any change to the existing functionality.
  • #db_update For any feature that introduces updates to database schema.
  • #deprecation_notice For any upcoming deprecation functionality.
  • #internal For changesets that need to be excluded from the final changelog.
  • #nops For any feature that is NOP facing and needs to be in the official Release Notes for the release.
  • #removed For any functionality/config that is removed.
  • #updated For any functionality that is updated.
  • #wip For any change that is not ready yet and external communication about it should be held off till it is feature complete.


github-actions bot commented Apr 2, 2026

✅ No conflicts with other open PRs targeting develop


Copilot AI left a comment

Pull request overview

Risk Rating: MEDIUM (concurrency change in core capabilities launcher shutdown/service lifecycle)

This PR addresses a data race in the capabilities launcher by synchronizing access to the subServices slice, which is used to track started sub-services for later shutdown.

Changes:

  • Add a mutex (subServicesMux) to guard concurrent reads/writes to launcher.subServices.
  • Route all subServices appends through a new helper (appendSubService) that takes the write lock.
  • Add locking in Close() to prevent concurrent modification while iterating sub-services.
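The first two changes above can be illustrated with a minimal, self-contained sketch. It is not the launcher code: the registry type, the closer interface, and noopService are hypothetical stand-ins, and only the shape of the append helper mirrors what the PR describes.

```go
package main

import (
	"fmt"
	"sync"
)

// closer is a hypothetical stand-in for the sub-service type the launcher tracks.
type closer interface {
	Close() error
}

// registry is a hypothetical, stripped-down stand-in for the launcher.
type registry struct {
	mu          sync.RWMutex
	subServices []closer
}

// appendSubService takes the write lock, so concurrent registrations cannot
// race with each other or with readers of the slice.
func (r *registry) appendSubService(c closer) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.subServices = append(r.subServices, c)
}

// count takes the read lock; several readers may hold it at once.
func (r *registry) count() int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return len(r.subServices)
}

// noopService is a placeholder sub-service used only for this example.
type noopService struct{}

func (noopService) Close() error { return nil }

func main() {
	r := &registry{}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			r.appendSubService(noopService{})
		}()
	}
	wg.Wait()
	fmt.Println(r.count()) // prints 100
}
```

Without the lock in appendSubService, the concurrent appends would be a data race and some registrations could be lost.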

Comment on lines +242 to 243
defer w.subServicesMux.RUnlock()
for _, s := range w.subServices {

Copilot AI Apr 2, 2026

Close() holds subServicesMux.RLock() while calling s.Close() on each sub-service. Calling into external code while holding a lock can lead to lock contention and potential deadlocks; it also blocks concurrent appendSubService calls for the duration of every sub-service close. Consider copying w.subServices under the lock, releasing the lock, and then iterating/closing the snapshot.

Suggested change
- defer w.subServicesMux.RUnlock()
- for _, s := range w.subServices {
+ subServices := make([]services.ServiceCtx, 0, len(w.subServices))
+ subServices = append(subServices, w.subServices...)
+ w.subServicesMux.RUnlock()
+ for _, s := range subServices {
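As a rough illustration of why the snapshot matters, the sketch below uses hypothetical stand-in types (registry, closer, reentrant, noop), not the launcher's real ones, and assumes Go 1.20+ for errors.Join. The reentrant sub-service registers another sub-service from inside its own Close, which would deadlock if Close still held the read lock at that point.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// closer is a hypothetical stand-in for the launcher's sub-service type.
type closer interface{ Close() error }

// registry is a hypothetical stand-in for the launcher.
type registry struct {
	mu          sync.RWMutex
	subServices []closer
}

func (r *registry) add(c closer) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.subServices = append(r.subServices, c)
}

func (r *registry) len() int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return len(r.subServices)
}

// Close copies the slice under the read lock, releases the lock, and only
// then calls into the sub-services.
func (r *registry) Close() error {
	r.mu.RLock()
	snapshot := make([]closer, len(r.subServices))
	copy(snapshot, r.subServices)
	r.mu.RUnlock()

	var errs error
	for _, s := range snapshot {
		errs = errors.Join(errs, s.Close())
	}
	return errs
}

type noop struct{}

func (noop) Close() error { return nil }

// reentrant registers a new sub-service while it is being closed. If Close
// held the read lock here, the write lock in add would never be granted.
type reentrant struct{ r *registry }

func (s reentrant) Close() error {
	s.r.add(noop{})
	return nil
}

func main() {
	r := &registry{}
	r.add(reentrant{r: r})
	err := r.Close()
	fmt.Println(err, r.len()) // prints "<nil> 2"
}
```

One trade-off of the snapshot: a sub-service registered after the copy is taken is not closed by that Close call, so callers may still want to stop new registrations before shutting down.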

Comment on lines +1017 to +1021
func (w *launcher) appendSubService(svc services.Service) {
	w.subServicesMux.Lock()
	defer w.subServicesMux.Unlock()
	w.subServices = append(w.subServices, svc)
}

Copilot AI Apr 2, 2026

This PR introduces new synchronization around subServices, but there isn't a regression test covering concurrent Close() vs. sub-service registration. Since the repo runs Go race tests, consider adding a unit test that concurrently calls appendSubService/receiver start paths and Close() to ensure this race stays fixed (and to avoid future refactors reintroducing it).

Copilot generated this review using guidance from repository custom instructions.