Skip to content

fix: treat container exit code 0 as success in --wait mode#13661

Open
glours wants to merge 1 commit intomainfrom
wait-exit-init-containers
Open

fix: treat container exit code 0 as success in --wait mode#13661
glours wants to merge 1 commit intomainfrom
wait-exit-init-containers

Conversation

@glours
Copy link
Contributor

@glours glours commented Mar 24, 2026

What I did
When using , containers that exit with code 0 (such as init or oneshot containers) were incorrectly treated as failures, causing the command to return exit code 1.

Introduce a typed containerExitError in isServiceHealthy and handle exit code 0 in the RunningOrHealthy wait path.
Services with restart policies that restart on clean exit (always, unless-stopped) are excluded via restartsOnExit0 to avoid false positives.

Related issue
Fixes #10596

Crafted with Claude Code

(not mandatory) A picture of a cute animal, if possible in relation to what you did
image

  When using , containers that exit with code 0 (such
  as init or oneshot containers) were incorrectly treated as failures,
  causing the command to return exit code 1.

  Introduce a typed containerExitError in isServiceHealthy and handle
  exit code 0 in the RunningOrHealthy wait path. Services with restart
  policies that restart on clean exit (always, unless-stopped) are
  excluded via restartsOnExit0 to avoid false positives.

  Fixes #10596

Signed-off-by: Guillaume Lours <glours@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 24, 2026 15:01
@glours glours requested a review from a team as a code owner March 24, 2026 15:01
@glours glours requested a review from ndeloof March 24, 2026 15:01
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes docker compose up --wait incorrectly returning exit code 1 when a (non-restarting) init/oneshot container exits successfully (exit code 0), by teaching the wait logic to treat clean exits as success and by introducing a typed error to reliably detect container exit status.

Changes:

  • Introduces containerExitError from isServiceHealthy to carry the container exit code without relying on string parsing.
  • Updates the running_or_healthy dependency wait path to treat exit code 0 as a successful completion when the service won’t be restarted on clean exit.
  • Adds unit tests for restart-policy detection and for the new typed exit error behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
pkg/compose/convergence.go Adds typed exit error, adds restart-on-exit0 detection, and adjusts --wait dependency handling for clean exits.
pkg/compose/convergence_test.go Extends isServiceHealthy tests to assert typed exit errors and exit code propagation.
pkg/compose/start_test.go Adds tests for restartsOnExit0 across service and deploy restart policy configurations.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +573 to +580
switch service.Restart {
case types.RestartPolicyAlways, types.RestartPolicyUnlessStopped:
return true
}
if service.Deploy != nil && service.Deploy.RestartPolicy != nil {
switch service.Deploy.RestartPolicy.Condition {
case "any", "always", "unless-stopped":
return true
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restartsOnExit0 compares service.Restart directly to types.RestartPolicyAlways/UnlessStopped, but restart strings can legally include a suffix (e.g. "on-failure:3") and this codebase already parses restart policies via strings.Cut + mapRestartPolicyCondition (see getRestartPolicy in pkg/compose/create.go). As written, values like "always:1" or "any" would be misclassified, which can cause --wait to treat an exit(0) as success even though the service would be restarted. Consider parsing service.Restart the same way as getRestartPolicy and basing the decision on the mapped engine mode (Always/UnlessStopped).

Copilot uses AI. Check for mistakes.
Comment on lines +490 to +491
s.events.On(containerEvents(waitingFor, exited)...)
return nil
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the exit-code-0 success path, this treats the whole dependency as satisfied and emits "exited" events for all containers in waitingFor. If the dependency service is scaled >1 and only one replica exits(0) while others are still starting/unhealthy, waitDependencies would return success prematurely. Consider limiting this shortcut to single-container services, or re-checking every container in waitingFor and only succeeding when they have all cleanly exited(0) (and emitting events only for the containers that actually exited).

Suggested change
s.events.On(containerEvents(waitingFor, exited)...)
return nil
// Only treat this as a satisfied dependency when exactly one
// container is being waited on. For multi-container (scaled)
// services, we must not mark all containers as exited or
// consider the dependency satisfied based on a single replica.
if len(waitingFor) == 1 {
s.events.On(containerEvents(waitingFor, exited)...)
return nil
}
// For multiple containers, fall through to the standard
// required/optional handling below.

Copilot uses AI. Check for mistakes.
Comment on lines +485 to +492
// A container that exited with code 0 can be treated as
// successfully completed (init/oneshot containers) when the
// service won't be restarted by Docker after a clean exit.
var exitErr *containerExitError
if errors.As(err, &exitErr) && exitErr.exitCode == 0 && !restartsOnExit0(project, dep) {
s.events.On(containerEvents(waitingFor, exited)...)
return nil
}
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new "exit code 0 treated as success" behavior in the ServiceConditionRunningOrHealthy wait path isn’t covered by tests. There’s currently only minimal coverage for waitDependencies (scale=0 and service_started); adding a unit test that simulates ContainerInspect returning Status=exited/ExitCode=0 and asserts waitDependencies returns nil (and does not fail the overall wait) would help prevent regressions.

Copilot uses AI. Check for mistakes.
@ndeloof
Copy link
Contributor

ndeloof commented Mar 24, 2026

note to myself: need to check we get this same fix in #13641

@ndeloof
Copy link
Contributor

ndeloof commented Mar 24, 2026

I'm not convinced this is the right approach. IMHO we miss an explicit definition for init containers in the compose spec, which we could rely on to distinguish vs long-running services, and act accordingly. If we had such a service-level attribute declared, we could automatically select condition: service_completed_successfully as default depends_on condition

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] compose up --wait exits 1 on init containers successfully completing

3 participants