Skip to content

✨ Concurrent E2E Test Execution#2675

Open
dtfranz wants to merge 1 commit into
operator-framework:mainfrom
dtfranz:concurrent-e2e
Open

✨ Concurrent E2E Test Execution#2675
dtfranz wants to merge 1 commit into
operator-framework:mainfrom
dtfranz:concurrent-e2e

Conversation

@dtfranz
Copy link
Copy Markdown
Contributor

@dtfranz dtfranz commented Apr 27, 2026

Description

Takes advantage of changes made to isolate test runs (#2651) to execute as many tests in parallel as possible. For tests that must be run serially, the @Serial tag has been added to the beginning of relevant feature file(s).

Tests are now executed in two steps: parallel followed by serial. Since this creates two distinct test outputs, the final output has been modified to combine any step failures into one smaller output at the end, while still maintaining live output to stdout. Ideally this could be formatted into a nicer looking markdown output in the test summary that we already output, but I'm saving that for a future PR.

go test execution speedup results (compared to a recent run):

type before after
e2e 6m14s 3m09s (parallel + serial)
experimental-e2e 24m30s 11m39s (parallel + serial)

Reviewer Checklist

  • API Go Documentation
  • Tests: Unit Tests (and E2E Tests, if appropriate)
  • Comprehensive Commit Messages
  • Links to related GitHub Issue(s)

Copilot AI review requested due to automatic review settings April 27, 2026 05:55
@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 27, 2026

Deploy Preview for olmv1 ready!

Name Link
🔨 Latest commit 5874537
🔍 Latest deploy log https://app.netlify.com/projects/olmv1/deploys/6a016731c7915400087d8d28
😎 Deploy Preview https://deploy-preview-2675--olmv1.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
🤖 Make changes Run an agent on this branch

To edit notification comments on pull requests, go to your Netlify project configuration.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the E2E Godog test runner to take advantage of scenario isolation (from #2651) by running most scenarios in parallel while reserving tagged scenarios for a serial pass, and then printing a combined failure summary.

Changes:

  • Split E2E execution into two Godog runs: parallel (excluding @Serial) followed by serial (@Serial only), with aggregated failure output.
  • Tag TLS E2E feature(s) with @Serial so they run only in the serial phase.
  • Reduce test-experimental-e2e timeout from 20m to 10m.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
test/e2e/features_test.go Runs E2E suite in parallel + serial phases and prints an aggregated failure summary.
test/e2e/features/tls.feature Marks TLS feature as @Serial to exclude it from the parallel phase.
Makefile Adjusts experimental E2E timeout setting.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread test/e2e/features_test.go Outdated
Comment thread test/e2e/features_test.go Outdated
Comment thread Makefile Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.08%. Comparing base (6e39708) to head (5874537).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2675      +/-   ##
==========================================
- Coverage   68.10%   68.08%   -0.02%     
==========================================
  Files         145      145              
  Lines       10700    10700              
==========================================
- Hits         7287     7285       -2     
- Misses       2885     2886       +1     
- Partials      528      529       +1     
Flag Coverage Δ
e2e 37.11% <ø> (-0.07%) ⬇️
experimental-e2e 52.50% <ø> (ø)
unit 53.79% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@dtfranz dtfranz marked this pull request as draft April 27, 2026 06:14
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 27, 2026
@dtfranz dtfranz marked this pull request as ready for review April 27, 2026 08:18
Copilot AI review requested due to automatic review settings April 27, 2026 08:18
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 27, 2026
@openshift-ci openshift-ci Bot requested review from joelanford and tmshort April 27, 2026 08:18
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread test/e2e/features_test.go Outdated
Comment thread test/e2e/features_test.go Outdated
Comment thread test/e2e/features_test.go Outdated
Comment on lines +68 to +79
// Create buffers to capture output for final summary
var parallelBuf, serialBuf bytes.Buffer

parallelOpts := cliOpts
if parallelOpts.Concurrency == 1 {
// Override default concurrency value with 100; otherwise use whatever was provided by CLI
parallelOpts.Concurrency = 100
}
parallelOpts.Tags = "~@Serial"
// Write to both specified output (live to stdout, by default) and buffer (for summary)
parallelOpts.Output = io.MultiWriter(parallelOpts.Output, &parallelBuf)
// run tests concurrently
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This duplicates the entire godog console output into in-memory buffers (parallelBuf/serialBuf) via io.MultiWriter. With many scenarios (and especially at high concurrency), this can consume a lot of memory and slow down CI due to extra copying. Consider limiting what’s captured (e.g., only buffering after the "--- Failed steps:" marker, keeping only the last N lines, or writing captured output to a temp file and only reading it back on failure).

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings April 27, 2026 23:36
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread test/e2e/README.md Outdated
go test test/e2e/features_test.go --godog.tags="@WebhookProviderCertManager"
```

Note that setting the tags in this way will disable the automatic test parallelization. If running in parallel with custom tags is desired, set `--godog.concurrency=100` for instance to re-enable. If this is done adding `&& ~@Serial` to the tags as well is highly recommended:
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sentence starting with “If this is done…” is grammatically awkward and a bit hard to parse. Consider rephrasing (e.g., “If you do this, adding && ~@Serial to the tag expression is highly recommended…”) to make the guidance clearer.

Suggested change
Note that setting the tags in this way will disable the automatic test parallelization. If running in parallel with custom tags is desired, set `--godog.concurrency=100` for instance to re-enable. If this is done adding `&& ~@Serial` to the tags as well is highly recommended:
Note that setting the tags in this way will disable the automatic test parallelization. If running in parallel with custom tags is desired, set `--godog.concurrency=100` for instance to re-enable. If you do this, adding `&& ~@Serial` to the tags is also highly recommended:

Copilot uses AI. Check for mistakes.
@tmshort
Copy link
Copy Markdown
Contributor

tmshort commented Apr 28, 2026

There are also tests in proxy.feature tests that we may want to be @Serial.

@dtfranz
Copy link
Copy Markdown
Contributor Author

dtfranz commented Apr 30, 2026

There are also tests in proxy.feature tests that we may want to be @Serial.

Good callout; I haven't seen that fail yet, so it might be fine to leave it, or we could just get ahead of it. I'm fine either way.

@tmshort
Copy link
Copy Markdown
Contributor

tmshort commented Apr 30, 2026

Let's get ahead of it, since the proxy tests may be disruptive.

@dtfranz
Copy link
Copy Markdown
Contributor Author

dtfranz commented May 1, 2026

Let's get ahead of it, since the proxy tests may be disruptive.

Done!

@tmshort
Copy link
Copy Markdown
Contributor

tmshort commented May 1, 2026

/approve

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 1, 2026
@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 1, 2026
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented May 1, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rashmigottipati, tmshort

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pedjak
Copy link
Copy Markdown
Contributor

pedjak commented May 4, 2026

Suggestion: Simplify by moving orchestration to the Makefile

The two-phase orchestration in Go (dual TestSuite.Run() calls, io.MultiWriter buffer capture, extractFailedSteps parsing) adds ~120 lines of complexity that the Makefile can handle for free.

Alternative approach

Keep the @Serial tags on the feature files (that part is great), but replace the Go-level orchestration with two go test invocations in the Makefile:

e2e:
	go test -count=1 -v ./test/e2e/features_test.go -timeout=$(E2E_TIMEOUT) \
	  $(if $(GODOG_ARGS),-args $(GODOG_ARGS),-args --godog.tags="~@Serial" --godog.concurrency=100)
	go test -count=1 -v ./test/e2e/features_test.go -timeout=$(E2E_TIMEOUT) \
	  $(if $(GODOG_ARGS),-args $(GODOG_ARGS),-args --godog.tags="@Serial" --godog.concurrency=1)

This would mean:

  • No changes to features_test.go — the existing code stays as-is
  • The shell naturally shows live output from each phase
  • CI already reports which command failed
  • Custom GODOG_ARGS still override everything (same as today)
  • The only Go-side changes are the @Serial tags on ha.feature, proxy.feature, and tls.feature

Safety net: skip @Serial scenarios in BeforeScenario hook

As a guard against someone accidentally running @Serial tests with high concurrency (e.g. manually passing --godog.concurrency=100 without excluding @Serial), we could add a check in CheckFeatureTags:

func CheckFeatureTags(ctx context.Context, sc *godog.Scenario) (context.Context, error) {
	for _, tag := range sc.Tags {
		if tag.Name == "@Serial" && opts.Concurrency > 1 {
			logger.Info(fmt.Sprintf("Skipping scenario %q because it requires serial execution (concurrency=%d)", sc.Name, opts.Concurrency))
			return ctx, godog.ErrSkip
		}
		if enabled, found := featureGates[featuregate.Feature(tag.Name[1:])]; found && !enabled {
			logger.Info(fmt.Sprintf("Skipping scenario %q because feature gate %q is disabled", sc.Name, tag.Name[1:]))
			return ctx, godog.ErrSkip
		}
	}
	return ctx, nil
}

This way @Serial scenarios are never silently run concurrently, regardless of how go test is invoked.

Why not a BeforeScenario hook alone?

The hook only skips serial tests — it doesn't eliminate the need for a second go test invocation to actually run them. The Makefile handles the two-phase orchestration, and the hook acts purely as a safety net.

Trade-off

The only thing lost is the combined failure summary at the end. But each go test invocation already prints its own godog summary, and CI runners show per-step pass/fail natively.

Copilot AI review requested due to automatic review settings May 8, 2026 08:19
@openshift-ci openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label May 8, 2026
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented May 8, 2026

New changes are detected. LGTM label has been removed.

@dtfranz
Copy link
Copy Markdown
Contributor Author

dtfranz commented May 8, 2026

I've restored the go test code in favor of splitting the runs in the makefile - PTAL @pedjak

The test results will be a little more difficult to see but we can probably do something about that in a future PR.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment thread Makefile Outdated
Comment thread test/e2e/README.md Outdated
@dtfranz
Copy link
Copy Markdown
Contributor Author

dtfranz commented May 8, 2026

Been seeing test failures; if they continue I may need to run that test as @Serial, assuming that's the reason

Copilot AI review requested due to automatic review settings May 11, 2026 01:42
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Comment thread test/e2e/README.md Outdated
Comment thread test/e2e/README.md
Comment thread test/e2e/features/tls.feature
Copilot AI review requested due to automatic review settings May 11, 2026 03:01
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment thread Makefile Outdated
Takes advantage of changes made to isolate test runs to execute as many tests in parallel as possible. For tests that must be run serially, the @serial tag has been added to the beginning of relevant feature file(s).

Signed-off-by: Daniel Franz <dfranz@redhat.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@pedjak pedjak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR — splitting e2e execution into parallel and serial phases cuts wall-clock time nearly in half. The @Serial tagging is well-chosen and the README updates are clear. One critical bug and one nit noted inline.

Comment thread Makefile
echo "e2e tests failed: tests timed out"; \
exit 1; \
elif [[ $$parallelExitCode -ne 0 ]] || [[ $$serialExitCode -ne 0 ]]; then \
echo "e2e tests failed: parallel=$$parallelExitCode serial=$$serialExitCode"; \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: variable scoping silently swallows test failures

parallelExitCode and serialExitCode are set inside the bash -c '...' subshell, but read outside it. Variables set in a subshell are not visible to the calling shell.

In bash, [[ "" -ne 0 ]] treats the empty string as 0, so the elif is always false. Additionally, the last statement inside bash -c is a variable assignment (serialExitCode=$?), which always exits 0, so timeout also returns 0 regardless of test results.

Net effect: if either test phase fails, make e2e still exits 0. CI passes today because the tests pass — this only manifests on failure, which is exactly when you need it.

Suggested fix — move the exit-code logic inside the bash -c subshell:

ifeq ($(strip $(GODOG_ARGS)),)
	set +e; \
	timeout -s SIGTERM $(E2E_TIMEOUT) bash -c ' \
		go test -count=1 -v ./test/e2e/features_test.go -args --godog.tags="~@Serial" --godog.concurrency=100; \
		parallelExit=$$?; \
		go test -count=1 -v ./test/e2e/features_test.go -args --godog.tags="@Serial" --godog.concurrency=1; \
		serialExit=$$?; \
		if [[ $$parallelExit -ne 0 ]] || [[ $$serialExit -ne 0 ]]; then \
			echo "e2e tests failed: parallel=$$parallelExit serial=$$serialExit"; \
			exit 1; \
		fi'; \
	exitCode=$$?; \
	if [[ $$exitCode -ne 0 ]]; then \
		exit 1; \
	fi
else

# restores the original configuration during cleanup, so scenarios are independent.

# This feature file is run serially to avoid potential issues modifying the catalogd
# deployment during functional testing.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: trailing whitespace on lines 9 and 11.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants