
CNTRLPLANE-3115: Add OCP 4.15/4.16 to envtest integration tests #8111

Draft
JoelSpeed wants to merge 10 commits into openshift:main from JoelSpeed:envtest-4.15-4.16

@JoelSpeed

What this PR does / why we need it:

We understand that there are management clusters running the HyperShift Operator on OpenShift versions as old as 4.15.
We must add integration testing going all the way back to 4.15 to catch any potentially breaking API changes that we might make.

Which issue(s) this PR fixes:

Fixes

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

clebs and others added 10 commits March 25, 2026 15:18
- Add an envtest setup goal on the Makefile
- Add a new unit test using envtest to cover the same as the create
  cluster e2e test.

Signed-off-by: Borja Clemente <bclement@redhat.com>
Introducing envtest requires adding new dependencies, which are being
vendored in a separate commit to ease review.

Signed-off-by: Borja Clemente <bclement@redhat.com>
Replace the Go-based envtest test with a YAML-driven test framework
following the openshift/api tests pattern. The framework:

- Loads test suites from tests/<crdname>/ directories matching o/api layout
- Resolves CRDs via ../../zz_generated.crd-manifests/ relative paths
- Filters by featureGates using payload-manifests/featuregates/
- Per suite, installs/uninstalls the CRD under test
- Supports expectedError, expectedStatusError for validation tests
- Supports initialCRDPatches for ratcheting validation via yaml-patch
- Uses //go:build envtest tag so tests are excluded from go test ./...

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
We know that we currently have 4.15 clusters running HyperShift
Operator in some cases. We must ensure cross version testing back
to our earliest supported version.
@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label Mar 30, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label Mar 30, 2026
@openshift-ci

openshift-ci bot commented Mar 30, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci-robot

openshift-ci-robot commented Mar 30, 2026

@JoelSpeed: This pull request references CNTRLPLANE-3115 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai bot commented Mar 30, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: ebe20336-0f88-46a0-9ab2-f0cf097d26f8


@openshift-ci openshift-ci bot added the do-not-merge/needs-area, area/ci-tooling, area/cli, area/platform/aws, area/platform/azure, and area/platform/gcp labels and removed the do-not-merge/needs-area label Mar 30, 2026
@openshift-ci

openshift-ci bot commented Mar 30, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: JoelSpeed
Once this PR has been reviewed and has the lgtm label, please assign jparrill for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the area/testing label Mar 30, 2026
@openshift-ci

openshift-ci bot commented Apr 17, 2026

@JoelSpeed: all tests passed!


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@hypershift-jira-solve-ci

I now have complete evidence for both failures. Here is the report:

Test Failure Analysis Complete

Job Information

  • Prow Job: PR #8111, CNTRLPLANE-3115: Add OCP 4.15/4.16 to envtest integration tests
  • Build ID (Verify): GitHub Actions run 23750527023 / job 69190977194
  • Build ID (Unit Tests): GitHub Actions run 23750526985 / job 69190976829
  • Branch: envtest-4.15-4.16
  • Repository: openshift/hypershift

Test Failure Analysis

Error

Unit Tests (18 failures):
  unable to create CRD "hostedclusters.hypershift.openshift.io": compilation failed:
  ERROR: <input>:1:11: unsupported syntax '.?'
  | (size(self.?claim.orValue("")) > 0) ? !has(self.expression) : true
  |           ^
  ERROR: <input>:1:5: unsupported syntax '.?'
  | self.?discoveryURL.orValue("").size() > 0 ? ...
  |     ^

Verify (4 files stale):
  featureGate-Hypershift-Default.yaml: needs update
  featureGate-Hypershift-TechPreviewNoUpgrade.yaml: needs update
  featureGate-SelfManagedHA-Default.yaml: needs update
  featureGate-SelfManagedHA-TechPreviewNoUpgrade.yaml: needs update

Summary

Both failures stem from the same PR adding OCP 4.15 (K8s 1.28) and OCP 4.16 (K8s 1.29) to the envtest matrix. The Unit Tests job fails because K8s 1.28's kube-apiserver does not support the CEL optional chaining syntax (.?) introduced in Kubernetes 1.29, and the HostedCluster CRD (sourced from vendored openshift/api) uses this syntax in its x-kubernetes-validations rules. All 18 test failures occur during the first iteration (K8s 1.28.15) in the BeforeEach CRD installation step, timing out after 30s on repeated failed attempts. The Verify job fails because 4 featureGate YAML manifests were not regenerated after the PR's changes — running make generate would update them, but the PR omitted this step.

Root Cause

Unit Tests — CEL version incompatibility with K8s 1.28:

The PR added 1.28.15 (OCP 4.15) to ENVTEST_OCP_K8S_VERSIONS in the Makefile (line 343). The make test target calls test-envtest-api-all, which iterates over each K8s version, downloads the corresponding kube-apiserver binary via setup-envtest, and runs the envtest suite against it.

The HostedCluster CRD (hostedclusters.hypershift.openshift.io) contains CEL validation rules from the vendored openshift/api that use the optional chaining operator (self.?field), specifically:

  1. spec.configuration.authentication.oidcProviders[].claimMappings.groups — rule: (size(self.?claim.orValue("")) > 0) ? !has(self.expression) : true
  2. spec.configuration.authentication.oidcProviders[].issuer — rule: self.?discoveryURL.orValue("").size() > 0 ? ...

The .? (optional field access) syntax is part of CEL's optional types library, which was added to the Kubernetes CRD validation engine in Kubernetes 1.29 (via KEP-2876). K8s 1.28's kube-apiserver uses an older CEL runtime that does not recognize this syntax, causing immediate compilation failed: unsupported syntax '.?' errors when attempting to install the CRD.
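
To make the incompatibility concrete, here is a minimal Go sketch that flags CEL rules containing the `.?` operator, i.e. rules that the report says a K8s 1.28 API server would reject. The function name `usesOptionalSyntax` is hypothetical, and the check is a plain substring match rather than a real CEL parse, so treat it as an approximation only.

```go
package main

import (
	"fmt"
	"strings"
)

// usesOptionalSyntax reports whether a CEL rule contains the optional
// field-selection operator ".?". This is a naive substring check (an
// assumption of this sketch): it does not parse CEL, so a ".?" inside a
// string literal would also match.
func usesOptionalSyntax(rule string) bool {
	return strings.Contains(rule, ".?")
}

func main() {
	// The first two rules are copied from the error output above; the
	// second is truncated there, so only its visible prefix is used.
	rules := []string{
		`(size(self.?claim.orValue("")) > 0) ? !has(self.expression) : true`,
		`self.?discoveryURL.orValue("").size() > 0`,
		`has(self.expression) ? true : false`, // hypothetical rule without ".?"
	}
	for _, r := range rules {
		fmt.Printf("%-5t %s\n", usesOptionalSyntax(r), r)
	}
}
```

A check like this could in principle run over the `x-kubernetes-validations` rules of a CRD before choosing which API server versions to test against, though nothing in the PR indicates that such a check exists today.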

Since test-envtest-ocp exits on the first failure (|| exit 1), it never progresses past K8s 1.28 — all 18 failures are from this single version.

Verify — Stale featureGate manifests:

The Verify job runs make generate followed by git diff --exit-code. The 4 featureGate YAML files (featureGate-Hypershift-Default.yaml, featureGate-Hypershift-TechPreviewNoUpgrade.yaml, featureGate-SelfManagedHA-Default.yaml, featureGate-SelfManagedHA-TechPreviewNoUpgrade.yaml) were modified by the PR but are now stale relative to what make generate produces. This is likely because upstream openshift/api introduced or changed feature gate definitions (e.g., HyperShiftOnlyDynamicResourceAllocation vs DynamicResourceAllocation) and the PR did not run make generate after pulling in the latest changes.

Recommendations
  1. Remove K8s 1.28 (OCP 4.15) from the envtest matrix — The HostedCluster CRD has an inherent dependency on CEL optional types (.? syntax from vendored openshift/api) that cannot be installed on K8s 1.28. Since these CEL rules come from upstream openshift/api and are baked into the CRD schema, they cannot be conditionally excluded. Change ENVTEST_OCP_K8S_VERSIONS in the Makefile to start at 1.29.7 instead of 1.28.15.

  2. Alternatively, if K8s 1.28 compatibility is required, the CRD would need version-specific schema variants that strip .? CEL rules for older K8s versions. This is a significant architectural change and may not be practical since the rules originate from openshift/api.

  3. Run make generate and commit the results — The 4 stale featureGate YAML files need to be regenerated. Run make generate and commit the updated files to fix the Verify job.

  4. Validate K8s 1.29 (OCP 4.16) separately — K8s 1.29 introduced .? support, so it should work. However, since the test suite exits on first failure, 1.29+ was never tested in this run. After fixing 1.28, verify 1.29 passes.
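
The first recommendation amounts to dropping matrix entries below K8s 1.29. A minimal Go sketch of that filter follows; `atLeastMinor` is a hypothetical helper, the version list contains only the entries visible in the report (the original list is truncated), and the sketch assumes every entry has major version 1.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// atLeastMinor keeps versions of the form "1.<minor>.<patch>" whose minor is
// >= minMinor. Entries that do not parse are dropped. The major version is
// not inspected (assumed to be 1 for every entry).
func atLeastMinor(versions []string, minMinor int) []string {
	var kept []string
	for _, v := range versions {
		parts := strings.Split(v, ".")
		if len(parts) < 2 {
			continue
		}
		minor, err := strconv.Atoi(parts[1])
		if err != nil || minor < minMinor {
			continue
		}
		kept = append(kept, v)
	}
	return kept
}

func main() {
	matrix := strings.Fields("1.28.15 1.29.7 1.30.3 1.31.2")
	// Prints: 1.29.7 1.30.3 1.31.2
	fmt.Println(strings.Join(atLeastMinor(matrix, 29), " "))
}
```

In the PR itself the equivalent change would simply be editing the `ENVTEST_OCP_K8S_VERSIONS` value in the Makefile; the code above only spells out the selection rule.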

Evidence

| Evidence | Detail |
| --- | --- |
| Unit Tests: failed step | Run make test: 18 of 262 specs failed, 157 skipped |
| Unit Tests: first K8s version tested | === Running envtest for OCP (K8s 1.28.15) ===; never reached 1.29+ |
| Unit Tests: CEL error (rule 1) | self.?claim.orValue("") in oidcProviders[].claimMappings.groups.x-kubernetes-validations[0]: unsupported syntax '.?' |
| Unit Tests: CEL error (rule 2) | self.?discoveryURL.orValue("") in oidcProviders[].issuer.x-kubernetes-validations[0]: unsupported syntax '.?' |
| Unit Tests: CEL origin | Vendored from openshift/api, config/v1/types_authentication.go |
| Unit Tests: failure location | test/envtest/generator.go:189 (BeforeEach CRD install) and :241 (CRD install test) |
| Unit Tests: affected feature sets | CustomNoUpgrade (8 failures), TechPreviewNoUpgrade (9 failures), Default (1 failure) |
| Unit Tests: timeout behavior | Each CRD install retried for 30s before failing (30.189s, 30.191s, etc.) |
| Verify: failed step | Run git update-index --refresh: git diff --exit-code detected uncommitted changes |
| Verify: stale files | 4 featureGate YAMLs under cmd/install/assets/hypershift-operator/payload-manifests/featuregates/ |
| PR change | ENVTEST_OCP_K8S_VERSIONS changed from 1.30.3 1.31.2 ... to 1.28.15 1.29.7 1.30.3 1.31.2 ... |
| K8s CEL optional types | Added in K8s 1.29 via KEP-2876; K8s 1.28 does not support .? syntax |
