[OSDOCS#20426]: Added files for zstream release notes 4.22.3 #114333

Open
bjahagir-OpenShift wants to merge 1 commit into openshift:enterprise-4.22 from bjahagir-OpenShift:bjahagir-4.22.3

Conversation

@bjahagir-OpenShift (Contributor) commented Jun 30, 2026

@openshift-ci Bot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) on Jun 30, 2026
@ocpdocs-previewbot


🤖 Tue Jun 30 08:19:50 - Prow CI generated the docs preview:

https://114333--ocpdocs-pr.netlify.app/openshift-enterprise/latest/release_notes/ocp-4-22-release-notes.html

[id="zstream-4-22-3-fixed-issues_{context}"]
== Fixed issues

* Before this update, during parallel deployments of Single Node OpenShift (SNO), specifically with hypervisor-based SNOs, some systems would hang in the middle of the deployment due to a race condition involving the BareMetalHost (BMH) custom resource. The SNO was never powered on by metal3 after virtual media was attached. As a consequence, parallel deployments at scale (10+ nodes) had approximately 80% success rate, with the remaining nodes requiring manual intervention to patch the `online` field to `true` for the impacted BMH custom resources. With this release, the race condition in the BMH power-on process has been resolved. As a result, parallel SNO deployments successfully power on all nodes without manual intervention, even at scale. (link:https://issues.redhat.com/browse/OCPBUGS-73622[OCPBUGS-73622])

🤖 [error] Vale.Avoid: Avoid using 'Single Node OpenShift'.

🤖 [error] Vale.Avoid: Avoid using 'SNO'.
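
The manual workaround described in the OCPBUGS-73622 note (patching the `online` field to `true` on the impacted BMH custom resources) can be sketched as follows. This is an illustration only, not the fix shipped in the release. It assumes that the field lives at `spec.online` in the BareMetalHost resource and that an impacted host is one whose `spec.online` is not `true`; the host names and namespaces are hypothetical sample data.

```python
import json


def online_patches(hosts):
    """Return (name, namespace, merge_patch) for every host whose spec.online is not true."""
    patches = []
    for host in hosts:
        if host.get("spec", {}).get("online") is True:
            continue  # already requested to be powered on, nothing to patch
        meta = host["metadata"]
        merge_patch = json.dumps({"spec": {"online": True}})
        patches.append((meta["name"], meta["namespace"], merge_patch))
    return patches


# Hypothetical BareMetalHost objects, reduced to the fields used here.
hosts = [
    {"metadata": {"name": "host-01", "namespace": "cluster-01"}, "spec": {"online": True}},
    {"metadata": {"name": "host-02", "namespace": "cluster-02"}, "spec": {"online": False}},
]

# Print one patch command per impacted host instead of running it.
for name, namespace, merge_patch in online_patches(hosts):
    print(f"oc patch bmh {name} -n {namespace} --type merge -p '{merge_patch}'")
```

With the fix in place, this step should no longer be needed; the sketch only shows what "manual intervention" meant in practice.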


* Before this update, the `kube-apiserver-check-endpoints` container generated a TLS certificate for the `check-endpoint` service on port `17697` with a validity of only 1 second. As a consequence, the certificate expired almost immediately after generation, which differed from previous OpenShift Container Platform versions where the certificate was valid for 1 month. With this release, the `kube-apiserver-check-endpoints` container generates certificates with an appropriate validity period. As a result, the `check-endpoint` service certificate remains valid for the expected duration, consistent with previous releases. (link:https://issues.redhat.com/browse/OCPBUGS-84536[OCPBUGS-84536])


🤖 [error] OpenShiftAsciiDoc.SuggestAttribute: Use the AsciiDoc attribute '{product-title}' rather than the plain text product term 'OpenShift Container Platform', unless your use case is an exception.
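
For the OCPBUGS-84536 note, a quick way to see the difference between a 1-second and a 1-month certificate is to compute the validity window from its `notBefore` and `notAfter` dates. The sketch below assumes the dates are in the text form that `openssl x509 -noout -dates` prints; the sample timestamps are made up, and treating 30 days as "1 month" is an assumption for illustration.

```python
import ssl


def validity_seconds(not_before, not_after):
    """Return the length of the certificate validity window in seconds."""
    return ssl.cert_time_to_seconds(not_after) - ssl.cert_time_to_seconds(not_before)


# Hypothetical timestamps: the first pair mimics the bug, the second a 30-day certificate.
broken = validity_seconds("Jun 30 08:00:00 2026 GMT", "Jun 30 08:00:01 2026 GMT")
expected = validity_seconds("Jun 30 08:00:00 2026 GMT", "Jul 30 08:00:00 2026 GMT")

print(f"broken certificate:   {broken} s")
print(f"expected certificate: {expected} s ({expected // 86400} days)")
```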


* Before this update, the oslat latency test hardcoded the runner pod memory to 1 GB regardless of the `LATENCY_TEST_CPUS` value. As a consequence, when running the CNF latency test with high CPU counts such as `LATENCY_TEST_CPUS=126`, the oslat pod was OOMKilled because the fixed 1 GB memory limit was insufficient, blocking hardware platform evaluation. With this release, the oslat test runner pod memory is configurable or appropriately scaled based on the `LATENCY_TEST_CPUS` setting. As a result, the documented CNF latency test flow completes successfully with higher CPU counts without OOMKilled failures. (link:https://issues.redhat.com/browse/OCPBUGS-86071[OCPBUGS-86071])


🤖 [error] RedHat.TermsErrors: Use 'hard-coded' rather than 'hardcoded'. For more information, see RedHat.TermsErrors.
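
The OCPBUGS-86071 note says the runner pod memory is now "configurable or appropriately scaled" but does not give the formula. The sketch below shows one possible scaling rule, purely as an assumption: keep the old 1 GB value as a floor and add a per-CPU allowance. The 32 MiB per-CPU figure and the function name are hypothetical and are not taken from the fix.

```python
BASE_MIB = 1024   # the fixed 1 GB limit described in the release note
PER_CPU_MIB = 32  # hypothetical per-CPU allowance, not taken from the actual fix


def oslat_memory_mib(cpus, base_mib=BASE_MIB, per_cpu_mib=PER_CPU_MIB):
    """Return a memory limit that grows with the CPU count but never drops below the base."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return max(base_mib, cpus * per_cpu_mib)


# Compare a small run with the LATENCY_TEST_CPUS=126 case from the note.
for cpus in (4, 126):
    print(f"LATENCY_TEST_CPUS={cpus} -> {oslat_memory_mib(cpus)}Mi")
```

Under these assumed constants, a low CPU count keeps the original limit, while the 126-CPU case gets a larger one instead of being OOMKilled at the fixed value.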


* Before this update, when the `etcd-endpoints` configmap contained only the IP addresses of failed or unreachable etcd members, the etcd-operator entered a permanent deadlock. The EtcdEndpointsController, which updates the configmap, required a working etcd connection to list members, but the etcd client pool read endpoints exclusively from the stale configmap. This circular dependency prevented all operator controllers from functioning. As a consequence, the operator retried indefinitely against dead endpoints, logging `context deadline exceeded` errors continuously, and required manual intervention to patch the configmap with healthy member IPs. With this release, the operator detects when all configmap-derived endpoints are unreachable and falls back to node-based endpoint discovery to re-establish connectivity with healthy etcd members. As a result, the EtcdEndpointsController automatically updates the configmap with correct IPs and recovery proceeds without manual intervention. (link:https://issues.redhat.com/browse/OCPBUGS-88490[OCPBUGS-88490])


🤖 [error] Vale.Terms: Use 'Operators?' instead of 'operator'.
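
The fallback described in the OCPBUGS-88490 note can be sketched as a small selection function: use the configmap endpoints while at least one is reachable, and switch to node-derived endpoints only when all of them are dead. This is a simplified model, not the etcd-operator code; the function name, the injected `is_reachable` probe, and the IP addresses are all hypothetical.

```python
def select_endpoints(configmap_endpoints, node_endpoints, is_reachable):
    """Pick etcd endpoints, falling back to node discovery when the configmap is fully stale.

    Returns (endpoints, source), where source is "configmap" or "nodes".
    """
    healthy = [e for e in configmap_endpoints if is_reachable(e)]
    if healthy:
        return healthy, "configmap"
    # Every configmap-derived endpoint is unreachable: break the circular
    # dependency by discovering endpoints from the nodes instead.
    fallback = [e for e in node_endpoints if is_reachable(e)]
    return fallback, "nodes"


# Hypothetical cluster state: the configmap lists only failed members.
reachable = {"10.0.0.4", "10.0.0.5"}
stale_configmap = ["10.0.0.1", "10.0.0.2"]
nodes = ["10.0.0.1", "10.0.0.4", "10.0.0.5"]

endpoints, source = select_endpoints(stale_configmap, nodes, lambda e: e in reachable)
print(f"using {endpoints} from {source}")
```

Injecting the reachability probe keeps the sketch testable without a live etcd cluster; a real implementation would also need timeouts and would then rewrite the configmap with the recovered member IPs.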



[id="zstream-4-22-3-updating_{context}"]
== Updating

To update an {product-title} 4.22 cluster to this latest release, see xref:../updating/updating_a_cluster/updating-cluster-cli.adoc#updating-cluster-cli[Updating a cluster using the CLI].


🤖 [error] OpenShiftAsciiDoc.NoXrefInModules: Do not include xrefs in modules, only assemblies (exception: release notes modules).

@openshift-ci

openshift-ci Bot commented Jun 30, 2026


@bjahagir-OpenShift: all tests passed!

Full PR test history. Your PR dashboard.

