Skip to content

Conversation

@mbobrovskyi
Copy link
Contributor

@mbobrovskyi mbobrovskyi commented Mar 26, 2025

What type of PR is this?

/kind bug

What this PR does / why we need it:

Add StatefulSet as owner of Workload.

Which issue(s) this PR fixes:

Fixes #4342
Fixes #4805
Fixes #6883
Fixes #7366

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Fixed the following bugs for the StatefulSet integration by ensuring the Workload object
has the ownerReference to the StatefulSet:
1. Kueue doesn't keep the StatefulSet as deactivated
2. Kueue marks the Workload as Finished if all StatefulSet's Pods are deleted
3. changing the "queue-name" label could occasionally result in the StatefulSet getting stuck

@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/bug Categorizes issue or PR as related to a bug. labels Mar 26, 2025
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 26, 2025
@netlify
Copy link

netlify bot commented Mar 26, 2025

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit 367ac58
🔍 Latest deploy log https://app.netlify.com/projects/kubernetes-sigs-kueue/deploys/6931169bda24f3000840cb1b

@mbobrovskyi mbobrovskyi force-pushed the fix/sts-workloads-get-deleted-on-preemption branch from 8cffe93 to d8935ee Compare March 26, 2025 13:22
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 26, 2025
@mbobrovskyi mbobrovskyi force-pushed the fix/sts-workloads-get-deleted-on-preemption branch from d8935ee to 8634d41 Compare March 26, 2025 13:30
@mbobrovskyi mbobrovskyi marked this pull request as ready for review March 26, 2025 13:32
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 26, 2025
@k8s-ci-robot k8s-ci-robot requested a review from kannon92 March 26, 2025 13:32
@mbobrovskyi mbobrovskyi force-pushed the fix/sts-workloads-get-deleted-on-preemption branch 2 times, most recently from 422353b to 28a1799 Compare March 26, 2025 13:43
@mbobrovskyi
Copy link
Contributor Author

/cc @mimowo

@k8s-ci-robot k8s-ci-robot requested a review from mimowo March 26, 2025 13:46
@mbobrovskyi mbobrovskyi force-pushed the fix/sts-workloads-get-deleted-on-preemption branch 2 times, most recently from 7245480 to 4422733 Compare March 26, 2025 14:17
Comment on lines +164 to +170
shouldUpdate = true
err = controllerutil.RemoveOwnerReference(sts, wl, r.client.Scheme())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you describe why we remove the owner reference from the workload on scaling down the StatefulSet?

IIUC this will result in deleting the workload, right? It seems safer to just update the count in the workload object, similarly as we would do for Jobs. For example, when we suspend a Job we don't delete the workload IIRC.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, when we suspend a Job we don't delete the workload IIRC.

It is a very interesting behavior in Job. I tried to suspend the job manually, but it automatically unsuspends itself. The only way to suspend the job is to deactivate the workload.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you describe why we remove the owner reference from the workload on scaling down the StatefulSet?

The problem is that after activation, the replicas can change (replicas=3 → replicas=0 → replicas=5). Our workload is admitted, and we can't change the PodSet count since this field is immutable. I think it's much easier to just delete it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC this will result in deleting the workload, right?

Yeah, you right. It should delete workload.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, for Job when the count is updated, then we update the count in workload.

So, my natural preference is to also update the count in the workload for STS, not sure it is much more complex. I'm not sure myself - considering the options.

The problem is that after activation, the replicas can change (replicas=3 → replicas=0 → replicas=5). Our workload is admitted, and we can't change the PodSet count since this field is immutable. I think it's much easier to just delete it.

I see, but to me it suggests we should also evict the workload for STS in that case, to be consistent with Job.
Can you try to prototype, maybe in a separate PR so that we can compare. I don't think we need to rush with decisions. I would prefer to do it well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the investigation and answers. The PR looks good assuming we take the approach of deleting the workload, but I would prefer to explore keeping and updating the workload, as we do for Jobs.

@mbobrovskyi mbobrovskyi force-pushed the fix/sts-workloads-get-deleted-on-preemption branch 2 times, most recently from 356b381 to 5b20f33 Compare March 26, 2025 15:57
@mimowo
Copy link
Contributor

mimowo commented Mar 27, 2025

/hold
to avoid premature merging, I need more time to decide on the approach.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 27, 2025
// Finished means whether the job is completed/failed or not,
// condition represents the workload finished condition.
func (p *Pod) Finished() (message string, success, finished bool) {
if p.isServing() {
Copy link
Contributor

@mimowo mimowo Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bugfix for an issue on its own. Let me open so that we can fix issues one-by-one rather than multiple in one PR, because then some overlapping issues make it hard to reason about the solution. Solving one-by-one will also allow us to build a good test coverage. Opened: #4805

Copy link
Contributor Author

@mbobrovskyi mbobrovskyi Apr 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can move these changes to a separate PR, but I'm not sure how to test them yet. We still need the the OwnerReference of the StatefulSet to prevent it from being deleted.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 7, 2025
@mbobrovskyi
Copy link
Contributor Author

/remove-lifecycle rotten

@mbobrovskyi mbobrovskyi force-pushed the fix/sts-workloads-get-deleted-on-preemption branch from 9568409 to 367ac58 Compare December 4, 2025 05:05
@mimowo
Copy link
Contributor

mimowo commented Dec 5, 2025

/release-note-edit

Fixed the following bugs for the StatefulSet integration by ensuring the Workload object
has the ownerReference to the StatefulSet:
1. Kueue doesn't keep the StatefulSet as deactivated
2. Kueue marks the Workload as Finished if all StatefulSet's Pods are deleted
3. changing the "queue-name" label could occasionally result in the StatefulSet getting stuck

Copy link
Contributor

@mimowo mimowo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I have revisted this PR. I take it.

cc @sohankunkerkar I know you prepared the alternative, but I think we need to go with one. Since both PRs implement pretty much the same approach it is a great signal the approach makes sense, because both of you converged on the same approach. Thank you folks 👍

/lgtm
/approve
/cherrypick release-0.14
/cherrypick release-0.13

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 5, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 0dfe16eb2f5ecd05a0be0160bf60fd404f68d48d

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mbobrovskyi, mimowo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 5, 2025
@mimowo
Copy link
Contributor

mimowo commented Dec 5, 2025

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 5, 2025
@mimowo
Copy link
Contributor

mimowo commented Dec 5, 2025

@k8s-ci-robot k8s-ci-robot merged commit 98c941f into kubernetes-sigs:main Dec 5, 2025
28 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v0.16 milestone Dec 5, 2025
@mimowo
Copy link
Contributor

mimowo commented Dec 5, 2025

/cherrypick release-0.14
/cherrypick release-0.15

@k8s-infra-cherrypick-robot
Copy link
Contributor

@mimowo: new pull request created: #8102

In response to this:

/cherrypick release-0.14
/cherrypick release-0.15

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-infra-cherrypick-robot
Copy link
Contributor

@mimowo: new pull request created: #8103

In response to this:

/cherrypick release-0.14
/cherrypick release-0.15

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@mimowo
Copy link
Contributor

mimowo commented Dec 5, 2025

/cherrypick release-0.15

@k8s-infra-cherrypick-robot
Copy link
Contributor

@mimowo: new pull request created: #8105

In response to this:

/cherrypick release-0.15

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

5 participants