Member

@rickbrouwer rickbrouwer commented Nov 11, 2025

When a ScaledObject is scaled to zero and a scaler encounters an error, the fallback mechanism was only triggered during HPA metric requests via GetScaledObjectMetrics, not during the polling loop in getScaledObjectState. As a result, the ScaledObject stayed at zero replicas indefinitely: the polling loop kept failing without activating the fallback, so the HPA was never consulted.

Fallback now works in both paths: HPA metric requests and polling-interval checks.

The only question I have is whether it is by design that fallback is not applied at 0 replicas. I can't think of a reason why it would be. Please take this into account when deciding whether the old behavior was a deliberate choice. If it was not, I believe this PR resolves the issue.

Some additional information:

This turned out to be a bigger change than I anticipated. I initially thought I could add a simple fallback to getScalerState, but that caused issues with scalingModifiers.
Therefore, I had to make some further adjustments so that the fallback works properly in every code path.

Checklist

Fixes #7239

@rickbrouwer rickbrouwer requested a review from a team as a code owner November 11, 2025 09:22
@github-actions

Thank you for your contribution! 🙏

Please understand that we will do our best to review your PR and give you feedback as soon as possible, but please bear with us if it takes a little longer than expected.

While you are waiting, make sure to:

  • Add an entry in our changelog in alphabetical order and link related issue
  • Update the documentation, if needed
  • Add unit & e2e tests for your changes
  • GitHub checks are passing
  • Is the DCO check failing? Here is how you can fix DCO issues

Once the initial tests are successful, a KEDA member will ensure that the e2e tests are run. Once the e2e tests have been successfully completed, the PR may be merged at a later date. Please be patient.

Learn more about our contribution guide.

@keda-automation keda-automation requested a review from a team November 11, 2025 09:22

snyk-io bot commented Nov 11, 2025

Snyk checks have passed. No issues have been found so far.

| Scanner | Critical | High | Medium | Low | Total (0) |
| --- | --- | --- | --- | --- | --- |
| Open Source Security | 0 | 0 | 0 | 0 | 0 issues |

Member Author

rickbrouwer commented Nov 11, 2025

/run-e2e internals
Update: You can check the progress here

@rickbrouwer rickbrouwer marked this pull request as draft November 11, 2025 12:32
Member Author

rickbrouwer commented Nov 12, 2025

/run-e2e internals
Update: You can check the progress here

Member Author

rickbrouwer commented Nov 12, 2025

/run-e2e internals
Update: You can check the progress here

Signed-off-by: Rick Brouwer <[email protected]>
@rickbrouwer rickbrouwer marked this pull request as ready for review November 12, 2025 15:11
Member Author

rickbrouwer commented Nov 12, 2025

/run-e2e internals
Update: You can check the progress here

Member Author

rickbrouwer commented Nov 12, 2025

/run-e2e internals
Update: You can check the progress here

Member Author

rickbrouwer commented Nov 18, 2025

/run-e2e internals
Update: You can check the progress here

Signed-off-by: Rick Brouwer <[email protected]>
Member Author

rickbrouwer commented Nov 19, 2025

/run-e2e internals
Update: You can check the progress here

Member Author

rickbrouwer commented Nov 19, 2025

/run-e2e internals
Update: You can check the progress here

Member Author

rickbrouwer commented Nov 23, 2025

/run-e2e internals
Update: You can check the progress here

Contributor

Copilot AI left a comment

Pull request overview

This PR addresses a critical bug where ScaledObjects remained at zero replicas indefinitely when scalers encountered errors, because fallback was only triggered during HPA metric requests but not during the polling loop. The fix ensures fallback works consistently in both paths, enabling proper scaling from zero even when errors occur.

  • Introduced processMetricsWithFallback helper function to consolidate fallback logic across HPA requests and polling interval checks
  • Extended scalerState struct to track fallback activation state and fallback metrics
  • Modified getScalerState to treat fallback-active scalers as active, enabling scale-up from zero replicas
  • Added comprehensive test coverage for scaling from zero with fallback

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pkg/scaling/scale_handler.go | Adds processMetricsWithFallback helper function and integrates fallback logic into both GetScaledObjectMetrics (HPA path) and getScalerState (polling path); moves scalerState struct definition earlier and extends it with fallback fields; ensures fallback-active scalers are considered active to enable scaling from zero |
| tests/internals/fallback/fallback.go | Adds TestFallbackFromZero test case to verify that fallback correctly triggers and scales from 0 to fallback replica count when metrics server is unavailable; integrates new test into main test suite |
| CHANGELOG.md | Documents the fix for applying fallback in polling loop to enable scaling from zero, referencing issue #7239 |

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Rick Brouwer <[email protected]>
Member Author

rickbrouwer commented Dec 3, 2025

/run-e2e internals
Update: You can check the progress here

Member Author

rickbrouwer commented Dec 3, 2025

/run-e2e internals
Update: You can check the progress here

Signed-off-by: Rick Brouwer <[email protected]>
Member Author

rickbrouwer commented Dec 3, 2025

/run-e2e internals
Update: You can check the progress here

Member Author

rickbrouwer commented Dec 3, 2025

/run-e2e internals
Update: You can check the progress here

Member Author

rickbrouwer commented Dec 4, 2025

/run-e2e internals
Update: You can check the progress here

Member Author

rickbrouwer commented Dec 5, 2025

/run-e2e internals
Update: You can check the progress here

Member Author

rickbrouwer commented Dec 6, 2025

/run-e2e internals
Update: You can check the progress here

Member

JorTurFer commented Dec 7, 2025

/run-e2e internals
Update: You can check the progress here

@keda-automation keda-automation requested a review from a team December 8, 2025 11:51
Member Author

rickbrouwer commented Dec 8, 2025

/run-e2e internals
Update: You can check the progress here

Member Author

rickbrouwer commented Dec 9, 2025

/run-e2e internals
Update: You can check the progress here

Development

Successfully merging this pull request may close these issues.

KEDA fallback does not scale from 0 to 1 when metrics cannot be fetched and minReplicaCount is 0

2 participants