@abrarsheikh

fixes #59218

@abrarsheikh added the "go" label (add ONLY when ready to merge, run all tests) on Dec 7, 2025

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces several new metrics to improve the observability of Ray Serve. Specifically, it adds a histogram to measure the routing stats propagation delay from replicas to the controller, a gauge to monitor proxy health status, and another gauge to track when a proxy is in a draining state. The changes are implemented across deployment_state.py, proxy.py, and proxy_state.py, and are accompanied by new tests in test_metrics.py and test_metrics_2.py. The implementation is solid and the new metrics are a great addition. I've identified a minor issue in one of the new tests that could lead to flakiness and have suggested a correction.
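
For readers unfamiliar with the three metric types mentioned above, here is a minimal, standard-library-only sketch of how a propagation-delay histogram and two status gauges could be recorded. It is not the code from this PR and does not use Ray's metrics API; the `Histogram` and `Gauge` classes, the bucket boundaries, and the `record_routing_stats_delay` helper are all hypothetical names chosen for illustration.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class Histogram:
    """Tiny stand-in for a metrics histogram (hypothetical, not Ray's class)."""

    boundaries: list  # ascending upper bucket bounds, in milliseconds
    counts: list = field(default_factory=list)

    def __post_init__(self):
        # One bucket per boundary, plus a final overflow bucket.
        self.counts = [0] * (len(self.boundaries) + 1)

    def observe(self, value):
        # bisect_left places a value equal to a boundary in that boundary's bucket.
        self.counts[bisect.bisect_left(self.boundaries, value)] += 1


@dataclass
class Gauge:
    """Tiny stand-in for a gauge: it only remembers the last value set."""

    value: float = 0.0

    def set(self, value):
        self.value = float(value)


def record_routing_stats_delay(hist, replica_sent_at_s, controller_received_at_s):
    """Record the replica-to-controller delay in milliseconds (hypothetical helper)."""
    # Clamp at zero so clock skew between processes cannot yield a negative delay.
    delay_ms = max(0.0, (controller_received_at_s - replica_sent_at_s) * 1000.0)
    hist.observe(delay_ms)
    return delay_ms


# Assumed bucket boundaries; the PR's actual boundaries are not shown here.
delay_hist = Histogram(boundaries=[10, 50, 100, 500, 1000])
proxy_healthy = Gauge()
proxy_draining = Gauge()

# A replica stamped its stats at t=100.0 s; the controller saw them at t=100.075 s.
delay = record_routing_stats_delay(delay_hist, 100.0, 100.075)
proxy_healthy.set(1)   # 1 = healthy, 0 = unhealthy (assumed encoding)
proxy_draining.set(0)  # 1 = draining, 0 = not draining (assumed encoding)
```

In this sketch the gauges use a 0/1 encoding so that a dashboard can alert on a simple threshold; whether the PR uses the same encoding is not stated in the review.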

Signed-off-by: abrar <[email protected]>

Development

Successfully merging this pull request may close these issues.

[Serve] add debugging metrics to ray serve