Skip to content

Conversation

@abrarsheikh
Copy link
Contributor

@abrarsheikh abrarsheikh commented Dec 7, 2025

fixes #59218

import asyncio
from ray import serve
from typing import List

@serve.deployment(max_ongoing_requests=1000)
class MyDeployment:
    async def __call__(self) -> List[int]:
        await asyncio.sleep(.01)
        return 1

app = MyDeployment.bind()

testing with 100 users

Metric With Change Master Δ (Master – With Change)
Requests 128,197 128,136 –61
Fails 1,273 1,363 +90
Median (ms) 170 170 0
95%ile (ms) 260 260 0
99%ile (ms) 310 320 +10 ms
Average (ms) 178.66 178.99 +0.33 ms
Min (ms) 9 8 –1 ms
Max (ms) 1,171 2,995 +1,824 ms
Average size (bytes) 0.99 0.99 0
Current RPS 538.8 558.2 +19.4
Current Failures/s 0 0 0

@abrarsheikh abrarsheikh added the go add ONLY when ready to merge, run all tests label Dec 7, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new metric, serve_queue_wait_time_ms, to measure the time requests spend in the router queue. The implementation adds logic to record this latency in RequestRouter and FIFOMixin, and includes a new test case to verify the metric's correctness.

My review focuses on improving code maintainability by addressing duplication and ensuring consistency in metric tagging. I've also identified a bug in the new test case where time units are being compared incorrectly.

Key feedback points:

  • Refactor duplicated metric recording logic into a helper method.
  • Ensure consistent metric tagging for the new histogram.
  • Correct the assertion in the new test to use the correct time unit (milliseconds).

@abrarsheikh abrarsheikh marked this pull request as ready for review December 15, 2025 07:58
@abrarsheikh abrarsheikh requested review from a team as code owners December 15, 2025 07:58
@ray-gardener ray-gardener bot added serve Ray Serve Related Issue observability Issues related to the Ray Dashboard, Logging, Metrics, Tracing, and/or Profiling labels Dec 15, 2025
Signed-off-by: abrar <[email protected]>
Copy link
Contributor

@harshit-anyscale harshit-anyscale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the comment from cursor seems legit, else LGTM

Signed-off-by: abrar <[email protected]>
Signed-off-by: abrar <[email protected]>
@abrarsheikh abrarsheikh merged commit 31a0e1e into master Dec 16, 2025
6 checks passed
@abrarsheikh abrarsheikh deleted the 59218-abrar-router_latency branch December 16, 2025 05:34
kriyanshii pushed a commit to kriyanshii/ray that referenced this pull request Dec 16, 2025
fixes ray-project#59218

```python
import asyncio
from ray import serve
from typing import List

@serve.deployment(max_ongoing_requests=1000)
class MyDeployment:
    async def __call__(self) -> List[int]:
        await asyncio.sleep(.01)
        return 1

app = MyDeployment.bind()

```

testing with 100 users

Metric | With Change | Master | Δ (Master – With Change)
-- | -- | -- | --
Requests | 128,197 | 128,136 | –61
Fails | 1,273 | 1,363 | +90
Median (ms) | 170 | 170 | 0
95%ile (ms) | 260 | 260 | 0
99%ile (ms) | 310 | 320 | +10 ms
Average (ms) | 178.66 | 178.99 | +0.33 ms
Min (ms) | 9 | 8 | –1 ms
Max (ms) | 1,171 | 2,995 | +1,824 ms
Average size (bytes) | 0.99 | 0.99 | 0
Current RPS | 538.8 | 558.2 | +19.4
Current Failures/s | 0 | 0 | 0

---------

Signed-off-by: abrar <[email protected]>
Signed-off-by: kriyanshii <[email protected]>
cszhu pushed a commit that referenced this pull request Dec 17, 2025
fixes #59218

```python
import asyncio
from ray import serve
from typing import List

@serve.deployment(max_ongoing_requests=1000)
class MyDeployment:
    async def __call__(self) -> List[int]:
        await asyncio.sleep(.01)
        return 1

app = MyDeployment.bind()

```

testing with 100 users

Metric | With Change | Master | Δ (Master – With Change)
-- | -- | -- | --
Requests | 128,197 | 128,136 | –61
Fails | 1,273 | 1,363 | +90
Median (ms) | 170 | 170 | 0
95%ile (ms) | 260 | 260 | 0
99%ile (ms) | 310 | 320 | +10 ms
Average (ms) | 178.66 | 178.99 | +0.33 ms
Min (ms) | 9 | 8 | –1 ms
Max (ms) | 1,171 | 2,995 | +1,824 ms
Average size (bytes) | 0.99 | 0.99 | 0
Current RPS | 538.8 | 558.2 | +19.4
Current Failures/s | 0 | 0 | 0

---------

Signed-off-by: abrar <[email protected]>
zzchun pushed a commit to zzchun/ray that referenced this pull request Dec 18, 2025
fixes ray-project#59218

```python
import asyncio
from ray import serve
from typing import List

@serve.deployment(max_ongoing_requests=1000)
class MyDeployment:
    async def __call__(self) -> List[int]:
        await asyncio.sleep(.01)
        return 1

app = MyDeployment.bind()

```

testing with 100 users

Metric | With Change | Master | Δ (Master – With Change)
-- | -- | -- | --
Requests | 128,197 | 128,136 | –61
Fails | 1,273 | 1,363 | +90
Median (ms) | 170 | 170 | 0
95%ile (ms) | 260 | 260 | 0
99%ile (ms) | 310 | 320 | +10 ms
Average (ms) | 178.66 | 178.99 | +0.33 ms
Min (ms) | 9 | 8 | –1 ms
Max (ms) | 1,171 | 2,995 | +1,824 ms
Average size (bytes) | 0.99 | 0.99 | 0
Current RPS | 538.8 | 558.2 | +19.4
Current Failures/s | 0 | 0 | 0

---------

Signed-off-by: abrar <[email protected]>
Yicheng-Lu-llll pushed a commit to Yicheng-Lu-llll/ray that referenced this pull request Dec 22, 2025
fixes ray-project#59218

```python
import asyncio
from ray import serve
from typing import List

@serve.deployment(max_ongoing_requests=1000)
class MyDeployment:
    async def __call__(self) -> List[int]:
        await asyncio.sleep(.01)
        return 1

app = MyDeployment.bind()

```

testing with 100 users

Metric | With Change | Master | Δ (Master – With Change)
-- | -- | -- | --
Requests | 128,197 | 128,136 | –61
Fails | 1,273 | 1,363 | +90
Median (ms) | 170 | 170 | 0
95%ile (ms) | 260 | 260 | 0
99%ile (ms) | 310 | 320 | +10 ms
Average (ms) | 178.66 | 178.99 | +0.33 ms
Min (ms) | 9 | 8 | –1 ms
Max (ms) | 1,171 | 2,995 | +1,824 ms
Average size (bytes) | 0.99 | 0.99 | 0
Current RPS | 538.8 | 558.2 | +19.4
Current Failures/s | 0 | 0 | 0

---------

Signed-off-by: abrar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

go add ONLY when ready to merge, run all tests observability Issues related to the Ray Dashboard, Logging, Metrics, Tracing, and/or Profiling serve Ray Serve Related Issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Serve] add debugging metrics to ray serve

4 participants