Skip to content

[serve] deflake backpressure test#63005

Open
kamil-kaczmarek wants to merge 5 commits intomasterfrom
kk/backpressure_test
Open

[serve] deflake backpressure test#63005
kamil-kaczmarek wants to merge 5 commits intomasterfrom
kk/backpressure_test

Conversation

@kamil-kaczmarek
Copy link
Copy Markdown
Contributor

@kamil-kaczmarek kamil-kaczmarek commented Apr 28, 2026

This pull request moves wrap_queued_request up one level to route_and_send_request in the router.py. It is to ensure that the num_queued_requests remains incremented across all routing retries, preventing oscillation when a replica rejects a request.

Details:

  • num_queued_requests now reflects the requests this router is actively trying to place.
  • In route_and_send_request, the wrap_queued_request context now wraps the entire retry loop, so the num_queued_requests stays incremented for the full duration of all retries instead of being reset on each retry.
  • argument resolution must not be counted as queue time -> _resolve_request_arguments (along with the _objref_resolution_latency_ms observation and ActorDiedError → upstream_crash_error handling) is now called in route_and_send_request before the wrap_queued_request context is opened.

Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
@kamil-kaczmarek kamil-kaczmarek self-assigned this Apr 28, 2026
@kamil-kaczmarek kamil-kaczmarek requested a review from a team as a code owner April 28, 2026 23:18
@kamil-kaczmarek kamil-kaczmarek added serve Ray Serve Related Issue go add ONLY when ready to merge, run all tests labels Apr 28, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request moves the wrap_queued_request context manager to wrap the entire retry loop in route_and_send_request to prevent metric oscillation during replica rejections. However, a review comment points out that this change introduces a regression in metrics accuracy because the counter now increments before request arguments are resolved. The reviewer suggests moving the argument resolution logic before the context manager to ensure that queue metrics accurately reflect pending work and do not include time spent waiting for upstream dependencies.

Comment thread python/ray/serve/_private/router.py
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 1c26d69. Configure here.

Comment thread python/ray/serve/_private/router.py
@kamil-kaczmarek kamil-kaczmarek marked this pull request as draft April 28, 2026 23:22
Signed-off-by: Kamil Kaczmarek <kamil@anyscale.com>
@kamil-kaczmarek kamil-kaczmarek marked this pull request as ready for review April 28, 2026 23:46
@kamil-kaczmarek kamil-kaczmarek changed the title [serve] Fix flaky backpressure test [serve] deflake backpressure test Apr 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

go add ONLY when ready to merge, run all tests serve Ray Serve Related Issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant