Skip to content

fix(client): propagate SSE POST errors to caller instead of hanging (#2110)#2483

Open
fede-kamel wants to merge 1 commit into
modelcontextprotocol:mainfrom
fede-kamel:fix/2110-http-error-propagation
Open

fix(client): propagate SSE POST errors to caller instead of hanging (#2110)#2483
fede-kamel wants to merge 1 commit into
modelcontextprotocol:mainfrom
fede-kamel:fix/2110-http-error-propagation

Conversation

@fede-kamel
Copy link
Copy Markdown

@fede-kamel fede-kamel commented Apr 20, 2026

Summary

Fixes the remaining client-side hang from #2110, in sse_client.

_send_message (the message-POST coroutine) called response.raise_for_status() with no handler. On a non-2xx (401/403/404/5xx) or a network error, the exception propagated into the post_writer task group and was swallowed by its except Exception: logger.exception("Error in post_writer"). The failure never reached the read stream, so a caller blocked on read_stream.receive() — e.g. ClientSession.initialize()hung forever.

The fix catches httpx.HTTPError inside _send_message and forwards it to read_stream_writer, the pattern stdio.py and websocket.py already use, and that streamable_http.py uses for its >= 400 responses. The caller now receives the error promptly.

Scope note

This PR was originally broader. Since it was opened, main independently landed the equivalent fix for streamable_http_client (the status_code >= 400 branch in _handle_post_request now sends a JSONRPCError to the read stream with data={"http_status": ...}). That half is therefore dropped here — this is rebased on current main and scoped to the one remaining swallow path: sse_client. It's also distinct from #2340, which addresses the sse_reader/connection path rather than the message-POST path.

Verification

  • New regression test test_sse_client_post_error_propagates_to_caller: drives a real SSE handshake whose message POST returns 503 and asserts the caller receives the HTTPStatusError via the read stream within a bounded timeout.
  • Negative control: the test times out (fails) against the unpatched client, and passes with the fix — confirming it reproduces the hang.
  • ./scripts/test: full suite green, 100.00% coverage (branch), strict-no-cover clean.
  • ruff format / ruff check / pyright: clean.

Closes #2110.

@fede-kamel
Copy link
Copy Markdown
Author

Hi @Kludex @felixweinberger — friendly ping for review. This fixes #2110 (HTTP transport swallowing errors → client hangs). The fix is small (~50 LOC of behavior change), and there's a substantial test suite to back it: 10 substantive tests including a negative control where I reverted the fix and confirmed all 10 hang against unpatched main. CI green, no conflicts.

cc @chriscasola since you reported the original issue — happy to get any input.

Would really appreciate getting a reviewer assigned when there's bandwidth. Thanks!

@fede-kamel fede-kamel force-pushed the fix/2110-http-error-propagation branch from 5b35244 to bb47a5d Compare May 17, 2026 16:49
@fede-kamel
Copy link
Copy Markdown
Author

Hey @maxisbey — could I get your eyes on this? You filed #2110 originally; this PR propagates the swallowed httpx errors (the root cause flagged in the issue) and preserves the HTTP status in ErrorData.data.

Just rebased onto current main (clean, no conflicts with #2560's SSEError import refactor) and re-ran the full tests/shared/test_sse.py + tests/shared/test_streamable_http.py suite locally — 99 passed, including the network-error regression guard. ~50 LOC of behavior change with 10 substantive tests + negative control.

Happy to backport to v1.x if useful — saw the recent [v1.x] activity. Closes #2110.

@fede-kamel
Copy link
Copy Markdown
Author

fede-kamel commented May 25, 2026

Hi @maxisbey, @Kludex, @felixweinberger — would love your eyes on this when you have a moment 🙏

This PR takes a fresh swing at #2110 (HTTP transport swallowing non-2xx status codes and hanging the client). Current state:

  • ✅ All 25 CI checks green — full test matrix across Python 3.10–3.14 on Ubuntu and Windows, plus client/server conformance and pre-commit.
  • ✅ Rebased on main, no conflicts, ready to merge.

Happy to iterate on anything — feedback very welcome, and thank you all for your work on this SDK!

@fede-kamel fede-kamel force-pushed the fix/2110-http-error-propagation branch from b2ed978 to 31c4268 Compare June 3, 2026 15:26
@fede-kamel fede-kamel changed the title fix(client): propagate HTTP transport errors to caller (#2110) fix(client): propagate SSE POST errors to caller instead of hanging (#2110) Jun 3, 2026
In `sse_client`, the message-POST coroutine `_send_message` called
`response.raise_for_status()` with no handler. When the server returned a
non-2xx (401/403/404/5xx) or the POST hit a network error, the exception
propagated into the `post_writer` task group and was swallowed by its
`except Exception: logger.exception("Error in post_writer")`. The failure
was never delivered through the read stream, so a caller blocked on
`read_stream.receive()` (e.g. `ClientSession.initialize()`) hung forever.

Catch `httpx.HTTPError` inside `_send_message` and forward it to
`read_stream_writer`, the same pattern stdio.py and websocket.py already
use, and that `streamable_http.py` uses for its >= 400 responses. The
caller now receives the error promptly instead of deadlocking.

Adds an in-process regression test (matching the modelcontextprotocol#2765 harness) whose
message POST returns 503; it asserts the caller receives the
`HTTPStatusError` via the read stream within a bounded timeout, and fails
(times out) against the unpatched client.

Refs modelcontextprotocol#2110
@fede-kamel fede-kamel force-pushed the fix/2110-http-error-propagation branch from 31c4268 to ef8edd6 Compare June 3, 2026 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

HTTP transport swallows non-2xx status codes causing client to hang

1 participant