Streamable HTTP client: slow reception of large SSE tool responses (fromLineSubscriber line-assembly bottleneck) #1042

Description

@TaJoal

Summary

When a tool returns a large response over the Streamable HTTP client transport, the client takes ~5 s to receive a ~4 MB body that curl, or a plain HttpClient read with BodyHandlers.ofString(), receives in ~0.4 s. The bottleneck is the client-side SSE body reading in ResponseSubscribers.sseToBodySubscriber, not the server, the network, or JSON parsing.

Environment

  • io.modelcontextprotocol.sdk:mcp-core / mcp 2.0.0 (latest)
  • JDK 25, Reactor (via SDK)
  • Server: Spring AI 2.0.0 mcp-spring-webmvc (HttpServletStreamableServerTransportProvider) returning a single large SSE message event — one compact-JSON data: line (~4.17 MB)
  • Client: HttpClientStreamableHttpTransport (McpSyncClient.callTool)

Measurements (steady-state, 3 runs)

| Path | Time to receive ~4.17 MB |
| --- | --- |
| McpSyncClient.callTool (this SDK) | ~5,300 ms |
| Same payload via curl / HttpClient BodyHandlers.ofString() | ~0.4–0.75 s |
| Jackson parse of the received JSON | < 70 ms |

→ ~10–13× slower than a plain one-shot read of the identical bytes.

Root cause analysis

The response body is effectively a single huge data: line (compact JSON has no newlines). ResponseSubscribers.sseToBodySubscriber uses HttpResponse.BodySubscribers.fromLineSubscriber(...); assembling that one ~4 MB line through the JDK line subscriber is the cost.

Things I tried:

  • Changing SseLineSubscriber demand from upstream().request(1) to request(Long.MAX_VALUE) → no improvement (so it isn't per-line backpressure).
  • Replacing the body subscriber with a streaming byte-level SSE parser (BodySubscribers.fromSubscriber, unbounded demand, accumulate ByteBuffers, split on \n\n event boundaries, decode each event once) → ~0.4 s (≈13×).
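To make the second bullet concrete, here is a minimal sketch of a byte-level event splitter of the kind described. It is not the SDK's code and not the exact parser that was measured: `SseEventSplitter` and its `feed` method are hypothetical names, and the sketch assumes LF-only line endings (the SSE format also allows CR and CRLF) and ignores every field except `data:`. Each byte is scanned once for the `\n\n` boundary, and each event is decoded to a String once.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper (not part of the MCP SDK): splits an SSE byte stream on "\n\n". */
final class SseEventSplitter {
    private byte[] buf = new byte[8192];
    private int len = 0;       // number of valid bytes in buf
    private int scanFrom = 0;  // bytes before this index were already checked for a boundary
    private final Consumer<String> onEventData;

    SseEventSplitter(Consumer<String> onEventData) {
        this.onEventData = onEventData;
    }

    /** Appends one network chunk and emits the data of every event it completes. */
    void feed(ByteBuffer chunk) {
        int n = chunk.remaining();
        if (len + n > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, len + n));
        }
        chunk.get(buf, len, n);
        len += n;

        int eventStart = 0;
        // Look back one byte so a boundary split across two chunks is still found.
        for (int i = Math.max(scanFrom, 1); i < len; i++) {
            if (buf[i] == '\n' && buf[i - 1] == '\n' && i > eventStart) {
                emit(eventStart, i - 1);
                eventStart = i + 1;
            }
        }
        if (eventStart > 0) { // drop the emitted events, keep the unfinished tail
            System.arraycopy(buf, eventStart, buf, 0, len - eventStart);
            len -= eventStart;
        }
        scanFrom = len;
    }

    /** Decodes one event once and joins its data: lines with '\n'. */
    private void emit(int start, int end) {
        String event = new String(buf, start, end - start, StandardCharsets.UTF_8);
        StringBuilder data = null;
        for (String line : event.split("\n")) {
            if (!line.startsWith("data:")) continue; // skips event:, id:, retry:, comments
            String value = line.startsWith("data: ") ? line.substring(6) : line.substring(5);
            if (data == null) data = new StringBuilder(value.length());
            else data.append('\n');
            data.append(value);
        }
        if (data != null) onEventData.accept(data.toString());
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        SseEventSplitter splitter = new SseEventSplitter(events::add);
        byte[] stream = ("event: message\ndata: {\"method\":\"notifications/progress\"}\n\n"
                + "event: message\ndata: {\"result\":\"" + "x".repeat(100_000) + "\"}\n\n")
                .getBytes(StandardCharsets.UTF_8);
        for (int off = 0; off < stream.length; off += 1000) { // simulate 1 KB network chunks
            splitter.feed(ByteBuffer.wrap(stream, off, Math.min(1000, stream.length - off)));
        }
        System.out.println(events.size() + " events received");
    }
}
```

Splitting on the raw bytes is safe for UTF-8 because the byte 0x0A never occurs inside a multi-byte sequence, so an event can be decoded in one pass after its boundary is found.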

Important constraint (must stay streaming)

A whole-body ofString read fixes the speed but breaks progress: the server interleaves notifications/progress on the same POST response SSE stream before the final result. So the fix must remain a streaming parser that emits each SSE event as its boundary arrives (a byte-level parser does this while still avoiding the line-assembly cost).
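As an illustration of what "remain streaming" could look like at the HttpClient level, the sketch below forwards every body chunk to a sink as soon as it arrives instead of waiting for the whole body. `ChunkForwardingSubscriber`, `completion()`, and `asBodyHandler()` are hypothetical names, not SDK API. The sink is assumed to be an incremental parser, such as a byte-level SSE event splitter, that emits progress notifications as their boundaries arrive. `HttpResponse.BodyHandlers.fromSubscriber` is the standard JDK entry point for plugging in such a subscriber.

```java
import java.net.http.HttpResponse;
import java.nio.ByteBuffer;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Flow;
import java.util.function.Consumer;

/** Hypothetical subscriber (not SDK code): hands each chunk to a sink immediately. */
final class ChunkForwardingSubscriber implements Flow.Subscriber<List<ByteBuffer>> {
    private final Consumer<ByteBuffer> sink; // assumed to be an incremental SSE parser
    private final CompletableFuture<Void> done = new CompletableFuture<>();

    ChunkForwardingSubscriber(Consumer<ByteBuffer> sink) {
        this.sink = sink;
    }

    /** Completes when the response body ends, or fails if the stream fails. */
    CompletableFuture<Void> completion() {
        return done;
    }

    /** Wraps this subscriber for use with HttpClient.send / sendAsync. */
    HttpResponse.BodyHandler<Void> asBodyHandler() {
        return HttpResponse.BodyHandlers.fromSubscriber(this);
    }

    @Override
    public void onSubscribe(Flow.Subscription subscription) {
        subscription.request(Long.MAX_VALUE); // unbounded demand
    }

    @Override
    public void onNext(List<ByteBuffer> buffers) {
        for (ByteBuffer b : buffers) {
            sink.accept(b); // events can be emitted here, before the body is complete
        }
    }

    @Override
    public void onError(Throwable t) {
        done.completeExceptionally(t);
    }

    @Override
    public void onComplete() {
        done.complete(null);
    }

    public static void main(String[] args) {
        ChunkForwardingSubscriber sub = new ChunkForwardingSubscriber(
                b -> System.out.println("chunk of " + b.remaining() + " bytes"));
        // Drive the subscriber by hand; with a real request, HttpClient does this.
        sub.onSubscribe(new Flow.Subscription() {
            @Override public void request(long n) { }
            @Override public void cancel() { }
        });
        sub.onNext(List.of(ByteBuffer.wrap(new byte[3])));
        System.out.println("completed before end of body: " + sub.completion().isDone());
        sub.onComplete();
        System.out.println("completed after end of body: " + sub.completion().isDone());
    }
}
```

The sink sees the first chunk while the completion future is still pending, which is the property that keeps notifications/progress events flowing ahead of the final result.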

Questions

  1. Is there a recommended approach/workaround for large tool responses on the client that we're missing?
  2. Would you accept a PR replacing fromLineSubscriber with a streaming byte-level SSE parser in ResponseSubscribers.sseToBodySubscriber (preserving incremental event emission)?
  3. Is this related to #844, "Support application/json responses in Streamable HTTP transport (opt-in JSON response mode)"? That would avoid SSE framing on the server side, but clients receiving SSE responses would still benefit from this fix.

Happy to open a PR with the streaming parser + a benchmark if that's welcome.

Labels: P2 (Moderate issues affecting some users, edge cases, potentially valuable feature), area/client, area/transport, enhancement (New feature or request), waiting for user (Waiting for user feedback or more details)
