
fix(ai, ai-anthropic): thinking blocks missing on turn 2+ in tool loops #391

Open

imsherrill wants to merge 11 commits into TanStack:main from imsherrill:fix/thinking-blocks-per-step

Conversation

@imsherrill
Contributor

@imsherrill imsherrill commented Mar 20, 2026

Summary

  • Fixes thinking blocks being merged into a single ThinkingPart per message instead of one per step
  • Preserves thinking content and Anthropic signatures in server-side message history for multi-turn context
  • Adds interleaved-thinking-2025-05-14 beta header when thinking is enabled — required by Anthropic API for thinking on tool-result follow-up turns
  • Each thinking step now tracked by stepId through the full stack: adapter → engine → processor → UIMessage

Changes

@tanstack/ai:

  • ThinkingPart gains optional stepId and signature fields
  • ModelMessage gains optional thinking array for multi-turn context
  • StepFinishedEvent gains optional signature field
  • StreamProcessor handles STEP_STARTED, tracks thinking per-step via Map<stepId, content>
  • TextEngine accumulates thinking + signatures per iteration, includes them in assistant messages
  • buildAssistantMessages preserves ThinkingParts into ModelMessage.thinking
  • updateThinkingPart keys on stepId instead of replacing a single part
  • onThinkingUpdate callback gains stepId parameter
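The per-step tracking described above can be sketched as follows. This is an illustrative simplification, not the actual @tanstack/ai API: the names ThinkingAccumulator, addDelta, finalize, and getParts are assumptions, and the real StreamProcessor carries considerably more state.

```typescript
interface ThinkingStep {
  stepId: string
  content: string
  signature?: string
}

class ThinkingAccumulator {
  private steps = new Map<string, ThinkingStep>()
  private order: Array<string> = []

  // Append a streamed thinking delta to the step it belongs to,
  // creating the step on first sight and recording arrival order.
  addDelta(stepId: string, delta: string): void {
    let step = this.steps.get(stepId)
    if (!step) {
      step = { stepId, content: '' }
      this.steps.set(stepId, step)
      this.order.push(stepId)
    }
    step.content += delta
  }

  // Record the provider signature emitted when a thinking block closes.
  finalize(stepId: string, signature?: string): void {
    const step = this.steps.get(stepId)
    if (step && signature) step.signature = signature
  }

  // One ThinkingPart per step, in arrival order — instead of a single
  // merged part per message.
  getParts(): Array<ThinkingStep> {
    return this.order.map((id) => this.steps.get(id)!)
  }
}
```

Keying on stepId is what keeps two thinking blocks from different tool-loop iterations from collapsing into one part.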

@tanstack/ai-anthropic:

  • Captures signature_delta stream events for thinking block signatures
  • Emits final STEP_FINISHED with signature on content_block_stop
  • Includes thinking blocks with signatures in formatMessages for multi-turn history
  • Passes betas: ['interleaved-thinking-2025-05-14'] when thinking is enabled
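A minimal sketch of the request shape this implies. The betas value and the thinking/budget coupling come from the bullets above; the function name and remaining field names are assumptions for illustration, not the adapter's actual code.

```typescript
interface ThinkingOptions {
  budgetTokens?: number
}

function buildStreamParams(model: string, thinking?: ThinkingOptions) {
  const params: Record<string, unknown> = {
    model,
    stream: true,
  }
  if (thinking?.budgetTokens) {
    params.thinking = { type: 'enabled', budget_tokens: thinking.budgetTokens }
    // Required so thinking blocks keep appearing on tool-result
    // follow-up turns (the interleaved-thinking beta).
    params.betas = ['interleaved-thinking-2025-05-14']
  }
  return params
}
```

The beta header is only attached when a thinking budget is set, so non-thinking requests are unchanged.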

@tanstack/ai-client:

  • onThinkingUpdate callback updated for new stepId parameter

Test plan

  • 634 tests pass in @tanstack/ai (4 new tests for multi-step thinking)
  • 198 tests pass in @tanstack/ai-client
  • Types and eslint clean
  • Manual test: Anthropic Sonnet 4.5 — thinking on all 4 turns of multi-step tool flow
  • Manual test: OpenAI o4-mini — reasoning on multiple turns
  • Manual test: client tools with approval — no infinite loop, completes normally

Closes #340

Summary by CodeRabbit

  • New Features

    • Per-step “thinking” blocks are tracked, preserved (with optional provider signatures), and attached to assistant messages so multi-step reasoning is replayed in correct order.
    • Streaming now emits step-aware thinking updates so ongoing step-by-step reasoning is visible during streams.
  • Documentation

    • Added changelog describing per-step thinking behavior.
  • Tests

    • Test suites expanded to cover step-aware thinking accumulation and streaming behavior.

@coderabbitai
Contributor

coderabbitai Bot commented Mar 20, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds per-step tracking and optional provider signatures for "thinking" blocks across streaming, processor, engine, adapter, and client wiring so thinking content/signatures are emitted, accumulated, and replayed per stepId instead of being merged across steps.

Changes

Per-Step Thinking Tracking & Signature Preservation

  • Type shape — packages/typescript/ai/src/types.ts: ModelMessage gains optional thinking?: Array<{ content: string; signature?: string }>; ThinkingPart adds stepId?: string and signature?: string; StepFinishedEvent adds optional signature.
  • Stream state shape — packages/typescript/ai/src/activities/chat/stream/types.ts: Replaces thinkingContent: string with per-step fields: thinkingSteps: Map<string, string>, thinkingStepSignatures: Map<string, string>, thinkingStepOrder: Array<string>, and currentThinkingStepId: string | null.
  • Stream processor core — packages/typescript/ai/src/activities/chat/stream/processor.ts: Introduces STEP_STARTED handling, per-step accumulation (thinkingSteps, thinkingStepOrder, thinkingStepSignatures), pendingThinkingStepId, updates getState()/getResult() to concatenate ordered steps, and changes the onThinkingUpdate event signature to (messageId, stepId, content).
  • UI message updaters — packages/typescript/ai/src/activities/chat/stream/message-updaters.ts: updateThinkingPart now requires stepId, matches/updates thinking parts by stepId, and sets signature when provided.
  • Chat activity / engine — packages/typescript/ai/src/activities/chat/index.ts, .../chat/messages.ts: TextEngine tracks accumulatedThinking, currentThinkingContent, and currentThinkingSignature; finalizes thinking at STEP_STARTED/STEP_FINISHED; buildAssistantMessages attaches a thinking array to the assistant ModelMessage when present.
  • Anthropic streaming adapter — packages/typescript/ai-anthropic/src/adapters/text.ts: Conditionally enables the interleaved-thinking-2025-05-14 beta for streaming when modelOptions.thinking has a budget; captures signature_delta into accumulatedSignature, resets it on thinking block start, appends signature deltas, and emits STEP_FINISHED with the signature on content_block_stop; preserves thinking blocks with signatures in formatted messages.
  • Client wiring — packages/typescript/ai-client/src/chat-client.ts: The StreamProcessor onThinkingUpdate handler is updated to accept (messageId, _stepId, content) and forwards the extra argument.
  • Tool calls minor fix — packages/typescript/ai/src/activities/chat/tools/tool-calls.ts: ToolCallManager now derives the tool call name from a safe runtimeEvent (runtimeEvent.toolCallName ?? runtimeEvent.toolName) when creating a tool-call entry.
  • Tests & changelog — packages/typescript/ai/tests/*, .changeset/thinking-blocks-per-step.md: Tests updated to pass stepId, assert step-specific thinking parts and signatures, and add STEP_STARTED helpers and stream-processor tests; the changelog documents per-step thinking/signature behavior and wiring changes.

Sequence Diagram

sequenceDiagram
    participant User
    participant Model
    participant StreamProcessor
    participant TextEngine
    participant Context

    User->>Model: user query / turn n
    Model->>StreamProcessor: STEP_STARTED (stepId)
    Model->>StreamProcessor: STEP_FINISHED (stepId, thinking_delta, signature_delta)
    StreamProcessor->>TextEngine: onThinkingUpdate(msgId, stepId, thinking_content)
    TextEngine->>Context: accumulate thinking[stepId] + signature[stepId]

    StreamProcessor->>TextEngine: emit tool_use / assistant message (attach thinking if present)
    TextEngine->>Context: store assistant message with thinking array

    User->>Model: next turn (with prior tool_result)
    Context->>Model: include prior thinking entries + signatures

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • TanStack/ai#275: Modifies Anthropic streaming adapter and stream chunk handling related to STEP_STARTED/STEP_FINISHED and thinking emission; closely related.

Suggested reviewers

  • AlemTuzlak

Poem

🐰 I hop through steps both short and long,
My signatures trail like a tiny song,
No thought gets lost between each call,
Per-step thinking remembers all,
Hooray — tool loops won't drop the throng!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 55.56%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)

  • Title check — ✅ Passed: The title accurately describes the main fix: ensuring thinking blocks appear on turn 2+ in Anthropic tool loops, which is the core issue resolved by this PR.
  • Description check — ✅ Passed: The PR description comprehensively covers the changes, objectives, and test plan, but is missing an explicit completion status for the template's release-impact checkboxes.
  • Linked Issues check — ✅ Passed: The PR fully addresses all coding requirements from issue #340: per-step thinking tracking via stepId, multi-turn thinking preservation, Anthropic beta header support, and comprehensive test coverage across all packages.
  • Out of Scope Changes check — ✅ Passed: All code changes are directly aligned with fixing thinking blocks in multi-turn tool loops; the tool-calls.ts change is a minor supporting fix for name-resolution consistency.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@nx-cloud

nx-cloud Bot commented Mar 20, 2026

View your CI Pipeline Execution ↗ for commit c53d501

  • nx run-many --targets=build --exclude=examples/** — ✅ Succeeded in 1m 27s (View ↗)

☁️ Nx Cloud last updated this comment at 2026-05-06 02:06:40 UTC

@pkg-pr-new

pkg-pr-new Bot commented Mar 20, 2026


@tanstack/ai

npm i https://pkg.pr.new/@tanstack/ai@391

@tanstack/ai-anthropic

npm i https://pkg.pr.new/@tanstack/ai-anthropic@391

@tanstack/ai-client

npm i https://pkg.pr.new/@tanstack/ai-client@391

@tanstack/ai-code-mode

npm i https://pkg.pr.new/@tanstack/ai-code-mode@391

@tanstack/ai-code-mode-skills

npm i https://pkg.pr.new/@tanstack/ai-code-mode-skills@391

@tanstack/ai-devtools-core

npm i https://pkg.pr.new/@tanstack/ai-devtools-core@391

@tanstack/ai-elevenlabs

npm i https://pkg.pr.new/@tanstack/ai-elevenlabs@391

@tanstack/ai-event-client

npm i https://pkg.pr.new/@tanstack/ai-event-client@391

@tanstack/ai-fal

npm i https://pkg.pr.new/@tanstack/ai-fal@391

@tanstack/ai-gemini

npm i https://pkg.pr.new/@tanstack/ai-gemini@391

@tanstack/ai-grok

npm i https://pkg.pr.new/@tanstack/ai-grok@391

@tanstack/ai-groq

npm i https://pkg.pr.new/@tanstack/ai-groq@391

@tanstack/ai-isolate-cloudflare

npm i https://pkg.pr.new/@tanstack/ai-isolate-cloudflare@391

@tanstack/ai-isolate-node

npm i https://pkg.pr.new/@tanstack/ai-isolate-node@391

@tanstack/ai-isolate-quickjs

npm i https://pkg.pr.new/@tanstack/ai-isolate-quickjs@391

@tanstack/ai-ollama

npm i https://pkg.pr.new/@tanstack/ai-ollama@391

@tanstack/ai-openai

npm i https://pkg.pr.new/@tanstack/ai-openai@391

@tanstack/ai-openrouter

npm i https://pkg.pr.new/@tanstack/ai-openrouter@391

@tanstack/ai-preact

npm i https://pkg.pr.new/@tanstack/ai-preact@391

@tanstack/ai-react

npm i https://pkg.pr.new/@tanstack/ai-react@391

@tanstack/ai-react-ui

npm i https://pkg.pr.new/@tanstack/ai-react-ui@391

@tanstack/ai-solid

npm i https://pkg.pr.new/@tanstack/ai-solid@391

@tanstack/ai-solid-ui

npm i https://pkg.pr.new/@tanstack/ai-solid-ui@391

@tanstack/ai-svelte

npm i https://pkg.pr.new/@tanstack/ai-svelte@391

@tanstack/ai-vue

npm i https://pkg.pr.new/@tanstack/ai-vue@391

@tanstack/ai-vue-ui

npm i https://pkg.pr.new/@tanstack/ai-vue-ui@391

@tanstack/preact-ai-devtools

npm i https://pkg.pr.new/@tanstack/preact-ai-devtools@391

@tanstack/react-ai-devtools

npm i https://pkg.pr.new/@tanstack/react-ai-devtools@391

@tanstack/solid-ai-devtools

npm i https://pkg.pr.new/@tanstack/solid-ai-devtools@391

commit: 11249c3

Contributor

coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/typescript/ai/src/activities/chat/messages.ts (1)

181-186: ⚠️ Potential issue | 🟠 Major

Round-tripping through ModelMessage still strips thinking.

Line 185 starts serializing thinking into assistant ModelMessages, but modelMessageToUIMessage() below never materializes modelMessage.thinking back into ThinkingParts. Any history hydrated through normalizeToUIMessage() / ChatClient.append() will silently drop the thinking blocks and signatures, so a later convertMessagesToModelMessages() sends stripped context again.

Patch sketch
 export function modelMessageToUIMessage(
   modelMessage: ModelMessage,
   id?: string,
 ): UIMessage {
   const parts: Array<MessagePart> = []
+
+  if (modelMessage.role === 'assistant' && modelMessage.thinking?.length) {
+    for (const thinkingPart of modelMessage.thinking) {
+      parts.push({
+        type: 'thinking',
+        content: thinkingPart.content,
+        ...(thinkingPart.signature && {
+          signature: thinkingPart.signature,
+        }),
+      })
+    }
+  }
 
   // Handle tool results (when role is "tool") - only produce tool-result part,
   // not a text part (the content IS the tool result, not display text)
   if (modelMessage.role === 'tool' && modelMessage.toolCallId) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/typescript/ai/src/activities/chat/messages.ts` around lines 181 -
186, The assistant message builder adds a thinking field to ModelMessage (see
the push that adds thinking), but
modelMessageToUIMessage()/normalizeToUIMessage()/ChatClient.append() never
rehydrate modelMessage.thinking back into ThinkingPart objects and signatures,
causing round-trip loss; update modelMessageToUIMessage (and any
normalizeToUIMessage/ChatClient.append code paths) to detect
modelMessage.thinking and reconstruct the original ThinkingPart shape (including
reasoning text and signature fields) so that convertMessagesToModelMessages
receives intact thinking blocks on subsequent conversions.
🧹 Nitpick comments (3)
packages/typescript/ai/tests/stream-processor.test.ts (1)

778-830: Add one signature propagation regression here.

These new cases cover stepId, but none assert that a STEP_FINISHED.signature survives into the stored thinking part. That field is what Anthropic needs on the follow-up turn, so this branch is still unguarded by tests.

Test sketch
+    it('should persist signature on a thinking step', () => {
+      const processor = new StreamProcessor()
+      processor.prepareAssistantMessage()
+
+      processor.processChunk(
+        chunk('STEP_FINISHED', {
+          stepId: 'step-1',
+          delta: 'thinking...',
+          signature: 'sig-1',
+        }),
+      )
+
+      expect(processor.getMessages()[0]?.parts[0]).toEqual({
+        type: 'thinking',
+        content: 'thinking...',
+        stepId: 'step-1',
+        signature: 'sig-1',
+      })
+    })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/typescript/ai/tests/stream-processor.test.ts` around lines 778 -
830, Add assertions to the tests that verify STEP_FINISHED.signature is
propagated into the stored thinking parts and into the aggregated
state.thinking: when using StreamProcessor.processChunk with
ev.stepFinished(..., stepId) include a signature value and assert that (from
processor.getMessages()[0]!.parts filtered by type 'thinking') each thinking
part has the same signature (e.g., for 'step-1' and 'step-2'), and in the
getResult()/getState() concatenation case assert that state.thinking retains
each step's signature information (or that the per-step signatures are
accessible on the stored thinking parts); locate code via
StreamProcessor.processChunk, ev.stepFinished, processor.getMessages, and
processor.getState to add these checks.
packages/typescript/ai-anthropic/src/adapters/text.ts (2)

395-405: Thinking blocks without signatures are silently filtered out.

The loop only pushes thinking blocks that have a signature. This is correct per Anthropic's API requirements (signatures are mandatory for replaying thinking in multi-turn context), but a brief comment explaining this would help maintainability.

📝 Suggested comment
 if (message.thinking?.length) {
   for (const thinking of message.thinking) {
+    // Anthropic requires signatures to replay thinking blocks in multi-turn context;
+    // blocks without signatures cannot be included.
     if (thinking.signature) {
       contentBlocks.push({
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/typescript/ai-anthropic/src/adapters/text.ts` around lines 395 -
405, Add a short explanatory comment above the loop that filters thinking blocks
to clarify that only thinking entries with a signature are included because
Anthropic requires signatures to replay thinking in multi-turn context;
reference the variables message.thinking, thinking.signature, the
contentBlocks.push of AnthropicContentBlock, and note that omitting unsigned
thinking blocks is intentional for API compliance and not a bug.

292-295: Add a comment explaining the beta header requirement and type assertion.

The beta header interleaved-thinking-2025-05-14 is required to enable interleaved thinking in Claude models during tool-use conversations. Add a brief comment explaining why this beta is needed and why the as any cast is used (the Anthropic SDK types don't include betas in this parameter definition).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/typescript/ai-anthropic/src/adapters/text.ts` around lines 292 -
295, Add an inline comment above the conditional that sets betas explaining that
the beta header "interleaved-thinking-2025-05-14" is required to enable
interleaved thinking for Claude during tool-use conversations, and note that the
`as any` cast is used because the Anthropic SDK types do not include `betas` on
this parameter; update the block around the `thinkingBudget` conditional and the
`betas` property to include that short explanatory comment next to `betas` and
`as any` so future readers understand both the runtime requirement and the
type-workaround.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/typescript/ai/src/activities/chat/messages.ts`:
- Around line 168-169: The issue is that pendingThinking is buffered separately
from current.toolCalls, which loses the original interleaving (e.g., thinking,
tool-call, thinking) when constructing ModelMessage; change the buffering to a
single ordered stream of typed entries (e.g., an array of {type:
'thinking'|'toolCall', payload: ...}) instead of separate pendingThinking and
current.toolCalls so you can preserve insert order when emitting/serializing;
update all places that push to pendingThinking and to current.toolCalls (and the
duplicate buffering logic referenced around the 233-239 region) to push a typed
entry into the unified buffer and update the code that builds the assistant
ModelMessage to iterate this unified buffer and emit thinking/toolCall entries
in original sequence.

In `@packages/typescript/ai/src/activities/chat/stream/processor.ts`:
- Line 146: The field pendingThinkingStepId is not cleared in
resetStreamState(), causing stale step IDs to persist across streams; update the
resetStreamState() method to set this.pendingThinkingStepId = null so that
prepareAssistantMessage() starting a new stream cannot inherit an old id, and
ensure any other stream-reset paths also call resetStreamState() (or explicitly
clear pendingThinkingStepId) to avoid leaking the previous STEP_STARTED
association.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 19b0cdc6-d9a2-4d83-bb64-7a1aef5b7b29

📥 Commits

Reviewing files that changed from the base of the PR and between c3583e3 and c042c55.

📒 Files selected for processing (10)
  • packages/typescript/ai-anthropic/src/adapters/text.ts
  • packages/typescript/ai-client/src/chat-client.ts
  • packages/typescript/ai/src/activities/chat/index.ts
  • packages/typescript/ai/src/activities/chat/messages.ts
  • packages/typescript/ai/src/activities/chat/stream/message-updaters.ts
  • packages/typescript/ai/src/activities/chat/stream/processor.ts
  • packages/typescript/ai/src/activities/chat/stream/types.ts
  • packages/typescript/ai/src/types.ts
  • packages/typescript/ai/tests/message-updaters.test.ts
  • packages/typescript/ai/tests/stream-processor.test.ts

Comment on lines +168 to 169
let pendingThinking: Array<{ content: string; signature?: string }> = []

Contributor

⚠️ Potential issue | 🟠 Major

This still loses thinking/tool-call order within a single turn.

pendingThinking is buffered independently from current.toolCalls, so a sequence like [thinking(step-1), tool-call(a), thinking(step-2), tool-call(b)] gets flattened into one assistant ModelMessage with thinking: [step-1, step-2] and toolCalls: [a, b]. If a provider emits multiple thinking blocks around multiple tool calls in one turn, the original interleaving is gone and the next request cannot faithfully replay that history.

Also applies to: 233-239

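A minimal sketch of the unified ordered buffer this comment suggests. TurnEntry and TurnBuffer are illustrative names under the reviewer's proposal, not an existing or proposed @tanstack/ai API.

```typescript
type TurnEntry =
  | { type: 'thinking'; content: string; signature?: string }
  | { type: 'toolCall'; id: string; name: string }

class TurnBuffer {
  private entries: Array<TurnEntry> = []

  addThinking(content: string, signature?: string): void {
    this.entries.push({
      type: 'thinking',
      content,
      ...(signature ? { signature } : {}),
    })
  }

  addToolCall(id: string, name: string): void {
    this.entries.push({ type: 'toolCall', id, name })
  }

  // Entries come back in arrival order, so a sequence like
  // thinking(step-1), tool-call(a), thinking(step-2), tool-call(b)
  // survives serialization instead of being split into two arrays.
  getEntries(): Array<TurnEntry> {
    return this.entries
  }
}
```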

Comment thread packages/typescript/ai/src/activities/chat/stream/processor.ts
Contributor

coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (3)
packages/typescript/smoke-tests/e2e/tests/thinking.spec.ts (2)

48-51: Tighten types on getMessages and filter callbacks.

The Array<any> return type and the (p: any) predicates throughout the tests disable type-checking on the assertions (e.g. assistantMessage.parts, firstStep.content, firstStep.signature). Since UIMessage is already exported from @tanstack/ai-client and used in mock.tsx, importing it here would catch shape regressions (e.g. if ThinkingPart.signature is renamed) at compile time rather than flakily at runtime.

Suggested change
-import { test, expect, type Page } from '@playwright/test'
+import { expect, test } from '@playwright/test'
+import type { Page } from '@playwright/test'
+import type { UIMessage } from '@tanstack/ai-client'
@@
-async function getMessages(page: Page): Promise<Array<any>> {
+async function getMessages(page: Page): Promise<Array<UIMessage>> {
   const jsonContent = await page.locator('#messages-json-content').textContent()
   return JSON.parse(jsonContent || '[]')
 }

This also addresses the ESLint sort-imports and import/consistent-type-specifier-style hints on line 1.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/typescript/smoke-tests/e2e/tests/thinking.spec.ts` around lines 48 -
51, Tighten the types by importing UIMessage from "@tanstack/ai-client" and
update getMessages to return Promise<UIMessage[]> (change the function signature
and JSON parse typing), then replace loose any usages in filter/map callbacks
(e.g., predicates like (p: any)) with the concrete UIMessage type and update
assertions to use the typed properties (e.g., assistantMessage.parts,
firstStep.content, firstStep.signature) so TypeScript catches shape/regression
changes; ensure imports follow the project's import style to address
sort-imports and import/consistent-type-specifier-style hints.

115-122: afterEach is outside the describe block.

The test.afterEach hook at line 115 is declared after the describe closes on line 113, so it attaches to the root file scope rather than the Chat E2E Tests - Multi-Step Thinking (Mock API) group. That's functionally fine today (only one describe block in this file), but if additional describes are added later the screenshot hook will run for all of them, which may or may not be intended. Consider moving it inside the describe for clearer scoping.

Suggested change
 test.describe('Chat E2E Tests - Multi-Step Thinking (Mock API)', () => {
   ...
+
+  test.afterEach(async ({ page }, testInfo) => {
+    if (testInfo.status !== testInfo.expectedStatus) {
+      await page.screenshot({
+        path: `test-results/failure-${testInfo.title.replace(/\s+/g, '-')}.png`,
+        fullPage: true,
+      })
+    }
+  })
 })
-
-test.afterEach(async ({ page }, testInfo) => {
-  if (testInfo.status !== testInfo.expectedStatus) {
-    await page.screenshot({
-      path: `test-results/failure-${testInfo.title.replace(/\s+/g, '-')}.png`,
-      fullPage: true,
-    })
-  }
-})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/typescript/smoke-tests/e2e/tests/thinking.spec.ts` around lines 115
- 122, The screenshot cleanup hook test.afterEach is currently declared outside
the describe "Chat E2E Tests - Multi-Step Thinking (Mock API)" so it attaches to
the root scope; move the test.afterEach block inside that describe (before its
closing brace) so the hook is scoped to that describe only, preserving the same
screenshot logic and parameters (page, testInfo) and keeping the filename
generation code unchanged.
packages/typescript/smoke-tests/e2e/src/routes/api.mock-chat.ts (1)

310-394: Two STEP_FINISHED events per step — intentional but worth a comment.

Each thinking step emits STEP_FINISHED twice (first with a delta, then with content + signature). This appears to intentionally simulate streaming deltas followed by a final summary event, but STEP_FINISHED conventionally fires once per step. Since the PR's tests pass and rely on both events being processed (final signature stored on the step), consider adding a brief inline comment documenting that the first STEP_FINISHED carries the incremental delta and the second carries the final content/signature, so future readers don't mistake it for a duplicate event bug.

Also: the Date.now() calls in the chunk literals are evaluated at module load time; createMockStream already overrides timestamp on yield (line 407), so these initial values are dead — not a defect, just noise. Consider using a placeholder like timestamp: 0 for the static chunks to make that clearer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/typescript/smoke-tests/e2e/src/routes/api.mock-chat.ts` around lines
310 - 394, Add a short inline comment above the "thinking-multi-step" mock
describing that each thinking step intentionally emits two STEP_FINISHED events
(first carrying the streaming delta, second carrying final content+signature) so
readers don't treat it as a duplicate; also replace the Date.now() timestamps in
the chunk literals with a placeholder like 0 since createMockStream overrides
timestamps at yield (see "thinking-multi-step", STEP_FINISHED entries and
createMockStream).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/typescript/ai-anthropic/src/adapters/text.ts`:
- Around line 407-417: The code drops signed thinking for assistant messages
that have no toolCalls; ensure any message.thinking items with a signature are
always appended to contentBlocks as AnthropicContentBlock regardless of the
assistant/toolCalls branch. Update the logic around the handling of
message.thinking (the loop that pushes {type:'thinking', thinking:
thinking.content, signature: thinking.signature} into contentBlocks) so it runs
for both the assistant+toolCalls branch and the later assistant-only branch (the
code paths around the existing message.thinking loop and the block at lines
~468-478); you can either hoist the loop above the branch so it always executes
for message.thinking or add the same signed-thinking push into the
assistant-only branch, referencing message.thinking, contentBlocks, and
AnthropicContentBlock.

In `@packages/typescript/ai/src/activities/chat/stream/processor.ts`:
- Around line 1119-1136: The code currently prefers state.currentThinkingStepId
over chunk.stepId which causes distinct chunk.stepId values to be merged into an
existing thinking step; change the selection to prefer the explicit chunk.stepId
when present: compute stepId = chunk.stepId ?? state.currentThinkingStepId, then
if the resolved stepId is not present in state.thinkingSteps, initialize it
(state.thinkingSteps.set(stepId, ''), state.thinkingStepOrder.push(stepId)) and
update state.currentThinkingStepId = stepId; keep the existing
pendingThinkingStepId consumption logic but ensure it does not override an
explicit chunk.stepId.

---

Nitpick comments:

In `@packages/typescript/smoke-tests/e2e/tests/thinking.spec.ts`:
- Around line 48-51: Tighten the types by importing UIMessage from
"@tanstack/ai-client" and update getMessages to return Promise<UIMessage[]>
(change the function signature and JSON parse typing), then replace loose any
usages in filter/map callbacks (e.g., predicates like (p: any)) with the
concrete UIMessage type and update assertions to use the typed properties (e.g.,
assistantMessage.parts, firstStep.content, firstStep.signature) so TypeScript
catches shape/regression changes; ensure imports follow the project's import
style to address sort-imports and import/consistent-type-specifier-style hints.
- Around line 115-122: The screenshot cleanup hook test.afterEach is currently
declared outside the describe "Chat E2E Tests - Multi-Step Thinking (Mock API)"
so it attaches to the root scope; move the test.afterEach block inside that
describe (before its closing brace) so the hook is scoped to that describe only,
preserving the same screenshot logic and parameters (page, testInfo) and keeping
the filename generation code unchanged.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 52fe2bf7-eb5c-4660-a208-3e1292bede10

📥 Commits

Reviewing files that changed from the base of the PR and between c042c55 and c31d45d.

📒 Files selected for processing (7)
  • .changeset/thinking-blocks-per-step.md
  • packages/typescript/ai-anthropic/src/adapters/text.ts
  • packages/typescript/ai/src/activities/chat/stream/processor.ts
  • packages/typescript/ai/tests/stream-processor.test.ts
  • packages/typescript/smoke-tests/e2e/src/routes/api.mock-chat.ts
  • packages/typescript/smoke-tests/e2e/src/routes/mock.tsx
  • packages/typescript/smoke-tests/e2e/tests/thinking.spec.ts
✅ Files skipped from review due to trivial changes (1)
  • .changeset/thinking-blocks-per-step.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/typescript/ai/tests/stream-processor.test.ts

Comment on lines +407 to +417
if (message.thinking?.length) {
for (const thinking of message.thinking) {
if (thinking.signature) {
contentBlocks.push({
type: 'thinking',
thinking: thinking.content,
signature: thinking.signature,
} as unknown as AnthropicContentBlock)
}
}
}
⚠️ Potential issue | 🟠 Major

Replay signed thinking for non-tool assistant messages too.

Line 407 only preserves thinking inside the assistant && toolCalls branch. Assistant responses that contain signed thinking but no tool calls fall through to lines 468-478, so their message.thinking is dropped from future Anthropic context.

🐛 Proposed fix
+    const appendThinkingBlocks = (
+      contentBlocks: AnthropicContentBlocks,
+      thinkingParts: ModelMessage['thinking'],
+    ) => {
+      if (!thinkingParts?.length) return
+
+      for (const thinking of thinkingParts) {
+        if (thinking.signature) {
+          contentBlocks.push({
+            type: 'thinking',
+            thinking: thinking.content,
+            signature: thinking.signature,
+          } as unknown as AnthropicContentBlock)
+        }
+      }
+    }
+
     for (const message of messages) {
       const role = message.role
@@
       if (role === 'assistant' && message.toolCalls?.length) {
         const contentBlocks: AnthropicContentBlocks = []
 
-        if (message.thinking?.length) {
-          for (const thinking of message.thinking) {
-            if (thinking.signature) {
-              contentBlocks.push({
-                type: 'thinking',
-                thinking: thinking.content,
-                signature: thinking.signature,
-              } as unknown as AnthropicContentBlock)
-            }
-          }
-        }
+        appendThinkingBlocks(contentBlocks, message.thinking)
@@
+      if (role === 'assistant' && message.thinking?.length) {
+        const contentBlocks: AnthropicContentBlocks = []
+        appendThinkingBlocks(contentBlocks, message.thinking)
+
+        if (message.content) {
+          contentBlocks.push({
+            type: 'text',
+            text: typeof message.content === 'string' ? message.content : '',
+          } as AnthropicContentBlock)
+        }
+
+        formattedMessages.push({
+          role: 'assistant',
+          content: contentBlocks,
+        })
+        continue
+      }
+
       formattedMessages.push({

Also applies to: 468-478
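The helper's behavior can be sketched standalone. The `ThinkingPart`/`ContentBlock` shapes below are simplified assumptions; the point is that only signed thinking is replayed, regardless of which assistant branch the message takes:

```typescript
// Simplified shapes for the sketch (assumptions, not the repo's exact types).
type ThinkingPart = { content: string; signature?: string }
type ContentBlock =
  | { type: 'thinking'; thinking: string; signature: string }
  | { type: 'text'; text: string }

function appendThinkingBlocks(
  blocks: ContentBlock[],
  thinkingParts: ThinkingPart[] | undefined,
): void {
  if (!thinkingParts?.length) return
  for (const thinking of thinkingParts) {
    // Unsigned parts are skipped rather than forwarded, presumably because
    // replayed thinking blocks require a signature.
    if (thinking.signature) {
      blocks.push({
        type: 'thinking',
        thinking: thinking.content,
        signature: thinking.signature,
      })
    }
  }
}
```

Hoisting this above the branch (or calling it from both branches) ensures assistant messages without tool calls keep their signed thinking too.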


Comment on lines +1119 to +1136
// Consume pending stepId from STEP_STARTED that arrived before the message existed
if (this.pendingThinkingStepId) {
state.currentThinkingStepId = this.pendingThinkingStepId
if (!state.thinkingSteps.has(this.pendingThinkingStepId)) {
state.thinkingSteps.set(this.pendingThinkingStepId, '')
state.thinkingStepOrder.push(this.pendingThinkingStepId)
}
this.pendingThinkingStepId = null
}

const stepId = state.currentThinkingStepId ?? chunk.stepId

// Auto-initialize if no prior STEP_STARTED (backward compat)
if (!state.thinkingSteps.has(stepId)) {
state.thinkingSteps.set(stepId, '')
state.thinkingStepOrder.push(stepId)
state.currentThinkingStepId = stepId
}
⚠️ Potential issue | 🟠 Major

Prefer the chunk’s explicit stepId to avoid merging distinct thinking steps.

Line 1129 ignores chunk.stepId whenever currentThinkingStepId is already set. If a replay/provider emits STEP_FINISHED events with distinct stepIds but without STEP_STARTED, later steps are accumulated into the previous ThinkingPart.

🐛 Proposed fix
-    const stepId = state.currentThinkingStepId ?? chunk.stepId
+    const stepId = chunk.stepId ?? state.currentThinkingStepId
 
     // Auto-initialize if no prior STEP_STARTED (backward compat)
     if (!state.thinkingSteps.has(stepId)) {
       state.thinkingSteps.set(stepId, '')
       state.thinkingStepOrder.push(stepId)
-      state.currentThinkingStepId = stepId
     }
+    state.currentThinkingStepId = stepId
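The fix amounts to a small, pure resolution step. A standalone sketch (the `StreamState` shape and `'default'` fallback are assumptions for illustration):

```typescript
// Minimal stand-in for the processor's per-message state (an assumption,
// not the repo's exact StreamState type).
type StreamState = {
  currentThinkingStepId: string | null
  thinkingSteps: Map<string, string>
  thinkingStepOrder: string[]
}

// An explicit chunk.stepId wins over the current step, so distinct steps
// arriving without STEP_STARTED are never merged; deltas without a stepId
// still continue the current step.
function resolveStep(
  state: StreamState,
  chunkStepId: string | undefined,
): string {
  const stepId = chunkStepId ?? state.currentThinkingStepId ?? 'default'
  if (!state.thinkingSteps.has(stepId)) {
    state.thinkingSteps.set(stepId, '')
    state.thinkingStepOrder.push(stepId)
  }
  state.currentThinkingStepId = stepId
  return stepId
}
```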

imsherrill and others added 9 commits May 5, 2026 20:55
- Track thinking per-step via stepId instead of merging into single ThinkingPart
- Capture Anthropic signature_delta and preserve through the full stack
- Server-side TextEngine accumulates thinking + signatures per iteration
- Include thinking blocks in Anthropic message history for multi-turn context
- Add interleaved-thinking-2025-05-14 beta header when thinking is enabled
- Add tests for multi-step thinking, backward compat, and result aggregation

Closes TanStack#340
…int; clear stale pendingThinkingStepId

- Move `betas: ['interleaved-thinking-2025-05-14']` out of the shared mapper
  and onto the `beta.messages.create` call site in chatStream. Prevents the
  non-beta structuredOutput endpoint from receiving an invalid `betas` field
  when a thinking budget is configured.
- Clear `pendingThinkingStepId` when a later STEP_STARTED takes the
  active-message branch, and also in `resetStreamState`, so a stale pending
  id can't misattribute a later STEP_FINISHED's delta to an earlier step.
- Add covering test for the pendingThinkingStepId leak (red-green verified).
- Add `thinking-multi-step` mock scenario to the e2e harness emitting
  STEP_STARTED/STEP_FINISHED pairs for two distinct stepIds with
  provider signatures, followed by a text message and RUN_FINISHED.
- Expose thinkingPartCount / thinkingStepIds on the mock chat page via
  data-* attributes for assertion.
- Add tests/thinking.spec.ts asserting two ThinkingParts with distinct
  stepIds and matching signatures are produced (pre-PR behavior merged
  them into a single part).
- Add .changeset/thinking-blocks-per-step.md bumping @tanstack/ai,
  @tanstack/ai-anthropic, and @tanstack/ai-client.
When STEP_STARTED arrives before the assistant message exists, its
stepId is stashed in pendingThinkingStepId. handleStepFinishedEvent
already consumes it, but handleReasoningMessageContentEvent did not,
so reasoning deltas were keyed by the reasoning messageId and the
matching signature from STEP_FINISHED landed on a different
ThinkingPart. With Anthropic's interleaved thinking around tool calls
this produced two ThinkingParts per block (one unsigned content, one
signed empty). Consume the pending stepId here too so both event
paths attribute to the same step.
Both handleStepFinishedEvent and handleReasoningMessageContentEvent
need to promote a pending stepId from a STEP_STARTED that arrived
before the assistant message existed. Pull the shared logic into a
small private helper instead of duplicating it.
The bare yield bypassed the file's existing StreamChunk cast helper,
so after the merge from main StepFinishedEvent now extends the stricter
AG-UI base (requires stepName, no signature) and CI failed to build.
Use asChunk like every other yield in this generator and add stepName.
@imsherrill imsherrill force-pushed the fix/thinking-blocks-per-step branch from 962e396 to c53d501 Compare May 6, 2026 01:47

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.changeset/thinking-blocks-per-step.md:
- Around line 2-4: The changeset currently lists patch bumps for packages that
introduce a breaking API change: update the version bumps to minor for the
affected packages; specifically change the entries in
.changeset/thinking-blocks-per-step.md from patch to minor for '@tanstack/ai'
and '@tanstack/ai-client' (and any other package exposing the new
onThinkingUpdate signature), and ensure the changeset message mentions the
breaking change to the onThinkingUpdate callback (old signature (messageId,
content) -> new signature (messageId, stepId, content)) and references the
public API types StreamProcessorEvents and ChatClient so the release tooling
applies a minor bump.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ef6014b3-a4ed-43d1-bc2b-e065a9dec715

📥 Commits

Reviewing files that changed from the base of the PR and between c31d45d and c53d501.

📒 Files selected for processing (1)
  • .changeset/thinking-blocks-per-step.md

Comment thread .changeset/thinking-blocks-per-step.md Outdated

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
packages/typescript/ai/src/activities/chat/tools/tool-calls.ts (1)

96-98: ⚡ Quick win

Add toolCallName?: string override to ToolCallStartEvent interface and remove the cast workaround

ToolCallEndEvent already declares both toolCallName?: string and toolName?: string as optional fields (lines 930–934). ToolCallStartEvent should follow the same pattern by explicitly overriding toolCallName to be optional, rather than using the Partial<> cast at the call site. This makes the optionality explicit in the type contract and aligns with how ToolCallEndEvent handles backward compatibility.

Suggested changes

In packages/typescript/ai/src/types.ts, update ToolCallStartEvent:

export interface ToolCallStartEvent extends AGUIToolCallStartEvent {
  /** Model identifier for multi-model support */
  model?: string
  /**
   * `@deprecated` Use `toolCallName` instead (from `@ag-ui/core` spec).
   * Kept for backward compatibility.
   */
  toolName: string
+ /** Name of the tool being called (from `@ag-ui/core`) */
+ toolCallName?: string
  /** Index for parallel tool calls */
  index?: number
  /** Provider-specific metadata to carry into the ToolCall */
  providerMetadata?: Record<string, unknown>
}

Then in packages/typescript/ai/src/activities/chat/tools/tool-calls.ts, simplify lines 96–98:

- const runtimeEvent = event as Partial<ToolCallStartEvent> &
-   Pick<ToolCallStartEvent, 'toolName'>
- const name = runtimeEvent.toolCallName ?? runtimeEvent.toolName
+ const name = event.toolCallName ?? event.toolName
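The resulting resolution is a plain nullish-coalescing fallback. A tiny sketch under the assumed field names from the comment above:

```typescript
// Simplified event shape (an assumption mirroring ToolCallStartEvent's
// deprecated toolName plus the optional spec field toolCallName).
interface ToolCallStartLike {
  toolName: string
  toolCallName?: string
}

function resolveToolName(event: ToolCallStartLike): string {
  // Prefer the @ag-ui/core spec field; fall back to the deprecated alias.
  return event.toolCallName ?? event.toolName
}
```

With `toolCallName` declared optional on the interface itself, no `Partial<>` cast is needed at the call site.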
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/typescript/ai/src/activities/chat/tools/tool-calls.ts` around lines
96 - 98, Update the ToolCallStartEvent interface to include an explicit optional
toolCallName?: string field (mirroring ToolCallEndEvent's toolCallName?: string
and toolName?: string) in the types definition (ToolCallStartEvent), then remove
the Partial<> cast workaround in the tool call handler and directly treat event
as ToolCallStartEvent so the code in tool-calls.ts can simply read const name =
event.toolCallName ?? event.toolName; ensure the change preserves backward
compatibility and compiles against existing uses of ToolCallStartEvent and
tool-calls.ts.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9bd6764e-e99e-4415-89ac-f8e667b04a7e

📥 Commits

Reviewing files that changed from the base of the PR and between f1df71e and 11249c3.

📒 Files selected for processing (1)
  • packages/typescript/ai/src/activities/chat/tools/tool-calls.ts


Development

Successfully merging this pull request may close these issues.

Anthropic tool loops: thinking blocks missing on turn 2+

2 participants