Skip to content

core: cache the tool search handler per session#27258

Merged
mchen-oai merged 3 commits into
mainfrom
mchen/tool-search-handler-cache
Jun 15, 2026
Merged

core: cache the tool search handler per session#27258
mchen-oai merged 3 commits into
mainfrom
mchen/tool-search-handler-cache

Conversation

@mchen-oai

@mchen-oai mchen-oai commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Why

Tool router construction rebuilds the deferred-tool BM25 index during session initialization and before each sampling continuation, even when the searchable tool metadata is unchanged. Local profiling measured append_tool_search_executor at roughly 113 ms per continuation, making repeated index construction the largest measured router-building cost.

What changed

  • Add a session-scoped ToolSearchHandlerCache so continuations and user turns can reuse the existing handler.
  • Key reuse on the complete ordered Vec<ToolSearchInfo>, rebuilding when searchable text, loadable tool specs, source metadata, or ordering changes.
  • Build handlers outside the cache lock and recheck before publishing them, avoiding holding the mutex during index construction.

Verification

  • cache_reuses_identical_search_infos_and_rebuilds_changed_inputs covers exact cache reuse and invalidation when the ordered search metadata changes.
  • Local rollout profiling showed the initial router build populating the cache and unchanged later continuations reusing it:
    • uncached: 118 ms median across 14 spans from 3 rollouts
    • cached: 4 ms median across 12 spans from 3 rollouts

@mchen-oai mchen-oai force-pushed the mchen/tool-search-handler-cache branch 6 times, most recently from cc03149 to 46f0884 Compare June 10, 2026 00:48
@mchen-oai mchen-oai changed the title Cache tool search handler across sampling continuations Cache tool search handler within session Jun 10, 2026
@mchen-oai mchen-oai marked this pull request as ready for review June 10, 2026 15:46
@mchen-oai mchen-oai requested a review from a team as a code owner June 10, 2026 15:46
@mchen-oai mchen-oai requested a review from owenlin0 June 10, 2026 15:46
@mchen-oai mchen-oai requested review from jif-oai and pakrym-oai June 10, 2026 15:48
@mchen-oai mchen-oai assigned pakrym-oai and jif-oai and unassigned owenlin0 Jun 10, 2026
@mchen-oai mchen-oai changed the title Cache tool search handler within session Cache tool search handler within sessions Jun 10, 2026
@jif-oai jif-oai changed the title Cache tool search handler within sessions core: cache the tool search handler per session Jun 10, 2026
@mchen-oai mchen-oai force-pushed the mchen/tool-search-handler-cache branch 2 times, most recently from b8f1cab to 8b7acd6 Compare June 11, 2026 14:17
}
}

let handler = Arc::new(ToolSearchHandler::new(search_infos.clone()));

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a fingerprint instead of storing a whole copy of the original search_infos?

@mchen-oai mchen-oai Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mzeng-openai - To my understanding, you will drive broader tool caching (Slack). The broader approach may eliminate the need for this PR, unless it caches at multiple levels. What do you think though and what is your timeline? Are you thinking I proceed with this PR for initial improvement?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do some research and let you know. In the meanwhile this seems like a viable interim solution to unblock the launch since basically half of the latency comes from BM25 index. But would love to get thoughts from core agent folks

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated this so ToolSearchHandler owns the original ordered search_infos, and the cache compares directly against the handler's actual inputs. This removes the search_infos copy and avoids a separate derived cache key. Still open to the fingerprint approach if others feel strongly though.

@mchen-oai mchen-oai force-pushed the mchen/tool-search-handler-cache branch 2 times, most recently from 16a6ece to 95dbc37 Compare June 12, 2026 15:24
@mchen-oai mchen-oai requested a review from mzeng-openai June 12, 2026 19:18
@mchen-oai mchen-oai force-pushed the mchen/tool-search-handler-cache branch 2 times, most recently from 4cd9e95 to c84f5f5 Compare June 15, 2026 14:54
@mchen-oai mchen-oai force-pushed the mchen/tool-search-handler-cache branch from c84f5f5 to 3e503cc Compare June 15, 2026 17:47
@mchen-oai mchen-oai merged commit af99f6a into main Jun 15, 2026
31 checks passed
@mchen-oai mchen-oai deleted the mchen/tool-search-handler-cache branch June 15, 2026 21:48
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 15, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants