fix: balance callback lifecycle for hallucinated tool calls #4808

Open
giulio-leone wants to merge 1 commit into google:main from giulio-leone:fix/hallucinated-tool-callback-lifecycle

@giulio-leone

Summary

Fixes #4775

When an LLM hallucinates a tool name that doesn't exist in tools_dict, _get_tool() raises ValueError. Previously, on_tool_error_callback fired immediately — before before_tool_callback and outside the OpenTelemetry tracer span. This caused plugins that push/pop spans (e.g. BigQueryAgentAnalyticsPlugin's TraceManager) to pop the parent agent span, corrupting the trace stack for all subsequent tool calls in the session.

Root Cause

The callback lifecycle contract is:

before_tool_callback → (tool execution OR on_tool_error_callback) → after_tool_callback

For hallucinated tools, the old code path was:

on_tool_error_callback → (return or raise)  # before_tool_callback never called!

This violated the lifecycle invariant. A plugin that pushes a span in before_tool_callback never got the chance to push, yet on_tool_error_callback still popped, so the pop removed the parent agent span instead of a tool span and corrupted the stack.
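The imbalance can be reproduced in a few lines. The sketch below is a simplified model, not ADK code: `SpanStackPlugin` and `old_flow` are hypothetical names, and the callback signatures are reduced to the minimum needed to show the push/pop mismatch.

```python
class SpanStackPlugin:
    """Hypothetical stand-in for a plugin that tracks spans on a stack."""

    def __init__(self):
        # Assume the parent agent span was pushed when the agent started.
        self.stack = ["agent_span"]

    def before_tool_callback(self, tool_name):
        self.stack.append(f"tool_span:{tool_name}")  # push

    def on_tool_error_callback(self, tool_name, error):
        self.stack.pop()  # pop, assuming a matching push happened earlier


def old_flow(plugin, tools_dict, tool_name):
    """Models the old ordering: a failed lookup reports the error before any push."""
    if tool_name not in tools_dict:
        plugin.on_tool_error_callback(tool_name, ValueError(tool_name))
        return None
    plugin.before_tool_callback(tool_name)
    result = tools_dict[tool_name]()
    plugin.stack.pop()  # stand-in for the pop done after a successful call
    return result


plugin = SpanStackPlugin()
old_flow(plugin, {"real_tool": lambda: "ok"}, "made_up_tool")
print(plugin.stack)  # prints []: the parent agent span was popped by mistake
```

Once the parent span is gone, every later push and pop in the same session operates on a stack that no longer matches the real call hierarchy.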

Fix

Move the ValueError handling inside _run_with_trace() so that:

  1. before_tool_callback always fires first (balanced push)
  2. The error is surfaced within the OTel span context
  3. on_tool_error_callback fires after before_tool_callback

Applied to both handle_function_calls_async and handle_function_calls_live code paths.
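The reordered flow can be sketched as follows. This is an illustrative model under stated assumptions, not the actual `_run_with_trace()` implementation: `RecordingPlugin` and `run_tool_fixed` are hypothetical names, the OTel span is omitted, and the error callback is assumed to return a replacement response.

```python
class RecordingPlugin:
    """Hypothetical plugin that records callback order and tracks a span stack."""

    def __init__(self):
        self.stack = ["agent_span"]  # parent span, assumed already pushed
        self.events = []

    def before_tool_callback(self, tool_name):
        self.events.append("before")
        self.stack.append(f"tool_span:{tool_name}")  # push

    def after_tool_callback(self, tool_name):
        self.events.append("after")
        self.stack.pop()  # pop on success

    def on_tool_error_callback(self, tool_name, error):
        self.events.append("error")
        self.stack.pop()  # pop on failure, now matched by the earlier push
        return {"error": str(error)}


def run_tool_fixed(plugin, tools_dict, tool_name):
    """Models the fixed ordering: the lookup failure is raised after the push."""
    plugin.before_tool_callback(tool_name)
    try:
        if tool_name not in tools_dict:
            raise ValueError(f"Tool '{tool_name}' not found")
        result = tools_dict[tool_name]()
    except ValueError as error:
        return plugin.on_tool_error_callback(tool_name, error)
    plugin.after_tool_callback(tool_name)
    return result


plugin = RecordingPlugin()
response = run_tool_fixed(plugin, {"real_tool": lambda: "ok"}, "made_up_tool")
print(plugin.events)  # prints ['before', 'error']
print(plugin.stack)   # prints ['agent_span']
```

Because every pop is now preceded by a push, the parent agent span survives a hallucinated tool call.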

Testing

  • 2 new tests in test_plugin_tool_callbacks.py:
    • test_hallucinated_tool_fires_before_and_error_callbacks: Verifies callback order (before → error)
    • test_hallucinated_tool_raises_when_no_error_callback: Verifies ValueError propagates correctly
  • All 12 callback tests pass
  • Full unit test suite: 4727 passed, 0 regressions

When an LLM hallucinates a tool name, _get_tool() raises ValueError.
Previously, on_tool_error_callback fired immediately — before
before_tool_callback and outside the OTel tracer span.  This caused
plugins that push/pop spans (e.g. BigQueryAgentAnalyticsPlugin's
TraceManager) to pop the parent agent span, corrupting the trace
stack for all subsequent tool calls.

Move the ValueError handling inside _run_with_trace() so that:
1. before_tool_callback always fires first (balanced push)
2. The error is surfaced within the OTel span context
3. on_tool_error_callback fires after before_tool_callback

Applied to both handle_function_calls_async and
handle_function_calls_live code paths.

Fixes google#4775

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@google-cla

google-cla bot commented Mar 13, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@adk-bot adk-bot added the tracing [Component] This issue is related to OpenTelemetry tracing label Mar 13, 2026
@adk-bot
Collaborator

adk-bot commented Mar 13, 2026

Response from ADK Triaging Agent

Hello @giulio-leone, thank you for your contribution!

It looks like the Contributor License Agreement (CLA) check is failing. Before we can merge your PR, you'll need to sign the CLA. You can find more information and sign the agreement at https://cla.developers.google.com/.

Thanks!

@rohityan
Collaborator

Hi @giulio-leone, thank you for your contribution! It appears you haven't yet signed the Contributor License Agreement (CLA). Please visit https://cla.developers.google.com/ to complete the signing process. Once the CLA is signed, we'll be able to proceed with the review of your PR. Thank you!

@rohityan rohityan self-assigned this Mar 13, 2026
@rohityan rohityan added the request clarification [Status] The maintainer need clarification or more information from the author label Mar 13, 2026
Labels

request clarification [Status] The maintainer need clarification or more information from the author tracing [Component] This issue is related to OpenTelemetry tracing

Development

Successfully merging this pull request may close these issues.

Unbalanced tool lifecycle callbacks for hallucinated tools cause TraceManager stack corruption in plugins

3 participants