
feat: enabling output_schema and tools to coexist #1062

Open
copybara-service[bot] wants to merge 1 commit into main from test_886133734

Conversation

@copybara-service

feat: enabling output_schema and tools to coexist

This CL enables the simultaneous use of `output_schema` (structured output) and `tools` for models that do not natively support both features at once (specifically Gemini 1.x and 2.x on Vertex AI).

### Core Logic
The CL implements a workaround for models with this limitation:
1.  **Synthetic Tooling**: Instead of passing the `output_schema` directly to the model's configuration, it introduces a synthetic tool called `set_model_response`.
2.  **Schema Injection**: The parameters of this tool are set to the requested `output_schema`.
3.  **Instruction Prompting**: System instructions are appended, directing the model to provide its final response using this specific tool in the required format.
4.  **Response Interception**: The `BaseLlmFlow` is updated to check if `set_model_response` was called. If so, it extracts the JSON arguments and converts them into a standard model response event.
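
The four steps above can be sketched in plain Java, with `Map`s standing in for the real ADK request/response types. Every class and method name below is illustrative only, not the actual API of this CL:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Minimal sketch of the output_schema-plus-tools workaround. Assumption:
 * tool declarations and schemas are represented here as plain Maps; the
 * real CL uses the ADK's own types.
 */
public class OutputSchemaWorkaroundSketch {

  static final String SYNTHETIC_TOOL_NAME = "set_model_response";

  /** Steps 1 + 2: build a synthetic tool whose parameters ARE the output schema. */
  static Map<String, Object> buildSyntheticTool(Map<String, Object> outputSchema) {
    Map<String, Object> tool = new LinkedHashMap<>();
    tool.put("name", SYNTHETIC_TOOL_NAME);
    tool.put("description", "Call this tool to provide your final answer.");
    tool.put("parameters", outputSchema); // schema injection
    return tool;
  }

  /** Step 3: instruction appended to the system prompt (hypothetical wording). */
  static String responseInstruction() {
    return "Provide your final response by calling " + SYNTHETIC_TOOL_NAME
        + " with arguments that match its parameter schema.";
  }

  /** Step 4: if the model called the synthetic tool, its arguments become the
   *  structured response; any other tool call is dispatched normally. */
  static Map<String, Object> interceptToolCall(String toolName, Map<String, Object> args) {
    if (SYNTHETIC_TOOL_NAME.equals(toolName)) {
      return args; // the arguments are the final structured output
    }
    return null; // not the synthetic tool: let the regular flow handle it
  }

  public static void main(String[] args) {
    Map<String, Object> schema = new LinkedHashMap<>();
    schema.put("type", "object");
    schema.put("properties", Map.of("answer", Map.of("type", "string")));

    Map<String, Object> tool = buildSyntheticTool(schema);
    System.out.println(tool.get("name")); // set_model_response

    Map<String, Object> response =
        interceptToolCall(SYNTHETIC_TOOL_NAME, Map.of("answer", "42"));
    System.out.println(response.get("answer")); // 42
  }
}
```

The key design point is that the model never sees a native `output_schema` at all; it only sees one extra tool, so models that reject the schema/tools combination still work.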

### Key Changes
*   **`OutputSchema.java` (New)**: A new `RequestProcessor` that detects when the workaround is needed, adds the `SetModelResponseTool`, and provides utilities for extracting the structured response.
*   **`SetModelResponseTool.java` (New)**: A marker tool that simply returns its input arguments, used to "capture" the structured output from the model.
*   **`ModelNameUtils.java`**: Added logic to identify Gemini 1.x and 2.x models and determine if they can handle native `output_schema` alongside tools.
*   **`BaseLlmFlow.java`**: Updated the flow logic to detect the synthetic tool response and generate the final output event.
*   **`Basic.java`**: Updated to prevent native `outputSchema` configuration when the workaround is active.
*   **`SingleFlow.java`**: Registered the new `OutputSchema` processor.
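
A "marker tool that simply returns its input arguments", as described for `SetModelResponseTool.java`, can be sketched as follows; the class and method names here are hypothetical stand-ins, not the ADK's actual tool interface:

```java
import java.util.Map;

/**
 * Sketch of a marker tool: it performs no real work and echoes its
 * arguments back unchanged. The flow recognizes a call to this tool
 * and treats the echoed arguments as the model's structured output.
 */
public class SetModelResponseToolSketch {

  /** The name the model is instructed to call. */
  public String name() {
    return "set_model_response";
  }

  /** Echo the arguments: the "call" itself carries the structured response. */
  public Map<String, Object> run(Map<String, Object> args) {
    return args;
  }

  public static void main(String[] args) {
    SetModelResponseToolSketch tool = new SetModelResponseToolSketch();
    Map<String, Object> out = tool.run(Map.of("status", "ok"));
    System.out.println(out.get("status")); // ok
  }
}
```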

PiperOrigin-RevId: 886133734