gracefully fall back when model chat template doesn't support tools#771
gracefully fall back when model chat template doesn't support tools#771ericcurtin wants to merge 1 commit intomainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces robust tool-calling capabilities and a graceful fallback mechanism for models that do not support tool-specific chat templates, such as those using Jinja. The changes include new data structures for OpenAI-compatible tool definitions and calls, an agentic loop in the ChatWithMessagesContext function to handle tool execution, and a retry mechanism that falls back to a non-tool-enabled chat if a Jinja template error is detected. Additionally, a WebSearchTool is implemented, demonstrating how external services can be integrated as tools. Comprehensive unit tests have been added to validate the fallback logic under various error conditions, ensuring the stability and reliability of the new features.
There was a problem hiding this comment.
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="cmd/cli/tools/websearch.go" line_range="102-109" />
<code_context>
+ return "", fmt.Errorf("reading response: %w", err)
+ }
+
+ // Response may be SSE or plain JSON — handle both.
+ responseText := string(body)
+ for _, line := range strings.Split(responseText, "\n") {
+ if strings.HasPrefix(line, "data: ") {
+ responseText = strings.TrimPrefix(line, "data: ")
</code_context>
<issue_to_address>
**suggestion (bug_risk):** SSE handling only uses the first `data:` line, which can truncate multi-chunk responses.
This loop stops at the first `data:` line and discards any subsequent chunks, so multi-part SSE messages will be parsed using only the first fragment, causing malformed JSON or truncated content. Consider accumulating all `data:` lines into a single buffer (e.g., joined with `
`) before unmarshaling, or using a proper SSE parser for `text/event-stream` responses.
```suggestion
// Response may be SSE or plain JSON — handle both.
responseText := string(body)
// For SSE (`text/event-stream`), accumulate all `data:` lines into a single payload.
var dataLines []string
for _, line := range strings.Split(responseText, "\n") {
if strings.HasPrefix(line, "data: ") {
dataLines = append(dataLines, strings.TrimPrefix(line, "data: "))
}
}
if len(dataLines) > 0 {
responseText = strings.Join(dataLines, "\n")
}
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Models like gemma3 use Jinja chat templates that only support alternating
user/assistant roles and cannot handle the tool/assistant(tool_calls) message
roles introduced by the agentic web-search loop.
When a 500 response containing "Jinja" is detected, the client now resets the
message history to the original user prompt, clears the tool schemas, and
retries — allowing the model to answer from training data rather than erroring
out. This covers two failure modes:
1. The model rejects the request immediately because the tools parameter
causes its template to inject an incompatible system message.
2. The model successfully emits a tool_call but then fails on the follow-up
request because it can't process the resulting "tool" role message.
Adds three unit tests covering the fallback paths and the non-Jinja error case.
Signed-off-by: Eric Curtin <eric.curtin@docker.com>
9f29adf to
8aab0f4
Compare
Models like gemma3 use Jinja chat templates that only support alternating user/assistant roles and cannot handle the tool/assistant(tool_calls) message roles introduced by the agentic web-search loop.
When a 500 response containing "Jinja" is detected, the client now resets the message history to the original user prompt, clears the tool schemas, and retries — allowing the model to answer from training data rather than erroring out. This covers two failure modes:
Adds three unit tests covering the fallback paths and the non-Jinja error case.