Replies: 1 comment
You could even do it without official tool calling (i.e. pull shell calls out of the plain-text reply), which I have found works very quickly for small models (obviously not great for large contexts); not having tool-call schema/output requirements can speed things up even more.
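As an illustration of the schema-free approach the comment describes, here is a minimal sketch: scan the model's free-form reply for shell commands instead of parsing a structured tool-call payload. The `$ ` prefix convention and the function name are assumptions for the example, not goose's actual implementation.

```python
def extract_shell_calls(text: str) -> list[str]:
    """Pull shell commands out of free-form model output.

    Hypothetical convention: the small model is prompted to prefix
    each command it wants run with "$ " on its own line.
    """
    return [line[2:].strip()
            for line in text.splitlines()
            if line.startswith("$ ")]
```

Skipping the tool-call schema means the small model spends no tokens emitting JSON structure, which is part of why this can be fast.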
Overview
Add a fast-path execution mode that races a lightweight model against the full agent for simple queries in autonomous mode. When the fast model can handle a request, users get near-instant responses without waiting for the full agent machinery.
Motivation
Simple requests like `ls`, `cat file.txt`, "hello", etc. don't need the full agent machinery.
Proposed Behavior
When Fast Path Activates
The fast path uses the `complete_fast()` provider method (already exists).
Fast Path Rules
The fast model either answers directly or replies "<<PASS>>"; prior tool calls appear in its context as "<tool calls omitted>" placeholders. On "<<PASS>>" → hand off to the full agent.
Racing Logic
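A minimal sketch of the racing logic, under stated assumptions: `fast_call` and `full_call` are hypothetical coroutine factories standing in for the two model requests (not goose's actual API), and `<<PASS>>` is the hand-off sentinel described above.

```python
import asyncio

PASS = "<<PASS>>"  # sentinel the fast model emits when it can't handle the request

async def race(fast_call, full_call, fast_timeout: float = 2.0):
    """Start both models; prefer the fast answer unless it passes or times out.

    The full-agent task is started immediately so no latency is lost
    if the fast model ends up deferring.
    """
    full_task = asyncio.create_task(full_call())
    try:
        fast_answer = await asyncio.wait_for(fast_call(), timeout=fast_timeout)
    except asyncio.TimeoutError:
        fast_answer = PASS
    if fast_answer != PASS:
        full_task.cancel()   # saves the slow model's output tokens
        return fast_answer
    return await full_task   # hand off to the full agent
```

Cancelling `full_task` as soon as the fast model wins is what realizes the cost savings described below: the slow request's input tokens are already spent, but its (more expensive) output generation stops.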
Cost/Performance Profile
If the fast model wins, we cancel the slow request: we are still charged for its input tokens, but not its output tokens, which are typically ~5x more expensive. If the fast model loses, we have only added the cost of the fast request, whose input tokens are ~25x cheaper than the slow model's output tokens. With the reduced number of input tokens, we should be able to make interactions with goose both faster and cheaper.
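A back-of-envelope check of those ratios. The prices are made-up units: only the 5x output/input ratio and the 25x fast-input/slow-output ratio come from the paragraph above, and the fast model's own output price is an extra assumption.

```python
# Illustrative prices per token (arbitrary units).
SLOW_IN = 1.0
SLOW_OUT = 5.0              # output ~5x input price, per the text
FAST_IN = SLOW_OUT / 25     # fast input ~25x cheaper than slow output
FAST_OUT = FAST_IN * 5      # assumption: fast model has the same 5x ratio

def cost(n_in, n_out, p_in, p_out):
    return n_in * p_in + n_out * p_out

n_in, n_out = 1000, 500     # example request size
slow_only = cost(n_in, n_out, SLOW_IN, SLOW_OUT)
# Fast model wins: pay the fast request in full, plus the slow model's
# input tokens (its output was cancelled before generation finished).
fast_wins = cost(n_in, n_out, FAST_IN, FAST_OUT) + n_in * SLOW_IN
# Fast model loses: pay both requests in full.
fast_loses = slow_only + cost(n_in, n_out, FAST_IN, FAST_OUT)
```

Under these numbers a fast-path win costs roughly half the baseline, while a loss adds about 20% overhead, so the race pays off whenever the fast model wins even a modest fraction of the time.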