
CompactionProcessor never triggers for Claude models — tiktoken underestimates by 40%+ #975

@pmella


Environment

  • @github/copilot-sdk: 0.1.32
  • @github/copilot (CLI): 1.0.6

Description

The CompactionProcessor in the Copilot CLI (app.js) estimates context utilization using tiktoken with a model-specific correction factor (the MEs map: claude-opus-4.5: 1.15). This factor is far too low for Claude models: in production testing, the SDK's estimated token count stayed below the 80% background-compaction threshold while the actual Anthropic API input tokens exceeded 200k, producing a hard 400 invalid_request_error: prompt is too long.

Evidence from a live session

  • At 150k actual Anthropic input_tokens, the CompactionProcessor had not triggered
  • At 200k actual tokens (hard API limit), zero compaction events had been emitted
  • The session.compaction_start event was never fired across 9+ queries
  • When we manually called session.rpc.compaction.compact() at 150k real tokens, the SDK reported conversationTokens: 160,471 internally — showing the SDK's own estimate was closer to reality than expected, but its threshold still wasn't reached

Impact

Sessions using Claude models become permanently stuck once context exceeds 200k tokens. Subsequent queries add to the context rather than reducing it, making recovery impossible without starting a new conversation.

Expected behavior

CompactionProcessor should trigger background compaction before hitting the model's context limit, regardless of provider.

Suggested fix options

  1. Use Anthropic's /v1/messages/count_tokens endpoint for accurate Claude token counting
  2. Increase the correction factor from 1.15 to at least 1.40 — overestimation is safe (triggers compaction earlier), underestimation is dangerous (never triggers)
  3. Use the actual input_tokens from Anthropic API usage responses to calibrate the estimate at runtime

Current workaround

We call session.rpc.compaction.compact() manually after each query when real input_tokens (from assistant.usage events) exceed 120k tokens (60% of Claude's 200k limit).
