feat: use dynamic token limits from listAvailableModels API #2539
Conversation
Codecov Report

Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #2539      +/-   ##
==========================================
+ Coverage   60.09%   60.16%   +0.07%
==========================================
  Files         276      277       +1
  Lines       65009    65135     +126
  Branches     4105     4112       +7
==========================================
+ Hits        39068    39190     +122
- Misses      25858    25862       +4
  Partials       83       83
const selectedModel = models.find(model => model.id === selectedModelId)
const maxInputTokens = TokenLimitsCalculator.extractMaxInputTokens(selectedModel)
const tokenLimits = TokenLimitsCalculator.calculate(maxInputTokens)
session.setTokenLimits(tokenLimits)
this.#log(
    `Token limits calculated for initial model selection (${selectedModelId}): ${JSON.stringify(tokenLimits)}`
)
Seems like it would be cleaner to modify the session to encapsulate more details about the model as a single entity (i.e., model id and token limits) than to add another parameter that callers will need to remember to set.
Fair, will address
Updated
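As a rough illustration of the direction discussed above, here is a minimal TypeScript sketch (not the PR's actual code) in which the session owns the selected model id and its derived token limits as a single entity, so callers make one call instead of two that must be kept in sync. The names ChatSession, SelectedModelInfo, setSelectedModel, and maxUserPromptChars are hypothetical; only modelId, tokenLimits, and maxInputTokens appear in the PR itself.

```typescript
interface TokenLimits {
    maxInputTokens: number
    maxUserPromptChars: number
}

interface SelectedModelInfo {
    modelId: string
    tokenLimits: TokenLimits
}

class ChatSession {
    // Model id and its token limits live together, so they cannot drift apart.
    #modelInfo: SelectedModelInfo | undefined = undefined

    // Hypothetical single entry point replacing separate model-id / token-limit setters.
    setSelectedModel(modelId: string, tokenLimits: TokenLimits): void {
        this.#modelInfo = { modelId, tokenLimits }
    }

    get tokenLimits(): TokenLimits | undefined {
        return this.#modelInfo?.tokenLimits
    }
}

// Usage at model-selection time (values are illustrative):
const session = new ChatSession()
session.setSelectedModel('anthropic.claude-sonnet-4-5', {
    maxInputTokens: 1_000_000,
    maxUserPromptChars: 3_000_000,
})
console.log(session.tokenLimits)
```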
Problem
We use hardcoded constants for LLM token/character limits.
These static values do not adapt to different model capabilities: models with larger context windows (e.g. Sonnet 4.5 with a 1M-token window) can't utilize their full capacity, and models with smaller windows may exceed their limits.
Additionally, when users switch models mid-session, the token limits remain unchanged from the initial model selection.
Solution
- Dynamically calculate character limits based on maxInputTokens from the listAvailableModels API response
- Token limits are now stored per-session and recalculated on the initial model selection and whenever the user switches models mid-session
- Updated compaction to use dynamic limits from the session
- Removed deprecated static constants
- Added tokenLimits to fallback models
- Updated dependencies
Calculations:
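The PR's exact formulas are not reproduced here; the following is a hedged TypeScript sketch of how character limits could be derived from maxInputTokens, assuming a chars-per-token heuristic and a fallback default. TokenLimitsCalculator.extractMaxInputTokens and calculate are the names used in the diff; APPROX_CHARS_PER_TOKEN, DEFAULT_MAX_INPUT_TOKENS, maxUserPromptChars, and the model.tokenLimits.maxInputTokens field shape are illustrative assumptions, not the PR's actual values.

```typescript
const APPROX_CHARS_PER_TOKEN = 3         // assumption: rough chars-per-token ratio
const DEFAULT_MAX_INPUT_TOKENS = 100_000 // assumption: fallback when the API omits the field

interface TokenLimits {
    maxInputTokens: number
    maxUserPromptChars: number
}

class TokenLimitsCalculator {
    // Prefer the limit advertised by listAvailableModels; fall back when absent.
    static extractMaxInputTokens(model?: { tokenLimits?: { maxInputTokens?: number } }): number {
        return model?.tokenLimits?.maxInputTokens ?? DEFAULT_MAX_INPUT_TOKENS
    }

    // Convert the token budget into the character budget used by the chat pipeline.
    static calculate(maxInputTokens: number): TokenLimits {
        return {
            maxInputTokens,
            maxUserPromptChars: maxInputTokens * APPROX_CHARS_PER_TOKEN,
        }
    }
}

// A model reporting a 1M-token window (e.g. Sonnet 4.5) gets a proportionally
// larger character budget than the old hardcoded constants allowed.
const limits = TokenLimitsCalculator.calculate(1_000_000)
console.log(limits.maxUserPromptChars) // 3000000 under these assumed constants
```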
License
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.