
Conversation


chungjac (Contributor) commented Dec 2, 2025

Problem

We use hardcoded constants for LLM token/character limits.

These static values do not adapt to different model capabilities, so models with larger context windows (e.g., Sonnet 4.5's 1M-token window) can't use their full capacity, while models with smaller windows may exceed their limits.

Additionally, when users switch models mid-session, the token limits remain unchanged from the initial model selection.

Solution

• Dynamically calculate character limits based on maxInputTokens from the listAvailableModels API response.
• Token limits are now stored per-session and recalculated when:
    • models are initially loaded
    • the user switches models
• Updated compaction to use dynamic limits from the session (see the sketch after this list).
• Removed deprecated static constants.
• Added tokenLimits to fallback models.
• Updated dependencies.
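
A rough sketch of the compaction change: only compactionThreshold and the idea of reading limits from the session come from this PR; the interface and function names below are assumptions for illustration.

interface SessionWithLimits {
    // Assumed accessor; the PR stores limits on the session via setTokenLimits.
    getTokenLimits(): { compactionThreshold: number }
}

// Decide whether to compact using the session's dynamic threshold,
// which is recalculated whenever the model changes.
function shouldCompact(session: SessionWithLimits, historyCharacters: number): boolean {
    return historyCharacters > session.getTokenLimits().compactionThreshold
}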

Calculations (see the sketch below):

  • maxOverallCharacters = maxInputTokens * 3.5
  • inputLimit = 0.7 * maxOverallCharacters
  • compactionThreshold = 0.7 * maxOverallCharacters (kept as a variable separate from inputLimit in case we want to tune these thresholds independently later)
  • Default fallback: 200,000 tokens if the API doesn't return token limits
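
A minimal sketch of the calculator implementing the numbers above, assuming the TokenLimitsCalculator entry points quoted in the review thread below; the TokenLimits field names, constant names, and model shape are illustrative:

interface TokenLimits {
    maxOverallCharacters: number
    inputLimit: number
    compactionThreshold: number
}

const CHARS_PER_TOKEN = 3.5
const DEFAULT_MAX_INPUT_TOKENS = 200_000 // fallback when the API omits token limits

class TokenLimitsCalculator {
    // Read maxInputTokens off the model, falling back to the default.
    static extractMaxInputTokens(model?: { tokenLimits?: { maxInputTokens?: number } }): number {
        return model?.tokenLimits?.maxInputTokens ?? DEFAULT_MAX_INPUT_TOKENS
    }

    static calculate(maxInputTokens: number): TokenLimits {
        const maxOverallCharacters = maxInputTokens * CHARS_PER_TOKEN
        return {
            maxOverallCharacters,
            inputLimit: 0.7 * maxOverallCharacters,
            // Same value as inputLimit today; kept separate so the two
            // thresholds can be tuned independently later.
            compactionThreshold: 0.7 * maxOverallCharacters,
        }
    }
}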

License

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

chungjac requested a review from a team as a code owner on December 2, 2025 at 20:14
codecov-commenter commented Dec 2, 2025

Codecov Report

❌ Patch coverage is 93.75000% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.16%. Comparing base (d8c5c16) to head (29b84f0).

Files with missing lines | Patch % | Lines
...nguage-server/agenticChat/agenticChatController.ts | 82.45% | 9 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2539      +/-   ##
==========================================
+ Coverage   60.09%   60.16%   +0.07%     
==========================================
  Files         276      277       +1     
  Lines       65009    65135     +126     
  Branches     4105     4112       +7     
==========================================
+ Hits        39068    39190     +122     
- Misses      25858    25862       +4     
  Partials       83       83              
Flag | Coverage Δ
unittests | 60.16% <93.75%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown.


Comment on lines 835 to 841
const selectedModel = models.find(model => model.id === selectedModelId)
const maxInputTokens = TokenLimitsCalculator.extractMaxInputTokens(selectedModel)
const tokenLimits = TokenLimitsCalculator.calculate(maxInputTokens)
session.setTokenLimits(tokenLimits)
this.#log(
`Token limits calculated for initial model selection (${selectedModelId}): ${JSON.stringify(tokenLimits)}`
)
Contributor:

seems like it would be cleaner to modify the session to encapsulate more details about the model as a single entity (i.e. model id and token limits) than to add another parameter that callers will need to remember to set
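
A hypothetical shape of that encapsulation, reusing the TokenLimits and TokenLimitsCalculator sketch from the description above; every name here is illustrative rather than taken from the PR:

interface SelectedModel {
    id: string
    tokenLimits: TokenLimits
}

class ChatSession {
    #selectedModel?: SelectedModel

    // A single setter keeps the model id and its derived limits in sync,
    // so callers can't update one without the other.
    setSelectedModel(id: string, model?: { tokenLimits?: { maxInputTokens?: number } }): void {
        const maxInputTokens = TokenLimitsCalculator.extractMaxInputTokens(model)
        this.#selectedModel = { id, tokenLimits: TokenLimitsCalculator.calculate(maxInputTokens) }
    }

    get tokenLimits(): TokenLimits | undefined {
        return this.#selectedModel?.tokenLimits
    }
}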

Contributor Author:

Fair, will address

Contributor Author:

Updated

chungjac merged commit f87ac9f into aws:main on Dec 5, 2025
6 checks passed
