
Add token limit fields to ProviderConfig across all SDKs#966

Open
MackinnonBuck wants to merge 5 commits into main from mackinnonbuck/byok-provider-token-limits

Conversation

@MackinnonBuck
Collaborator

Summary

Adds four optional token-limit fields to ProviderConfig across all four language SDKs, enabling BYOK users to configure token limits for custom providers. These fields match the runtime's ProviderConfig type updated in PR #5311.

New fields

| Wire name | Node.js | Python | .NET | Go |
| --- | --- | --- | --- | --- |
| `maxOutputTokens` | `maxOutputTokens?: number` | `max_output_tokens: int` | `int? MaxOutputTokens` | `MaxOutputTokens int` |
| `maxPromptTokens` | `maxPromptTokens?: number` | `max_prompt_tokens: int` | `int? MaxPromptTokens` | `MaxPromptTokens int` |
| `maxContextWindowTokens` | `maxContextWindowTokens?: number` | `max_context_window_tokens: int` | `int? MaxContextWindowTokens` | `MaxContextWindowTokens int` |
| `modelLimitsId` | `modelLimitsId?: string` | `model_limits_id: str` | `string? ModelLimitsId` | `ModelLimitsId string` |

All fields are optional. The Max* fields override the default limits resolved from the model's capability catalog entry. modelLimitsId specifies an alternate model ID for capability catalog lookup (defaults to the session's configured model ID when unset) — useful for fine-tuned models that share limits with a base model.
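
As a sketch of how these fields might be supplied, here is a hypothetical Python SDK usage example (the field spellings come from the table above; the `base_url` key and all values shown are illustrative assumptions, not taken from this PR):

```python
# Hypothetical usage sketch: a BYOK provider config using the new optional
# token-limit fields (Python SDK snake_case spellings from the table above).
provider_config = {
    "base_url": "https://llm.example.com/v1",  # illustrative; key name assumed
    "max_output_tokens": 8192,       # per-request cap on generated tokens
    "max_prompt_tokens": 120_000,    # prompt budget before truncation/compaction
    "max_context_window_tokens": 128_000,
    # Resolve limits from a base model's catalog entry, e.g. for a fine-tune:
    "model_limits_id": "gpt-4o",     # illustrative model ID
}
```

Since all four fields are optional, any subset (or none) may be set.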

Changes

  • Node.js (nodejs/src/types.ts) — Added fields to ProviderConfig interface
  • Python (python/copilot/session.py) — Added fields to ProviderConfig TypedDict
  • Python (python/copilot/client.py) — Updated _convert_provider_to_wire_format to map new snake_case fields to camelCase
  • .NET (dotnet/src/Types.cs) — Added nullable properties with [JsonPropertyName] attributes
  • Go (go/types.go) — Added fields with json:"...,omitempty" tags
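
The Python conversion step listed above can be sketched as follows. This is a minimal illustration of the snake_case to camelCase mapping only; the actual `_convert_provider_to_wire_format` in `python/copilot/client.py` likely handles additional fields and may be structured differently:

```python
# Sketch of the snake_case -> camelCase wire mapping described above.
# Only the four new fields are shown; unknown keys pass through unchanged.
_WIRE_NAMES = {
    "max_output_tokens": "maxOutputTokens",
    "max_prompt_tokens": "maxPromptTokens",
    "max_context_window_tokens": "maxContextWindowTokens",
    "model_limits_id": "modelLimitsId",
}

def to_wire_format(provider: dict) -> dict:
    """Rename known snake_case keys to their camelCase JSON-RPC wire names."""
    return {_WIRE_NAMES.get(key, key): value for key, value in provider.items()}
```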

Testing

All four SDKs build successfully. No generated RPC types needed updating — ProviderConfig is defined directly in each SDK.

Add maxOutputTokens, maxPromptTokens, maxContextWindowTokens, and
modelLimitsId to ProviderConfig in Node.js, Python, .NET, and Go SDKs.
These optional fields allow BYOK users to configure token limits for
custom providers, matching the runtime's ProviderConfig (PR #5311).

Also update the Python wire format conversion to map the new snake_case
fields to camelCase for the JSON-RPC wire protocol.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 31, 2026 17:48
@MackinnonBuck MackinnonBuck requested a review from a team as a code owner March 31, 2026 17:48
Contributor

Copilot AI left a comment


Pull request overview

This PR adds four optional token-limit configuration fields to ProviderConfig across the Node.js, Python, Go, and .NET SDKs so BYOK/custom-provider users can override model token limits and/or specify an alternate model ID for capability-catalog limit lookup.

Changes:

  • Added maxOutputTokens, maxPromptTokens, maxContextWindowTokens, and modelLimitsId to ProviderConfig in Node.js, Go, and .NET.
  • Added corresponding snake_case fields to Python ProviderConfig and mapped them to camelCase wire keys in the Python client.
  • Documented the behavior/precedence of these fields via inline comments/XML docs.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| nodejs/src/types.ts | Extends the TS ProviderConfig interface with optional token-limit fields and modelLimitsId. |
| python/copilot/session.py | Extends the Python ProviderConfig TypedDict with optional token-limit and model-limit lookup fields. |
| python/copilot/client.py | Maps the new snake_case Python fields onto the expected camelCase wire names. |
| go/types.go | Extends the Go ProviderConfig struct with new JSON fields (omitempty). |
| dotnet/src/Types.cs | Extends the .NET ProviderConfig with nullable properties and JSON property names. |

Fixes Go naming to follow initialism convention (ID not Id), consistent
with existing fields like APIKey, BaseURL, and APIVersion.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor

✅ Cross-SDK Consistency Review

This PR maintains excellent consistency across all four SDK implementations. I've verified the following:

✅ Feature Parity

All four SDKs (Node.js/TypeScript, Python, Go, and .NET) now include the same four fields:

  • maxOutputTokens / max_output_tokens / MaxOutputTokens
  • maxPromptTokens / max_prompt_tokens / MaxPromptTokens
  • maxContextWindowTokens / max_context_window_tokens / MaxContextWindowTokens
  • modelLimitsId / model_limits_id / ModelLimitsID

✅ Language Idiom Compliance

Each SDK follows appropriate naming conventions:

  • TypeScript: camelCase (e.g., maxOutputTokens)
  • Python: snake_case (e.g., max_output_tokens) with correct wire format conversion to camelCase
  • Go: PascalCase with proper initialism handling (e.g., ModelLimitsID not ModelLimitsId)
  • .NET: PascalCase (e.g., MaxOutputTokens)

✅ Wire Format Consistency

All SDKs serialize to the same JSON-RPC wire format using camelCase keys:

  • Node.js/TypeScript: Direct mapping (already camelCase)
  • Python: Explicit conversion in _convert_provider_to_wire_format()
  • Go: JSON tags (json:"maxOutputTokens,omitempty")
  • .NET: [JsonPropertyName("maxOutputTokens")] attributes

✅ Documentation Consistency

All implementations include consistent inline documentation explaining:

  1. The override behavior (takes precedence over capability catalog defaults)
  2. Optional nature of all fields
  3. modelLimitsId use case (fine-tuned models sharing base model limits)

✅ Optionality

All fields correctly marked as optional:

  • TypeScript: ? modifier
  • Python: TypedDict with total=False
  • Go: omitempty tags (zero values omitted)
  • .NET: Nullable types (int?, string?)

No consistency issues found. This PR successfully maintains API parity across all language implementations. 🎉

Generated by SDK Consistency Review Agent for issue #966

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor

✅ Cross-SDK Consistency Review

This PR successfully maintains feature parity across all four SDK implementations. Excellent work!

Consistency Check Results

✅ All SDKs implement the same four fields:

  • maxOutputTokens / max_output_tokens / MaxOutputTokens
  • maxPromptTokens / max_prompt_tokens / MaxPromptTokens
  • maxContextWindowTokens / max_context_window_tokens / MaxContextWindowTokens
  • modelLimitsId / model_limits_id / ModelLimitsId

✅ Naming conventions properly applied:

  • Node.js: camelCase
  • Python: snake_case (with wire format conversion to camelCase)
  • .NET: PascalCase (with [JsonPropertyName] attributes)
  • Go: PascalCase exported fields (with JSON tags)

✅ JSON serialization consistent:
All SDKs correctly serialize to the same camelCase wire format (maxOutputTokens, maxPromptTokens, maxContextWindowTokens, modelLimitsId).

✅ Optionality handled correctly:
All fields are optional in all SDKs using language-appropriate mechanisms (optional properties, nullable types, omitempty tags).

✅ Documentation aligned:
All SDKs document the override behavior and the modelLimitsId fine-tuned model use case.

Summary

No cross-SDK consistency issues detected. This PR demonstrates excellent multi-language SDK maintenance practices. 🚀

Generated by SDK Consistency Review Agent for issue #966

```csharp
/// This is useful for fine-tuned models that share the same limits as a base model.
/// </summary>
[JsonPropertyName("modelLimitsId")]
public string? ModelLimitsId { get; set; }
```
Collaborator Author


Curious to know what others think about this name. We might want to consider extending this to also affect inferred model capabilities (streaming, tool calling, multimodality, etc.), at which point this name is a bit too specific.

We could consider something like ModelCatalogId or ModelProfileId.

```csharp
/// When set, takes precedence over the default limit resolved from the model's capability catalog entry.
/// </summary>
[JsonPropertyName("maxOutputTokens")]
public int? MaxOutputTokens { get; set; }
```
Collaborator


What is the scope of this? Is it per request to the LLM? Total across all operations performed by the agent until idle?

Collaborator Author


Per individual LLM API request. This value is sent as max_tokens in every call to the model provider. Each turn in the agent loop independently gets this cap, and it's not accumulated across the session. It controls how many tokens the model can generate in a single response (text, tool calls, reasoning) before control returns.
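
The per-request behavior described here can be illustrated with a small sketch. The helper below is hypothetical (the runtime's actual request-building code is not part of this PR); it only shows the stated mapping of the configured limit onto `max_tokens` for each model-provider call:

```python
def build_completion_request(messages, max_output_tokens=None):
    """Sketch: the configured maxOutputTokens is forwarded as max_tokens
    on every individual LLM API request; it is not accumulated across turns."""
    request = {"messages": messages}
    if max_output_tokens is not None:
        request["max_tokens"] = max_output_tokens  # per-request generation cap
    return request
```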

```csharp
/// When set, takes precedence over the default limit resolved from the model's capability catalog entry.
/// </summary>
[JsonPropertyName("maxPromptTokens")]
public int? MaxPromptTokens { get; set; }
```
Collaborator


Should this be called MaxInputTokens instead of MaxPromptTokens?

Does this include cached tokens?

Same question as above... is this about one request or across a sequence of calls?

Collaborator Author


Per-request, but not sent to the API. The runtime uses this internally to decide when to truncate or compact conversation history before each LLM call. "Prompt" here means everything sent to the model in one request: system message, full conversation history up to that point, tool definitions, and the new user message. Cached tokens are counted toward the limit.

The name matches the upstream CAPI /models field (max_prompt_tokens), though MaxInputTokens would also be reasonable.

```csharp
/// When set, takes precedence over the default limit resolved from the model's capability catalog entry.
/// </summary>
[JsonPropertyName("maxContextWindowTokens")]
public int? MaxContextWindowTokens { get; set; }
```
Collaborator


What happens when these Max values are hit?

Collaborator Author


Each limit is handled differently:

  • MaxOutputTokens hit: The model stops generating and returns stop_reason: "max_tokens". Currently the runtime treats the truncated response as normal (silent failure, but I'm going to propose a fix for that soon).
  • MaxPromptTokens hit: The runtime triggers background compaction before sending the request.
  • MaxContextWindowTokens hit: Largely a fallback for MaxPromptTokens. The runtime prefers max_prompt_tokens for truncation/compaction decisions and only falls back to max_context_window_tokens when the former isn't set. If the API still rejects the request, the runtime adjusts its stored limit downward and retries. It's probably fine to not expose this value as public API in the SDK (we don't make it configurable for BYOK in the CLI).
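
The precedence described in the last bullet can be sketched as follows. These are hypothetical helpers written for illustration only; the runtime's real truncation/compaction logic is not shown in this PR:

```python
def effective_prompt_limit(max_prompt_tokens=None, max_context_window_tokens=None):
    """Sketch of the precedence above: the runtime prefers max_prompt_tokens
    for truncation/compaction decisions and falls back to
    max_context_window_tokens only when the former is unset."""
    if max_prompt_tokens is not None:
        return max_prompt_tokens
    return max_context_window_tokens

def should_compact(prompt_token_estimate, limit):
    """Trigger background compaction when the prompt would exceed the limit."""
    return limit is not None and prompt_token_estimate > limit
```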

Collaborator Author


Update: Removed MaxContextWindowTokens.

Remove maxContextWindowTokens from all SDKs - it is an internal runtime
fallback that should not be exposed as public SDK API.

Refine doc comments for maxOutputTokens and maxPromptTokens to explain
what happens when each limit is hit:
- maxOutputTokens: sent as max_tokens per LLM request; model stops
  generating and returns a truncated response when hit.
- maxPromptTokens: used by the runtime to trigger conversation
  compaction before sending a request when the prompt exceeds this limit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor

Cross-SDK Consistency Review ✅

I've reviewed this PR for consistency across all four SDK implementations. Overall, this is excellent work maintaining feature parity!

✅ What's Consistent

All four SDKs (Node.js, Python, Go, .NET) correctly implement the same three fields with:

  • Proper naming conventions for each language (camelCase, snake_case, PascalCase)
  • Consistent JSON wire format (maxOutputTokens, maxPromptTokens, modelLimitsId)
  • Equivalent documentation explaining the purpose of each field
  • Correct optionality (all fields are optional)
  • Python wire format conversion properly handles snake_case → camelCase mapping in client.py

📝 Minor Documentation Note

The PR description mentions four fields including maxContextWindowTokens, but the actual implementation only adds three fields. The maxContextWindowTokens field is not present in any SDK. This appears to be a copy/paste error in the PR description table rather than a code inconsistency.

No action needed - the code is consistent across all SDKs. Just noting the discrepancy between the description and implementation for clarity.

Generated by SDK Consistency Review Agent for issue #966

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor

Cross-SDK Consistency Review ✅

I've reviewed this PR for consistency across all four SDK implementations. Overall, this is an excellent example of coordinated multi-SDK development with strong feature parity! 🎉

✅ What's Consistent

All four SDKs consistently implement the same fields with proper:

  • Naming conventions: camelCase (Node.js/TypeScript), snake_case (Python), PascalCase (Go/exported, .NET)
  • JSON serialization: All use maxOutputTokens, maxPromptTokens, modelLimitsId wire names
  • Type mappings: number/int?/int/str appropriately chosen per language
  • Documentation: Clear, consistent comments explaining each field's purpose
  • Optionality: All fields are optional/nullable as expected

Python's _convert_provider_to_wire_format correctly maps snake_case to camelCase for wire protocol.

⚠️ Discrepancy Found: Missing Field

The PR description table lists four fields:

  • maxOutputTokens
  • maxPromptTokens
  • maxContextWindowTokens (missing from implementation)
  • modelLimitsId

However, the actual code only implements three fields. maxContextWindowTokens is mentioned in the description but not present in any SDK's implementation.

Questions:

  1. Was maxContextWindowTokens intentionally excluded from this PR?
  2. If it's needed, should it be added to all four SDKs in this PR or a follow-up?
  3. If it's not needed, should the PR description be updated to remove it from the table?

Recommendation: Either add maxContextWindowTokens to all SDKs or update the PR description to reflect the actual three-field implementation. Otherwise, everything looks great! 🚀

Generated by SDK Consistency Review Agent for issue #966
