Add gpt-5.3 model support to TiktokenTokenizer #7579
Conversation
Co-authored-by: stephentoub <[email protected]>
Pull request overview
This pull request adds support for the GPT-5.3 model family to the TiktokenTokenizer, following the established pattern used for previous GPT model versions (gpt-5.2, gpt-5.1, etc.). The changes enable tokenization for both the base "gpt-5.3" model and its variants (e.g., "gpt-5.3-mini") using the O200kBase encoding.
Changes:
- Added gpt-5.3 model mappings to TiktokenTokenizer for both prefix and exact matching (see the sketch after this list)
- Added GPT5_3 static tokenizer property and comprehensive test coverage for the new model
- Included test cases for both base and mini variants in the test suite
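A minimal sketch of the prefix/exact lookup pattern those mappings follow. The identifier names here (modelToEncoding, modelPrefixToEncoding, Resolve) are illustrative stand-ins, not the actual private members of TiktokenTokenizer.cs:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class EncodingLookupSketch
{
    // Exact model name -> encoding (stand-in for the exact-name mapping).
    static readonly Dictionary<string, string> modelToEncoding = new()
    {
        ["gpt-5.3"] = "o200k_base",
    };

    // Model name prefix -> encoding (stand-in for the prefix mapping).
    static readonly (string Prefix, string Encoding)[] modelPrefixToEncoding =
    {
        ("gpt-5.3-", "o200k_base"),
    };

    // Exact match wins; otherwise fall back to the first matching prefix,
    // which is how variants like "gpt-5.3-mini" resolve.
    static string? Resolve(string modelName) =>
        modelToEncoding.TryGetValue(modelName, out string? encoding)
            ? encoding
            : modelPrefixToEncoding
                .Where(m => modelName.StartsWith(m.Prefix, StringComparison.Ordinal))
                .Select(m => m.Encoding)
                .FirstOrDefault();

    static void Main()
    {
        Console.WriteLine(Resolve("gpt-5.3"));       // o200k_base (exact match)
        Console.WriteLine(Resolve("gpt-5.3-mini"));  // o200k_base (prefix match)
        Console.WriteLine(Resolve("gpt-4") ?? "unknown");
    }
}
```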
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/Microsoft.ML.Tokenizers/Model/TiktokenTokenizer.cs | Added "gpt-5.3-" prefix mapping and "gpt-5.3" exact name mapping to model encoding arrays, both using O200kBase encoding |
| test/Microsoft.ML.Tokenizers.Tests/TiktokenTests.cs | Added GPT5_3 static property and test data entries for gpt-5.3 and gpt-5.3-mini variants in the TestAllSupportedModelNames and TestCreationUsingModel methods (sketched below) |
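A hedged sketch of what those test additions could look like, assuming the xunit theory pattern the method names imply; the actual member shapes in TiktokenTests.cs may differ, and creating a tokenizer this way fetches vocabulary data over the network:

```csharp
using System.Collections.Generic;
using Microsoft.ML.Tokenizers;
using Xunit;

public class TiktokenGpt53Tests
{
    // Stand-in for the GPT5_3 static tokenizer property described above.
    public static Tokenizer GPT5_3 { get; } = TiktokenTokenizer.CreateForModel("gpt-5.3");

    [Theory]
    [InlineData("gpt-5.3")]
    [InlineData("gpt-5.3-mini")]
    public void TestCreationUsingModel(string modelName)
    {
        Tokenizer tokenizer = TiktokenTokenizer.CreateForModel(modelName);
        Assert.NotNull(tokenizer);

        // Round-trip a short string to confirm the tokenizer is functional.
        IReadOnlyList<int> ids = tokenizer.EncodeToIds("hello world");
        Assert.Equal("hello world", tokenizer.Decode(ids));
    }
}
```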
tarekgh left a comment
LGTM!
Codecov Report

✅ All modified and coverable lines are covered by tests.

Coverage diff against main:

| | main | #7579 | +/- |
|---|---|---|---|
| Coverage | 69.05% | 69.06% | |
| Files | 1483 | 1483 | |
| Lines | 274362 | 274365 | +3 |
| Branches | 28270 | 28270 | |
| Hits | 189466 | 189482 | +16 |
| Misses | 77510 | 77498 | -12 |
| Partials | 7386 | 7385 | -1 |
Adds support for the gpt-5.3 model family to TiktokenTokenizer using O200kBase encoding.
Changes
- Added gpt-5.3 and gpt-5.3- mappings to the model encoding dictionaries
- Added tests covering the gpt-5.3 base and mini variants

Usage
Follows the same pattern established for gpt-5.2 support.
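A minimal usage sketch, assuming the public TiktokenTokenizer.CreateForModel factory that exists for earlier model names. Note that resolving a model this way downloads the o200k_base vocabulary at runtime, which is why the blocked test run below tried to reach openaipublic.blob.core.windows.net:

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML.Tokenizers;

// Resolve the new model name to its tokenizer (O200kBase encoding).
// Variants such as "gpt-5.3-mini" resolve through the prefix mapping.
TiktokenTokenizer tokenizer = TiktokenTokenizer.CreateForModel("gpt-5.3");

IReadOnlyList<int> ids = tokenizer.EncodeToIds("Hello, world!");
Console.WriteLine($"{ids.Count} tokens: {string.Join(", ", ids)}");
```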
Warning
Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

- `dnceng.pkgs.visualstudio.com`
  - Triggering command: `/home/REDACTED/work/machinelearning/machinelearning/.dotnet/dotnet dotnet test test/Microsoft.ML.Tokenizers.Tests/Microsoft.ML.Tokenizers.Tests.csproj --filter FullyQualifiedName~TiktokenTests --no-build` (dns block)
- `openaipublic.blob.core.windows.net`
  - Triggering command: `/home/REDACTED/work/machinelearning/machinelearning/.dotnet/dotnet /home/REDACTED/work/machinelearning/machinelearning/.dotnet/dotnet exec --runtimeconfig /home/REDACTED/work/machinelearning/machinelearning/artifacts/bin/Microsoft.ML.Tokenizers.Tests/Debug/net8.0/Microsoft.ML.Tokenizers.Tests.runtimeconfig.json --depsfile /home/REDACTED/work/machinelearning/machinelearning/artifacts/bin/Microsoft.ML.Tokenizers.Tests/Debug/net8.0/Microsoft.ML.Tokenizers.Tests.deps.json /home/REDACTED/work/machinelearning/machinelearning/artifacts/bin/Microsoft.ML.Tokenizers.Tests/Debug/net8.0/testhost.dll --port 45083 --endpoint 127.0.0.1:045083 --role client --parentprocessid 6859 --telemetryoptedin false` (dns block)

If you need me to access, download, or install something from one of these locations, you can either: