
Upgrade ONNX from 1.19 to 1.21#1207

Open
ajrasane wants to merge 8 commits into main from ajrasane/onnx_version_upgrade

Conversation


@ajrasane ajrasane commented Apr 8, 2026

What does this PR do?

Type of change: new feature

Upgrade ONNX dependency from ~=1.19.0 to ~=1.21.0. ONNX 1.20+ removed several deprecated
helper functions (float32_to_bfloat16, float32_to_float8e4m3, pack_float32_to_4bit) that
onnx_graphsurgeon 0.5.x still references at import time. This PR adds a compatibility shim
(modelopt/onnx/_onnx_compat.py) that restores these functions using ml_dtypes before any
onnx_graphsurgeon import occurs. This supersedes the partial inline fix from #1204 by also
handling float32_to_float8e4m3.
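The restored helpers can be sketched as follows. The function names match the PR description; the bodies are an assumption about how ml_dtypes would be used, not the actual contents of modelopt/onnx/_onnx_compat.py:

```python
import numpy as np
import ml_dtypes


def float32_to_bfloat16(fval: float) -> int:
    # Round the float32 value to bfloat16 and return its uint16 bit pattern.
    return int(np.asarray(fval, dtype=np.float32).astype(ml_dtypes.bfloat16).view(np.uint16))


def float32_to_float8e4m3(fval: float) -> int:
    # Round the float32 value to float8 e4m3 (fn variant) and return its uint8 bit pattern.
    return int(np.asarray(fval, dtype=np.float32).astype(ml_dtypes.float8_e4m3fn).view(np.uint8))


def patch_onnx_helper_removed_apis() -> None:
    # Reattach the helpers to onnx.helper only if ONNX 1.20+ removed them,
    # so onnx_graphsurgeon's import-time references keep resolving.
    import onnx.helper

    for name, fn in (
        ("float32_to_bfloat16", float32_to_bfloat16),
        ("float32_to_float8e4m3", float32_to_float8e4m3),
    ):
        if not hasattr(onnx.helper, name):
            setattr(onnx.helper, name, fn)
```

On ONNX 1.19.x, where the helpers still exist, the patch function is a no-op.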

Changes:

  • Bump onnx~=1.19.0 to onnx~=1.21.0 in pyproject.toml
  • Add modelopt/onnx/_onnx_compat.py compatibility shim for removed ONNX APIs
  • Import shim in modelopt/onnx/__init__.py and tests/unit/onnx/conftest.py
  • Remove usage of removed onnx.helper.pack_float32_to_4bit in test_quant_utils.py
  • Update example requirements (genai_llm, whisper) to onnx==1.21.0

TensorRT Compatibility: TRT 10.16-GA supports opsets 9–24. ModelOpt quantization modes
use opsets 19–23, all within range. ONNX 1.21 does not force opset 26.

Usage

# No API changes — the upgrade is transparent to users.
# The compatibility shim is applied automatically on import.
import modelopt.onnx

Testing

  • 469/470 ONNX unit tests pass inside nvcr.io/nvidia/tensorrt:25.06-py3 (1 pre-existing ORT CopyTensorAsync EP issue, not ONNX-related)
  • 6/6 torch_onnx integration tests pass (fp8, int8, nvfp4, mxfp8, int4_awq, auto)
  • ViT FP8 quantization via torch_onnx → TRT engine build → ImageNet eval: 85.3% top-1, 97.8% top-5
  • ViT FP8 quantization via onnx_ptq → TRT engine build succeeds
  • All pre-commit hooks pass (ruff, mypy, bandit, license headers)

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

  • Is this change backward compatible?: ✅
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: ✅
  • Did you write any new necessary tests?: ✅ (updated existing tests, added conftest.py for compat shim)
  • Did you update Changelog?: ❌ (dependency upgrade, no API change)

Additional Information

Related: #1204 (partial fix for float32_to_bfloat16 only — this PR supersedes it with full coverage)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Dependencies

    • Removed unpinned ONNX from example requirement files and updated the ONNX optional dependency to ~=1.21.0.
  • Refactor

    • Centralized an ONNX compatibility shim to restore missing helper APIs when needed.
  • Tests

    • Added tests for the compatibility shim, adjusted quantization tests to remove reliance on removed ONNX helpers, and ensured shim runs before related tests.


coderabbitai bot commented Apr 8, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Added a dedicated ONNX compatibility shim module and invoked it on import; removed unpinned onnx from two example requirements; bumped the onnx optional dependency to ~1.21.0; tests updated to import the shim and to stop depending on removed ONNX helper APIs.

Changes

  • Example requirements (examples/windows/onnx_ptq/genai_llm/requirements.txt, examples/windows/onnx_ptq/whisper/requirements.txt): removed the unpinned onnx entry from the example requirements.
  • Project dependency spec (pyproject.toml): updated project.optional-dependencies.onnx from onnx~=1.19.0 to onnx~=1.21.0.
  • ONNX compatibility shim (modelopt/onnx/__init__.py, modelopt/onnx/_onnx_compat.py): replaced the inline monkey-patch with an import of the new _onnx_compat.patch_onnx_helper_removed_apis(); _onnx_compat conditionally defines the missing onnx.helper helpers (float32_to_bfloat16, float32_to_float8e4m3) using numpy and ml_dtypes when available.
  • Tests, conftest & compat tests (tests/unit/onnx/conftest.py, tests/unit/onnx/test_onnx_compat.py): added conftest.py to import the shim at pytest startup; added unit tests covering shim installation, idempotency/non-overwrite, and behavior when ml_dtypes is unavailable.
  • Tests, quantization changes (tests/unit/onnx/quantization/test_quant_utils.py): removed reliance on onnx.helper.pack_float32_to_4bit; tests now compare internal implementations and assert explicit expected bytes for unsigned boundary cases.
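For reference, the removed helper's packing behavior can be approximated like this. This is a sketch under the assumption that ONNX packs the first element into the low nibble; it is not the test file's actual code:

```python
import numpy as np


def pack_float32_to_4bit(values, signed: bool) -> bytes:
    # Round, clip to the 4-bit range, and pack pairs of values into single
    # bytes with the first element in the low nibble (assumed ONNX layout).
    lo, hi = (-8, 7) if signed else (0, 15)
    ints = np.clip(np.rint(np.asarray(values, dtype=np.float32)), lo, hi).astype(np.int64)
    if ints.size % 2:
        ints = np.append(ints, 0)  # zero-pad odd-length input
    nibbles = ints & 0xF  # two's-complement nibble for negative values
    return bytes((nibbles[0::2] | (nibbles[1::2] << 4)).astype(np.uint8))


# Unsigned boundary case: 15.0 stays 0xF and 16.0 clips to 0xF, giving b'\xff'.
packed = pack_float32_to_4bit([15.0, 16.0], signed=False)
```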

Sequence Diagram(s)

sequenceDiagram
    participant Importer as Test / Module Import
    participant Shim as modelopt.onnx._onnx_compat
    participant ONNX as onnx.helper
    participant NumPy as numpy / ml_dtypes

    Importer->>Shim: import (side-effect)
    Shim->>ONNX: try import `onnx.helper`
    alt helpers missing
        Shim->>NumPy: import `numpy`, `ml_dtypes`
        NumPy-->>Shim: provide conversion dtypes/utilities
        Shim->>ONNX: define `float32_to_bfloat16` and `float32_to_float8e4m3`
    else imports fail
        Shim-->>Importer: no patch applied (silent)
    end
    Importer->>ONNX: subsequent code/tests may use helpers

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 16.67%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: the title accurately summarizes the main change, upgrading ONNX from 1.19 to 1.21, which is the central objective of the changeset.
  • Security Anti-Patterns ✅ Passed: the PR introduces no security anti-patterns from SECURITY.md. The compatibility shim safely patches ONNX helper functions using numpy type conversions with proper import error handling.



github-actions bot commented Apr 8, 2026

PR Preview Action v1.8.1


🚀 View preview at
https://NVIDIA.github.io/Model-Optimizer/pr-preview/pr-1207/

Built to branch gh-pages at 2026-04-09 19:52 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@ajrasane ajrasane force-pushed the ajrasane/onnx_version_upgrade branch from 9356aec to 4dfe4c1 on April 8, 2026 18:44
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelopt/onnx/_onnx_compat.py`:
- Around line 36-54: The shimmed helper functions use incomplete signatures
causing TypeError when callers pass legacy kwargs; update _float32_to_bfloat16
and _float32_to_float8e4m3 to match ONNX 1.19.x: change
_float32_to_bfloat16(value) to _float32_to_bfloat16(fval, truncate: bool =
False) and change _float32_to_float8e4m3(value, fn=True, uz=False) to
_float32_to_float8e4m3(fval, scale: float = 1.0, fn: bool = True, uz: bool =
False, saturate: bool = True); inside each function use fval, apply scale for
float8 (e.g., np.float32(fval) * scale) and accept but safely ignore
truncate/saturate if not needed, then perform the existing dtype conversion and
return the same integer representation; update the assignments
onnx.helper.float32_to_bfloat16 and onnx.helper.float32_to_float8e4m3 to
reference the renamed/updated functions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 68e21319-7789-4c62-a7c0-a6d5a73e29d5

📥 Commits

Reviewing files that changed from the base of the PR and between abf4558 and 9356aec.

📒 Files selected for processing (7)
  • examples/windows/onnx_ptq/genai_llm/requirements.txt
  • examples/windows/onnx_ptq/whisper/requirements.txt
  • modelopt/onnx/__init__.py
  • modelopt/onnx/_onnx_compat.py
  • pyproject.toml
  • tests/unit/onnx/conftest.py
  • tests/unit/onnx/quantization/test_quant_utils.py

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelopt/onnx/__init__.py`:
- Around line 21-22: The import of the ONNX compatibility shim
(patch_onnx_helper_removed_apis) is happening before the optional-deps
try/except guard and will raise if onnx/ml_dtypes are missing; wrap the shim
import in the same optional-deps try/except (or move it inside that try block)
so ImportError is caught and handled by your normalized error reporting logic,
i.e., import modelopt.onnx._onnx_compat.patch_onnx_helper_removed_apis inside
the existing try/except that checks for onnx and ml_dtypes rather than at module
top-level.

📥 Commits

Reviewing files that changed from the base of the PR and between 9356aec and 4dfe4c1.

💤 Files with no reviewable changes (2)
  • examples/windows/onnx_ptq/whisper/requirements.txt
  • examples/windows/onnx_ptq/genai_llm/requirements.txt
✅ Files skipped from review due to trivial changes (2)
  • pyproject.toml
  • tests/unit/onnx/conftest.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/unit/onnx/quantization/test_quant_utils.py
  • modelopt/onnx/_onnx_compat.py


codecov bot commented Apr 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.34%. Comparing base (9b4f43a) to head (845fde9).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1207      +/-   ##
==========================================
+ Coverage   75.42%   77.34%   +1.92%     
==========================================
  Files         353      353              
  Lines       40603    40595       -8     
==========================================
+ Hits        30623    31398     +775     
+ Misses       9980     9197     -783     
Flag Coverage Δ
examples 44.40% <ø> (+2.63%) ⬆️
gpu 56.89% <ø> (-0.21%) ⬇️
unit 55.10% <ø> (+<0.01%) ⬆️

☔ View full report in Codecov by Sentry.


@kevalmorabia97 kevalmorabia97 left a comment


LGTM. PTAL at Coderabbit comments as well

ONNX 1.20+ removed deprecated helper functions (float32_to_bfloat16,
float32_to_float8e4m3, pack_float32_to_4bit) that onnx_graphsurgeon
0.5.x still references at import time. This adds a compatibility shim
that restores these functions using ml_dtypes before any
onnx_graphsurgeon imports occur.

Changes:
- Bump onnx~=1.19.0 to onnx~=1.21.0 in pyproject.toml
- Add modelopt/onnx/_onnx_compat.py compatibility shim for removed APIs
- Import shim in modelopt/onnx/__init__.py and test conftest.py
- Update test_quant_utils.py to remove usage of removed
  onnx.helper.pack_float32_to_4bit; validate against hardcoded
  expected values instead
- Update example requirements (genai_llm, whisper) to onnx==1.21.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
@ajrasane ajrasane force-pushed the ajrasane/onnx_version_upgrade branch from 4dfe4c1 to 68ed5a2 on April 8, 2026 19:35

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
modelopt/onnx/__init__.py (1)

21-22: ⚠️ Potential issue | 🟠 Major

Move the compatibility shim import inside the try/except guard.

The import of patch_onnx_helper_removed_apis at line 22 occurs before the dependency guard (lines 26–30). If onnx or ml_dtypes is missing, this could fail early and bypass the normalized optional-dependency error message.

As noted in the previous review, move the import inside the existing try/except block to ensure consistent error handling:

Proposed fix
-# Apply ONNX compatibility shim before any onnx_graphsurgeon imports.
-from modelopt.onnx._onnx_compat import patch_onnx_helper_removed_apis
-
 MIN_PYTHON_VERSION = (3, 10)
 
 try:
+    # Apply ONNX compatibility shim before any onnx_graphsurgeon imports.
+    from modelopt.onnx._onnx_compat import patch_onnx_helper_removed_apis
+    patch_onnx_helper_removed_apis()
     from . import quantization
     from .logging_config import configure_logging, logger
 except ImportError as e:
     raise ImportError(f"{e}\nPlease install optional ``[onnx]`` dependencies.")

As per coding guidelines: modelopt/**/*.py should "gate optional dependencies" rather than importing at module top-level before the guard.



📥 Commits

Reviewing files that changed from the base of the PR and between 4dfe4c1 and 68ed5a2.

💤 Files with no reviewable changes (2)
  • examples/windows/onnx_ptq/whisper/requirements.txt
  • examples/windows/onnx_ptq/genai_llm/requirements.txt
✅ Files skipped from review due to trivial changes (3)
  • pyproject.toml
  • tests/unit/onnx/conftest.py
  • modelopt/onnx/_onnx_compat.py

ajrasane added 2 commits April 8, 2026 20:11
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
@ajrasane ajrasane force-pushed the ajrasane/onnx_version_upgrade branch from 05c1a3e to 3fd3336 on April 8, 2026 20:47

hthadicherla commented Apr 9, 2026

Seems like this test is failing:
tests/unit/torch/quantization/test_quantize_cpu.py::test_save_restore[SimpleLinear-quant_config3] FAILED [ 72%]

But it looks unrelated to this change, since I saw multiple PRs fail at this same test yesterday.

@kevalmorabia97

tests/unit/torch/quantization/test_quantize_cpu.py::test_save_restore is now disabled on Windows in the main branch, as it's flaky

@ajrasane ajrasane enabled auto-merge (squash) April 9, 2026 15:19
@ajrasane ajrasane disabled auto-merge April 9, 2026 15:33
… onnx 1.21.0

onnx-graphsurgeon 0.6.1 no longer references the removed onnx.helper functions
(float32_to_bfloat16, float32_to_float8e4m3), so the compatibility shim and its
tests are no longer needed. Pin onnx-graphsurgeon>=0.6.1 in pyproject.toml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
@ajrasane ajrasane enabled auto-merge (squash) April 9, 2026 15:54
Collaborator


not needed anymore


@cjluo-nv cjluo-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address @kevalmorabia97 's comment before landing

Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
yeyu-nvidia added a commit that referenced this pull request Apr 9, 2026
…_template (#1225)

## Summary

- `apply_chat_template(..., return_tensors="pt")` returns a
`BatchEncoding` in transformers 4.46+, which no longer subclasses `dict`
- The old guard `isinstance(tokenized, dict)` evaluates to `False` for
`BatchEncoding`, so `input_ids` was set to the whole `BatchEncoding`
object
- Calling `.shape[1]` on a `BatchEncoding` triggers
`__getattr__("shape")` → `AttributeError`
- Fix: check `isinstance(tokenized, torch.Tensor)` instead, which
correctly handles both old transformers (plain tensor) and new
transformers (BatchEncoding)
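The guard change above can be illustrated with a minimal sketch. FakeBatchEncoding is a hypothetical stand-in for transformers.BatchEncoding, and extract_input_ids is an illustrative name, not the repository's actual function:

```python
import torch


def extract_input_ids(tokenized):
    # Old transformers return a plain tensor; transformers 4.46+ return a
    # BatchEncoding, which supports item access but no longer subclasses dict,
    # so the tensor check is the reliable branch condition.
    if isinstance(tokenized, torch.Tensor):
        return tokenized
    return tokenized["input_ids"]


class FakeBatchEncoding:
    # Hypothetical stand-in: mapping-style access without subclassing dict.
    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        return self._data[key]


ids = torch.tensor([[1, 2, 3]])
assert extract_input_ids(ids).shape[1] == 3
assert extract_input_ids(FakeBatchEncoding({"input_ids": ids})).shape[1] == 3
```

With the old `isinstance(tokenized, dict)` guard, the second call would have returned the wrapper itself, and `.shape` would then raise AttributeError.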

This is causing `test_collect_hidden_states` to fail in the speculative
decoding CI for all open PRs (#1207, #1210, #1221).

## Test plan

- [ ] `torch-pr (speculative_decoding, 26.01)` CI passes
- [ ] Verify fix handles both `torch.Tensor` return (old transformers)
and `BatchEncoding` return (new transformers 4.46+)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Ye Yu <yeyu@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
