Conversation
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>
…ntizer, NVFP4MSECalibrator (#849)

Release notes (auto-generated by coderabbit.ai):
- **New Features**
  - Added NVFP4StaticQuantizer for improved 4-bit quantization with enhanced precision control
  - Introduced NVFP4MSECalibrator with flexible candidate generation for calibration optimization
- **Improvements**
  - Optimized GPU kernels for Hopper+ graphics cards with better performance
  - Extended Triton support to broader GPU compatibility
  - Enhanced backward compatibility for restoring previously quantized models
- **Tests**
  - Added comprehensive test coverage for new quantizers and calibration methods

Signed-off-by: realAsma <akuriparambi@nvidia.com>
…FP4QTensor
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
📝 Walkthrough

This PR replaces GPTQ-Lite with a new GPTQ implementation featuring Hessian-informed blockwise weight updates, adds sequential calibration checkpointing and resume capabilities, introduces activation MSE measurement and perplexity evaluation to the example quantization script, and extends dataset utilities and vLLM serving configurations to support these features.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User as User/Script
    participant SeqCal as Sequential<br/>Calibrator
    participant ActivCol as Activation<br/>Collector
    participant CheckMgr as Checkpoint<br/>Manager
    participant Model as Quantized<br/>Model
    User->>SeqCal: Start sequential calibration<br/>(with checkpoint_dir)
    SeqCal->>CheckMgr: detect_sequential_resume_layer()
    alt Checkpoint exists
        CheckMgr-->>SeqCal: resume_layer_idx > 0
        SeqCal->>ActivCol: prepare_for_resume(resume_layer_idx)
        ActivCol->>Model: Warmup forward pass<br/>(set layer modes, capture output_meta)
    else No checkpoint
        CheckMgr-->>SeqCal: resume_layer_idx = 0
    end
    loop For each layer [resume_idx..total_layers)
        SeqCal->>ActivCol: Set layer capture mode
        SeqCal->>Model: Forward pass (collect activations)
        SeqCal->>SeqCal: Run calibration (gptq/rtn/etc)
        SeqCal->>Model: Update layer weights
        SeqCal->>CheckMgr: should_save_seq_calib_checkpoint()?
        alt Save interval reached
            CheckMgr->>CheckMgr: save_sequential_checkpoint()
            CheckMgr->>Model: Attach progress metadata
            CheckMgr->>CheckMgr: Invoke registered saver
        end
    end
    SeqCal-->>User: Calibration complete
```
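The checkpoint/resume control flow in the diagram can be sketched as a small driver loop. Note that `checkpoint_mgr`, its methods, and the per-layer callables below are illustrative stand-ins for the actual Sequential Calibrator and Checkpoint Manager components, not the real ModelOpt API:

```python
def sequential_calibrate(layers, checkpoint_mgr, save_interval=4):
    """Sketch of the diagram's resume-aware loop (illustrative names).

    `checkpoint_mgr` is a hypothetical object exposing
    `detect_resume_layer()` and `save(idx)`; `layers` is a list of
    per-layer calibration callables. Returns the indices of the layers
    calibrated in this run.
    """
    # Resume point is 0 when no checkpoint exists.
    start = checkpoint_mgr.detect_resume_layer()
    calibrated = []
    for idx in range(start, len(layers)):
        layers[idx]()  # collect activations + run calibration for this layer
        calibrated.append(idx)
        # Persist progress metadata every `save_interval` layers.
        if (idx + 1) % save_interval == 0:
            checkpoint_mgr.save(idx)
    return calibrated
```

A run that resumes from layer 3 of 8 with `save_interval=4` calibrates layers 3..7 and checkpoints after layers 3 and 7.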
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Important: Pre-merge checks failed. Please resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
modelopt/torch/utils/dataset_utils.py (1)
221-253: ⚠️ Potential issue | 🟠 Major

`skip_samples` is currently ignored for `.jsonl` datasets (and negative values are not validated). The early return on lines 252-253 bypasses the new `skip_samples` behavior, so resume/offset behavior is inconsistent for JSONL inputs.

Proposed fix

```diff
 def get_dataset_samples(
     dataset_name: str,
     num_samples: int,
     @@
     skip_samples: int = 0,
 ) -> list[str]:
     @@
+    if skip_samples < 0:
+        raise ValueError("skip_samples must be >= 0")
+
     # Local JSONL file path support (each line is a JSON object with a `text` field).
     if dataset_name.endswith(".jsonl"):
-        return get_jsonl_text_samples(dataset_name, num_samples, key="text")
+        return get_jsonl_text_samples(
+            dataset_name, num_samples + skip_samples, key="text"
+        )[skip_samples:]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/utils/dataset_utils.py` around lines 221-253: The .jsonl early-return ignores skip_samples and lacks validation; update the branch where dataset_name.endswith(".jsonl") to call get_jsonl_text_samples with skip_samples applied (i.e., pass the skip/offset so it skips the requested number of records) and validate skip_samples is non-negative before use (raise or clamp on negative). Modify the logic in the function that contains the dataset_name.endswith(".jsonl") check (and any helper call to get_jsonl_text_samples) so JSONL inputs honor the same skip_samples semantics as other dataset types.

tests/gpu/torch/quantization/test_gptq.py (1)
237-259: ⚠️ Potential issue | 🟠 Major

This turns the GPU e2e test into a very heavy job. `use_sequential=True` makes GPTQ replay full-model forwards per decoder layer, and this test still calibrates TinyLlama with 512 samples while being parametrized over three configs. That is a large jump in runtime and flakiness for a `tests/gpu/` check. Please shrink this to a smoke-sized calibration set/model or move the full sequential path behind a slower opt-in test.

As per coding guidelines, "GPU-based unit tests for core ModelOpt library should be placed in tests/gpu/ and run in a few seconds".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/gpu/torch/quantization/test_gptq.py` around lines 237 - 259, The test config currently sets quant_cfg["algorithm"] = {"method": "gptq", "use_sequential": True} and builds a calibration dataloader with get_dataset_dataloader(num_samples=512), which makes the GPU e2e very heavy; change the test to a smoke-sized run by setting use_sequential to False or reducing the calibration dataloader sample size (e.g., num_samples -> 8 or another small value) and/or using a tiny model fixture so the create_forward_loop(dataloader=calib_dataloader) + mtq.quantize(model, quant_cfg, forward_loop=calibrate_loop) executes quickly; alternatively move the full sequential path behind a marked slow/opt-in test. Ensure you update the references to quant_cfg["algorithm"], get_dataset_dataloader(..., num_samples=...), and the mtq.quantize(...) invocation accordingly.
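The skip-then-take pattern proposed for the `dataset_utils.py` comment above can be sanity-checked in isolation. Here, `take_with_skip` and the in-memory `records` list are illustrative stand-ins for `get_jsonl_text_samples` and the JSONL file contents:

```python
def take_with_skip(records, num_samples, skip_samples=0):
    """Skip-then-take pattern from the proposed fix: fetch
    `skip_samples + num_samples` records, then drop the prefix.
    `records` stands in for the JSONL lines read from disk."""
    if skip_samples < 0:
        raise ValueError("skip_samples must be >= 0")
    # Slicing past the end is safe in Python: short files simply
    # yield fewer samples, mirroring the dataset helpers' behavior.
    return records[: num_samples + skip_samples][skip_samples:]
```

For example, with ten records, `num_samples=3`, and `skip_samples=2`, this returns records 2, 3, and 4.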
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/llm_ptq/hf_ptq.py`:
- Around line 726-736: The vLLM fakequant export branch omits the MTP
extra-state, causing incomplete checkpoints for models using MTP; update the
branch where args.vllm_fakequant_export triggers
export_hf_vllm_fq_checkpoint(full_model, export_dir=export_path) to also pass
the mtp_state_dict returned by load_mtp_weights (same as the non-fakequant
branch passes extra_state_dict=mtp_state_dict to export_hf_checkpoint), i.e.,
modify the call to include the extra_state_dict/mtp_state_dict so
export_hf_vllm_fq_checkpoint receives and writes the MTP weights alongside
full_model and export_path.
- Around line 1088-1102: The plugin-config export branch that uses
full_model.save_pretrained() / export_hf_vllm_fq_checkpoint (when args.qformat
not in QUANT_CFG_CHOICES and hasattr(mtq, args.qformat)) skips the post-export
cleanup done by export_quantized(): restore the tokenizer's original
padding/token settings and copy any custom model files/configs required for
trust_remote_code models. Fix by invoking the same post-export cleanup steps
used by export_quantized() after saving (or by calling a shared helper): restore
the tokenizer's original padding/pad_token/padding_side state and then copy over
any custom/model config files into export_path so the artifact matches the
standard export path; ensure this runs for both the export_hf_vllm_fq_checkpoint
and full_model.save_pretrained branches before tokenizer.save_pretrained.
- Around line 1075-1086: The MSE logging is run on the unfused fake-quant model
because folding is only applied later for perplexity; when args.fold_weights is
set and activation MSE measurement is requested, call
mtq.fold_weight(language_model) before invoking
mse_logger.finish(language_model, mse_data) (i.e., apply folding to the same
language_model used for mse collection), then proceed to delete
mse_logger/mse_data and empty the CUDA cache; keep the existing
folding+perplexity block but ensure folding happens earlier when
args.measure_activation_mse (or equivalent flag) is true.
In `@modelopt/torch/quantization/config.py`:
- Around line 1324-1335: The ModeloptField for block_size currently allows zero
and negative values; add validation by setting gt=0 on the block_size
ModeloptField so config schema rejects non-positive sizes, and where group_size
is available (e.g., in the GPTQ config validation path or an initializer that
uses block_size and group_size) add a check that block_size % group_size == 0
and raise a clear validation error; update references around the block_size
ModeloptField and any GPTQ config validator that uses group_size to enforce this
multiple constraint.
In `@modelopt/torch/quantization/model_calib.py`:
- Around line 1983-1994: The code currently calls GPTQHelper.setup() for each
handle then runs forward_loop(model) but only calls GPTQHelper.cleanup()
afterwards, so if forward_loop raises the patched GPTQ forwards remain; wrap the
forward_loop call in a try/finally: after creating gptq_handles and calling
handle.setup() on each, enter the disabled_weight_quantizers(model) context and
call forward_loop(model) inside a try block, and in the finally iterate over
gptq_handles.values() and call handle.cleanup() to ensure GPTQHelper.cleanup()
always runs even on exceptions (referencing gptq_handles, GPTQHelper, setup,
cleanup, forward_loop, and disabled_weight_quantizers).
- Around line 1535-1545: After flattening to input_flat and computing
batch_size, guard the Hessian update against empty batches: if batch_size == 0,
skip the incremental averaging and outer-product steps entirely (do not modify
n_samples or hessian). Specifically, before the lines that scale down hessian
(hessian *= n_samples / (n_samples + batch_size)) and compute scaled_input and
hessian.add_((scaled_input @ scaled_input.t())...), add a check on batch_size
and return/continue from the surrounding function/block so no division or sqrt
by zero occurs and hessian remains unchanged.
In `@modelopt/torch/quantization/utils/activation_collector.py`:
- Around line 369-410: sequential_calibrate currently never checks or uses
resume helpers so resume state (_seq_calib_progress) and layer modes aren't
rebuilt; update sequential_calibrate to call detect_sequential_resume_layer() at
start, and if it returns a resume_layer_idx call
input_getter.prepare_for_resume(resume_layer_idx, forward_loop) before entering
the main layer loop, then set the loop's start index to resume_layer_idx
(instead of 0) so subsequent iterations continue from the resumed layer; ensure
you reference detect_sequential_resume_layer, input_getter.prepare_for_resume,
forward_loop, resume_layer_idx and the existing _seq_calib_progress handling
when adding this logic.
In `@modelopt/torch/quantization/utils/core_utils.py`:
- Around line 713-726: The helper disabled_weight_quantizers only toggles
module.weight_quantizer and misses non-standard weight attributes; update
disabled_weight_quantizers to iterate weight names from
weight_attr_names(module) for each module (in addition to checking
is_quantized_linear(module)), retrieve each attr (e.g., getattr(module,
attr_name, None)), if it has an is_enabled flag call its disable() and record
(module, attr_name) so the finally block can re-enable by calling enable() on
the same attribute's quantizer; ensure you still check existence and is_enabled
before disabling and re-check before enabling to avoid AttributeError.
In `@tests/gpu/torch/quantization/test_gptq_vq.py`:
- Around line 31-37: The _configs_available() helper currently only checks for
RTN_CFG_NAME and causes a hard failure if GPTQ_CFG_NAME is missing; update
_configs_available() to import modelopt.torch.quantization and return True only
when both getattr(mtq, RTN_CFG_NAME, None) and getattr(mtq, GPTQ_CFG_NAME, None)
are not None so the test is skipped unless both presets (RTN_CFG_NAME and
GPTQ_CFG_NAME) are available; reference the existing function name
_configs_available and the symbols RTN_CFG_NAME and GPTQ_CFG_NAME when making
the change.
---
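The `disabled_weight_quantizers` suggestion for `core_utils.py` above can be sketched as a context manager. The quantizer interface (`is_enabled`, `disable()`, `enable()`) follows the review comment; iterating extra attribute names beyond `weight_quantizer` is the hypothetical generalization the comment asks for:

```python
from contextlib import contextmanager


@contextmanager
def disabled_quantizers(modules, attr_names=("weight_quantizer",)):
    """Sketch of the suggested fix: disable every named quantizer
    attribute and guarantee re-enabling even if the body raises.

    Quantizer objects are assumed to expose `is_enabled`, `disable()`,
    and `enable()`; the module list and attribute names are
    illustrative, not the real ModelOpt helper signature.
    """
    disabled = []
    try:
        for module in modules:
            for name in attr_names:
                quantizer = getattr(module, name, None)
                # Check existence and state before disabling to avoid
                # AttributeError and double-disables.
                if quantizer is not None and getattr(quantizer, "is_enabled", False):
                    quantizer.disable()
                    disabled.append(quantizer)
        yield
    finally:
        # Re-enable exactly the quantizers we disabled, even on error.
        for quantizer in disabled:
            quantizer.enable()
```

The `finally` block is what keeps the model consistent when the calibration forward pass raises mid-run.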
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1ca32742-d698-44d0-a52b-271964afcc02
📒 Files selected for processing (13)
- examples/llm_ptq/hf_ptq.py
- examples/vllm_serve/fakequant_worker.py
- examples/vllm_serve/vllm_serve_fakequant.py
- modelopt/torch/quantization/config.py
- modelopt/torch/quantization/mode.py
- modelopt/torch/quantization/model_calib.py
- modelopt/torch/quantization/model_quant.py
- modelopt/torch/quantization/utils/activation_collector.py
- modelopt/torch/quantization/utils/checkpoint.py
- modelopt/torch/quantization/utils/core_utils.py
- modelopt/torch/utils/dataset_utils.py
- tests/gpu/torch/quantization/test_gptq.py
- tests/gpu/torch/quantization/test_gptq_vq.py
💤 Files with no reviewable changes (1)
- modelopt/torch/quantization/model_quant.py
```python
if args.vllm_fakequant_export:
    export_hf_vllm_fq_checkpoint(
        full_model,
        export_dir=export_path,
    )
else:
    export_hf_checkpoint(
        full_model,
        export_dir=export_path,
        extra_state_dict=mtp_state_dict,
    )
```
The vLLM fakequant export path drops the extra MTP weights.
load_mtp_weights() returns mtp_state_dict specifically for weights that are not in the normal model state dict, and the regular HF export passes that through extra_state_dict. This new branch ignores it, so models with MTP layers can export incomplete checkpoints.
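To illustrate what `extra_state_dict` buys: the export must union the model's own tensors with the out-of-band MTP weights. The helper below is an illustrative sketch of that merge, not the real export API:

```python
def build_export_state_dict(model_state, extra_state_dict=None):
    """Sketch of why `extra_state_dict` matters: MTP weights live
    outside the model's own state dict, so the exported checkpoint
    must be the union of the two. Names here are illustrative."""
    merged = dict(model_state)
    if extra_state_dict:
        # A key collision would silently overwrite a model tensor,
        # so fail loudly instead.
        overlap = merged.keys() & extra_state_dict.keys()
        if overlap:
            raise ValueError(f"extra state collides with model keys: {sorted(overlap)}")
        merged.update(extra_state_dict)
    return merged
```

Dropping the second argument, as the fakequant branch currently does, is exactly what produces an incomplete checkpoint.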
examples/llm_ptq/hf_ptq.py
Outdated
```python
if mse_logger is not None:
    mse_logger.finish(language_model, mse_data)
    del mse_logger, mse_data
    torch.cuda.empty_cache()

if args.eval_perplexity and tokenizer is not None:
    if args.fold_weights:
        print("Folding weights before perplexity evaluation...")
        mtq.fold_weight(language_model)
    eval_data = get_wikitext2(tokenizer, args.eval_perplexity_seq_len)
    ppl = compute_perplexity(full_model, eval_data)
    print(f"Wikitext-2 perplexity: {ppl:.2f}")
```
--fold_weights is wired to perplexity, not activation-MSE collection.
The flag/help text says folding should happen before collecting the quantized activations, but mse_logger.finish() still runs on the unfused fake-quant model. With --measure_activation_mse --fold_weights, the flag currently has no effect on the MSE pass.
Proposed fix

```diff
 if mse_logger is not None:
+    if args.fold_weights:
+        mtq.fold_weight(language_model)
     mse_logger.finish(language_model, mse_data)
     del mse_logger, mse_data
     torch.cuda.empty_cache()
 if args.eval_perplexity and tokenizer is not None:
-    if args.fold_weights:
+    if args.fold_weights:
         print("Folding weights before perplexity evaluation...")
-        mtq.fold_weight(language_model)
     eval_data = get_wikitext2(tokenizer, args.eval_perplexity_seq_len)
```
```python
# Plugin-registered configs (e.g. PSX LUTS from modelopt-internal) are not exportable
# via the standard TRT-LLM / HF export paths. Fall back to save_pretrained().
if args.qformat not in QUANT_CFG_CHOICES and hasattr(mtq, args.qformat):
    export_path = args.export_path
    if args.vllm_fakequant_export:
        print(f"Exporting vLLM fakequant checkpoint (bf16 weights + amax) to: {export_path}")
        export_hf_vllm_fq_checkpoint(full_model, export_dir=export_path)
    else:
        print(
            f"qformat '{args.qformat}' is a plugin-registered config and is not exportable "
            f"via the standard export pipeline. Saving with save_pretrained() instead."
        )
        full_model.save_pretrained(export_path)
    if tokenizer is not None:
        tokenizer.save_pretrained(export_path)
```
This plugin-config export bypass skips the normal artifact cleanup.
By bypassing export_quantized(), this branch never restores the tokenizer's original padding settings and never copies custom model files/configs. That makes the saved artifact materially different from the standard export path, especially for trust_remote_code models.
```diff
 percdamp: float = ModeloptField(
     default=0.01,
     gt=0.0,
     le=1.0,
     title="Percentage damping factor.",
     description="The percentage of average Hessian diagonal used for damping.",
 )
-block_size: int | None = ModeloptField(
+block_size: int = ModeloptField(
     default=128,
     title="Block size for GPTQ weight update.",
     description="""The block size for GPTQ weight update, which must be a multiple of the
     group_size used in the quantization.""",
```
Validate block_size at the config boundary.
block_size now accepts 0 and negative values, so an invalid GPTQ config can get through schema validation and only fail later inside the blockwise update path. Please add at least gt=0 here, and ideally enforce the documented group-size multiple constraint wherever that information is available.
Proposed fix

```diff
 block_size: int = ModeloptField(
     default=128,
+    gt=0,
     title="Block size for GPTQ weight update.",
     description="""The block size for GPTQ weight update, which must be a multiple of the
     group_size used in the quantization.""",
 )
```
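Beyond `gt=0`, the documented group-size multiple constraint can be enforced wherever both values are known. A plain-function sketch of that check (the real config uses `ModeloptField`/pydantic validation, so the function name and shape here are illustrative):

```python
def validate_block_size(block_size, group_size=None):
    """Illustrative validator for the two constraints named in the
    review: block_size must be positive, and (when group_size is
    known) a multiple of group_size."""
    if block_size <= 0:
        raise ValueError(f"block_size must be > 0, got {block_size}")
    if group_size is not None and block_size % group_size != 0:
        raise ValueError(
            f"block_size ({block_size}) must be a multiple of group_size ({group_size})"
        )
    return block_size
```

Failing at config validation time gives a clear error instead of a late failure inside the blockwise update path.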
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelopt/torch/quantization/config.py` around lines 1324 - 1335, The
ModeloptField for block_size currently allows zero and negative values; add
validation by setting gt=0 on the block_size ModeloptField so config schema
rejects non-positive sizes, and where group_size is available (e.g., in the GPTQ
config validation path or an initializer that uses block_size and group_size)
add a check that block_size % group_size == 0 and raise a clear validation
error; update references around the block_size ModeloptField and any GPTQ config
validator that uses group_size to enforce this multiple constraint.
```python
# Flatten to 2D (total_tokens, features) first, so batch_size counts tokens
input_flat = input.reshape(-1, input.shape[-1]).t().float()
batch_size = input_flat.shape[1]

# Incremental averaging: scale down old hessian
hessian *= n_samples / (n_samples + batch_size)
n_samples += batch_size

# Compute outer product: H += (2/n_samples) * X @ X^T
# where X is the flattened input reshaped to (features, batch*seq)
input_flat = input.reshape(-1, input.shape[-1]).t().float()
scaled_input = math.sqrt(2 / n_samples) * input_flat
hessian.add_((scaled_input @ scaled_input.t()).to(hessian.device))
```
Ignore empty activation batches before updating the Hessian.
After the reshape, batch_size can be 0 for empty expert/token batches. The divisions on Lines 1540 and 1544 then introduce NaNs/inf into hessian, and every later GPTQ update for that module is corrupted.
Proposed fix

```diff
 # Flatten to 2D (total_tokens, features) first, so batch_size counts tokens
 input_flat = input.reshape(-1, input.shape[-1]).t().float()
 batch_size = input_flat.shape[1]
+if batch_size == 0:
+    return hessian, n_samples
 # Incremental averaging: scale down old hessian
 hessian *= n_samples / (n_samples + batch_size)
```
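To see why the guard is sufficient, here is a scalar-feature version of the incremental update (illustrative, not the library code). It maintains H = (2/n) * sum(x_i^2) exactly, and an empty batch leaves the state untouched:

```python
def update_hessian(hessian, n_samples, batch):
    """Scalar-feature sketch of the incremental Hessian averaging.

    Mirrors the guarded update from the review: scale down the old
    estimate by n/(n+b), then add the new outer product scaled by 2/n.
    An empty batch returns (hessian, n_samples) unchanged, so no
    division by zero ever occurs.
    """
    batch_size = len(batch)
    if batch_size == 0:
        return hessian, n_samples
    hessian *= n_samples / (n_samples + batch_size)
    n_samples += batch_size
    # Scalar analogue of H += (2/n) * X @ X^T.
    hessian += (2 / n_samples) * sum(x * x for x in batch)
    return hessian, n_samples
```

After any sequence of batches, the result equals the direct formula (2/n) * sum of squares over all tokens seen so far, which is what makes the incremental form safe to checkpoint and resume.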
```python
gptq_handles = {name: GPTQHelper(m, name, offload_to_cpu=True) for name, m in quantized_layers}
for handle in gptq_handles.values():
    handle.setup()

print_rank_0(f"Computing Hessians for {len(gptq_handles)} linear layers...")

with disabled_weight_quantizers(model):
    forward_loop(model)

for handle in gptq_handles.values():
    handle.cleanup()
```
Always unpatch GPTQ forwards in a finally block.
If forward_loop(model) raises here, every patched linear keeps the Hessian wrapper installed and the model is left in a partially mutated state for the rest of the process.
Proposed fix

```diff
 gptq_handles = {name: GPTQHelper(m, name, offload_to_cpu=True) for name, m in quantized_layers}
-for handle in gptq_handles.values():
-    handle.setup()
-
-print_rank_0(f"Computing Hessians for {len(gptq_handles)} linear layers...")
-
-with disabled_weight_quantizers(model):
-    forward_loop(model)
-
-for handle in gptq_handles.values():
-    handle.cleanup()
+try:
+    for handle in gptq_handles.values():
+        handle.setup()
+
+    print_rank_0(f"Computing Hessians for {len(gptq_handles)} linear layers...")
+    with disabled_weight_quantizers(model):
+        forward_loop(model)
+finally:
+    for handle in gptq_handles.values():
+        handle.cleanup()
```
+ handle.cleanup()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelopt/torch/quantization/model_calib.py` around lines 1983 - 1994, The
code currently calls GPTQHelper.setup() for each handle then runs
forward_loop(model) but only calls GPTQHelper.cleanup() afterwards, so if
forward_loop raises the patched GPTQ forwards remain; wrap the forward_loop call
in a try/finally: after creating gptq_handles and calling handle.setup() on
each, enter the disabled_weight_quantizers(model) context and call
forward_loop(model) inside a try block, and in the finally iterate over
gptq_handles.values() and call handle.cleanup() to ensure GPTQHelper.cleanup()
always runs even on exceptions (referencing gptq_handles, GPTQHelper, setup,
cleanup, forward_loop, and disabled_weight_quantizers).
```python
@torch.no_grad()
def prepare_for_resume(
    self,
    resume_layer_idx: int,
    forward_loop: ForwardLoop,
):
    """Set up layer states for resuming sequential calibration from a checkpoint.

    Runs a single warm-up forward pass so that the next call to
    :meth:`get_input_activations` for ``resume_layer_idx`` produces the
    correct inputs. Layers ``0 .. K-2`` run in *original* mode (with
    ``capture_output_meta`` enabled so skip-mode metadata is populated),
    layer ``K-1`` in *capture* mode. After the pass, ``0 .. K-2`` switch
    to *skip* and ``K-1`` retains its ``collected_inputs`` for the
    subsequent *run* transition.
    """
    if not self._patched:
        raise RuntimeError(
            "prepare_for_resume() requires _patch_all_layers() to be called first."
        )
    if resume_layer_idx == 0:
        return

    k = resume_layer_idx
    preceding = range(k - 1)

    assert self._decoder_layers is not None
    for i in preceding:
        self._set_layer_mode(i, "original")
        self._decoder_layers[i]._seq_calib.capture_output_meta = True

    print_rank_0(
        f"Running warm-up forward pass for resume "
        f"(layers 0..{k - 2} original, layer {k - 1} capture)"
    )
    self._run_warmup_capture(k - 1, forward_loop)

    for i in preceding:
        self._decoder_layers[i]._seq_calib.capture_output_meta = False
        self._set_layer_mode(i, "skip")

    print_rank_0(f"Warm-up complete. Ready to resume from layer {k}.")
```
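The mode transitions the docstring describes can be sketched as a small bookkeeping function (warmup_modes is purely illustrative; the real class tracks modes per layer via _set_layer_mode):

```python
def warmup_modes(num_layers, k):
    """Illustrative sketch of layer modes around the warm-up pass for resume layer k."""
    # During the warm-up forward pass: layers 0..k-2 run in "original" mode
    # (with capture_output_meta set), layer k-1 runs in "capture" mode.
    during = ["original"] * (k - 1) + ["capture"] + [None] * (num_layers - k)
    # After the pass: 0..k-2 switch to "skip"; k-1 keeps its collected inputs.
    after = ["skip"] * (k - 1) + ["capture"] + [None] * (num_layers - k)
    return during, after


during, after = warmup_modes(num_layers=6, k=3)
assert during == ["original", "original", "capture", None, None, None]
assert after == ["skip", "skip", "capture", None, None, None]
```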
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
set -euo pipefail
echo "== Resume helper call sites =="
rg -nP --type=py '\b(detect_sequential_resume_layer|prepare_for_resume)\s*\('
echo
echo "== sequential_calibrate implementation =="
sed -n '1545,1615p' modelopt/torch/quantization/model_calib.py
```

Repository: NVIDIA/Model-Optimizer

🏁 Script executed:

```shell
#!/bin/bash
set -euo pipefail
echo "== Full sequential_calibrate function (lines 1545-1700) =="
wc -l modelopt/torch/quantization/model_calib.py
sed -n '1545,1700p' modelopt/torch/quantization/model_calib.py | head -200
echo
echo "== Search for any code calling sequential_calibrate =="
rg -nP --type=py -B2 -A2 'sequential_calibrate\s*\(' | head -50
echo
echo "== Search for imports or references to checkpoint resume in model_calib =="
rg -nP --type=py '(detect_sequential_resume_layer|prepare_for_resume|checkpoint|resume)' modelopt/torch/quantization/model_calib.py | head -20
```

🏁 Script executed:

```shell
#!/bin/bash
set -euo pipefail
echo "== Check if any test explicitly tests resume flow =="
rg -nP --type=py '(detect_sequential_resume_layer|prepare_for_resume)' tests/
echo
echo "== Check the full activation_collector.py context around prepare_for_resume =="
rg -nP --type=py -B5 -A25 'def prepare_for_resume' modelopt/torch/quantization/utils/activation_collector.py
echo
echo "== Check if there's any documentation about resume flow =="
find . -name "*.md" -o -name "*.rst" | xargs grep -l "sequential.*resume\|resume.*sequential" 2>/dev/null | head -5
```

🏁 Script executed:

```shell
#!/bin/bash
set -euo pipefail
echo "== Check activation_collector.py context around prepare_for_resume =="
sed -n '360,420p' modelopt/torch/quantization/utils/activation_collector.py
echo
echo "== Check if prepare_for_resume or detect_sequential_resume_layer are imported anywhere =="
rg -nP --type=py 'from.*activation_collector.*import|from.*checkpoint.*import' | grep -E '(prepare_for_resume|detect_sequential_resume_layer)' | head -20
echo
echo "== Check all imports of checkpoint module =="
rg -nP --type=py 'import.*checkpoint|from.*checkpoint' modelopt/torch/quantization/
echo
echo "== List all public/exported functions in checkpoint.py =="
sed -n '1,100p' modelopt/torch/quantization/utils/checkpoint.py
```

🏁 Script executed:

```shell
#!/bin/bash
set -euo pipefail
echo "== Check all calls to prepare_for_resume method (not just definition) =="
rg -nP --type=py 'prepare_for_resume\s*\(' --no-heading
echo
echo "== Check all calls to detect_sequential_resume_layer =="
rg -nP --type=py 'detect_sequential_resume_layer\s*\(' --no-heading
echo
echo "== Check if LayerActivationCollector is used outside of model_calib.py =="
rg -nP --type=py 'LayerActivationCollector' modelopt/torch/quantization/ | grep -v 'test_' | head -20
echo
echo "== Search for checkpoint module usage in sequential_calibrate context =="
rg -nP --type=py -B10 -A10 'checkpoint' modelopt/torch/quantization/model_calib.py | head -40
```
prepare_for_resume() is not integrated into sequential calibration.
The new resume helpers (prepare_for_resume() and detect_sequential_resume_layer()) are defined but never called anywhere in the codebase. The main sequential_calibrate() function iterates through all layers from start to end without detecting, loading, or reconstructing resume state. No code path—tests included—exercises resume functionality. This means resuming from a checkpoint will not rebuild the skip/run/capture layer states needed to continue from layer K.
To complete resume support, integrate these helpers into `sequential_calibrate()`:

- Call `detect_sequential_resume_layer()` to check for a saved `_seq_calib_progress` checkpoint.
- If resuming, call `input_getter.prepare_for_resume(resume_layer_idx, forward_loop)` before the main layer loop.
- Adjust the loop to start from `resume_layer_idx` instead of 0.
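The integration above can be sketched with stand-in callables (detect_resume_layer and prepare_for_resume mirror the helpers named in the review but are hypothetical here; the real signatures may differ):

```python
def sequential_calibrate(layers, detect_resume_layer, prepare_for_resume, forward_loop):
    """Sketch of a layer loop that honors a saved resume point."""
    processed = []
    # Check for a saved checkpoint; None/0 means start from scratch.
    resume_layer_idx = detect_resume_layer() or 0
    if resume_layer_idx > 0:
        # Rebuild skip/capture layer states with a warm-up pass.
        prepare_for_resume(resume_layer_idx, forward_loop)
    # Start from the resume point instead of 0.
    for idx in range(resume_layer_idx, len(layers)):
        processed.append(idx)  # stand-in for per-layer calibration
    return processed


layers = ["l0", "l1", "l2", "l3"]
done = sequential_calibrate(layers, lambda: 2, lambda k, fl: None, None)
assert done == [2, 3]  # layers 0 and 1 are skipped on resume
```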
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelopt/torch/quantization/utils/activation_collector.py` around lines 369 -
410, sequential_calibrate currently never checks or uses resume helpers so
resume state (_seq_calib_progress) and layer modes aren't rebuilt; update
sequential_calibrate to call detect_sequential_resume_layer() at start, and if
it returns a resume_layer_idx call
input_getter.prepare_for_resume(resume_layer_idx, forward_loop) before entering
the main layer loop, then set the loop's start index to resume_layer_idx
(instead of 0) so subsequent iterations continue from the resumed layer; ensure
you reference detect_sequential_resume_layer, input_getter.prepare_for_resume,
forward_loop, resume_layer_idx and the existing _seq_calib_progress handling
when adding this logic.
```python
@contextmanager
def disabled_weight_quantizers(model: nn.Module):
    """Disable weight quantizers during hessian collection."""
    disabled_modules = []
    for module in model.modules():
        if is_quantized_linear(module) and module.weight_quantizer.is_enabled:
            module.weight_quantizer.disable()
            disabled_modules.append(module)
    try:
        yield
    finally:
        for module in disabled_modules:
            module.weight_quantizer.enable()
```
Disable every weight quantizer, not just module.weight_quantizer.
This helper ignores modules that use non-standard weight attributes like gate_up_proj / down_proj, even though weight_attr_names() in the same file explicitly supports them. In GPTQ that means Hessian collection can still run with some weight quantizers enabled, which skews the update for those supported module types.
Proposed fix

```diff
 @contextmanager
 def disabled_weight_quantizers(model: nn.Module):
     """Disable weight quantizers during hessian collection."""
-    disabled_modules = []
+    disabled_quantizers = []
     for module in model.modules():
-        if is_quantized_linear(module) and module.weight_quantizer.is_enabled:
-            module.weight_quantizer.disable()
-            disabled_modules.append(module)
+        for weight_name in weight_attr_names(module):
+            quantizer_name = quantizer_attr_names(weight_name).weight_quantizer
+            quantizer = getattr(module, quantizer_name, None)
+            if quantizer is not None and quantizer.is_enabled:
+                quantizer.disable()
+                disabled_quantizers.append(quantizer)
     try:
         yield
     finally:
-        for module in disabled_modules:
-            module.weight_quantizer.enable()
+        for quantizer in disabled_quantizers:
+            quantizer.enable()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelopt/torch/quantization/utils/core_utils.py` around lines 713 - 726, The
helper disabled_weight_quantizers only toggles module.weight_quantizer and
misses non-standard weight attributes; update disabled_weight_quantizers to
iterate weight names from weight_attr_names(module) for each module (in addition
to checking is_quantized_linear(module)), retrieve each attr (e.g.,
getattr(module, attr_name, None)), if it has an is_enabled flag call its
disable() and record (module, attr_name) so the finally block can re-enable by
calling enable() on the same attribute's quantizer; ensure you still check
existence and is_enabled before disabling and re-check before enabling to avoid
AttributeError.
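The disable/re-enable pattern in the proposed fix can be demonstrated in isolation. FakeQuantizer, FakeModule, and the attribute naming below are illustrative stand-ins, not modelopt's real classes or helpers:

```python
from contextlib import contextmanager


class FakeQuantizer:
    """Stand-in for a weight quantizer with enable/disable toggles."""

    def __init__(self):
        self.is_enabled = True

    def disable(self):
        self.is_enabled = False

    def enable(self):
        self.is_enabled = True


class FakeModule:
    """Stand-in module holding one quantizer per weight attribute."""

    def __init__(self, weight_names):
        self._weight_names = weight_names  # mirrors weight_attr_names(module)
        for name in weight_names:
            setattr(self, f"{name}_weight_quantizer", FakeQuantizer())


@contextmanager
def disabled_weight_quantizers(modules):
    disabled = []
    for module in modules:
        for name in module._weight_names:
            quantizer = getattr(module, f"{name}_weight_quantizer", None)
            if quantizer is not None and quantizer.is_enabled:
                quantizer.disable()
                disabled.append(quantizer)
    try:
        yield
    finally:
        # re-enable exactly the quantizers we disabled, even on exceptions
        for quantizer in disabled:
            quantizer.enable()


mod = FakeModule(["gate_up_proj", "down_proj"])
with disabled_weight_quantizers([mod]):
    # both non-standard weight quantizers are off inside the context
    assert not mod.gate_up_proj_weight_quantizer.is_enabled
    assert not mod.down_proj_weight_quantizer.is_enabled
assert mod.down_proj_weight_quantizer.is_enabled  # restored on exit
```

Recording the quantizers themselves (rather than the modules) is what lets the finally block restore only what was actually disabled.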
```python
def _configs_available():
    try:
        import modelopt.torch.quantization as mtq

        return getattr(mtq, RTN_CFG_NAME, None) is not None
    except Exception:
        return False
```
Skip the test only when both presets are unavailable.
_configs_available() only checks RTN_CFG_NAME, but Line 57 also requires GPTQ_CFG_NAME. If RTN is registered and GPTQ is not, this test hard-fails instead of being skipped.
Proposed fix

```diff
 def _configs_available():
     try:
         import modelopt.torch.quantization as mtq
-        return getattr(mtq, RTN_CFG_NAME, None) is not None
+        return (
+            getattr(mtq, RTN_CFG_NAME, None) is not None
+            and getattr(mtq, GPTQ_CFG_NAME, None) is not None
+        )
     except Exception:
         return False
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/gpu/torch/quantization/test_gptq_vq.py` around lines 31 - 37, The
_configs_available() helper currently only checks for RTN_CFG_NAME and causes a
hard failure if GPTQ_CFG_NAME is missing; update _configs_available() to import
modelopt.torch.quantization and return True only when both getattr(mtq,
RTN_CFG_NAME, None) and getattr(mtq, GPTQ_CFG_NAME, None) are not None so the
test is skipped unless both presets (RTN_CFG_NAME and GPTQ_CFG_NAME) are
available; reference the existing function name _configs_available and the
symbols RTN_CFG_NAME and GPTQ_CFG_NAME when making the change.
What does this PR do?

Type of change: ?

Usage

```python
# Add a code snippet demonstrating how to use this
```

Testing

Before your PR is "Ready for review"

- Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
- CONTRIBUTING.md: ✅ / ❌ / N/A

Additional Information

Summary by CodeRabbit

Release Notes

New Features

Refactor