
GPTQ test#1179

Draft
sugunav14 wants to merge 48 commits into main from svelury/gptq-vq-f

Conversation

Contributor

@sugunav14 commented Apr 6, 2026

What does this PR do?

Type of change: ?

Usage

# Add a code snippet demonstrating how to use this

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

  • Is this change backward compatible?: ✅ / ❌ / N/A
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: ✅ / ❌ / N/A
  • Did you write any new necessary tests?: ✅ / ❌ / N/A
  • Did you update Changelog?: ✅ / ❌ / N/A

Additional Information

Summary by CodeRabbit

Release Notes

  • New Features

    • Added vLLM fake quantization export support
    • Added optional activation MSE measurement during quantization
    • Added optional perplexity evaluation after quantization
    • Added sequential calibration checkpointing and resume capabilities
    • Added dataset sampling skip parameter
  • Refactor

    • Reimplemented GPTQ quantization algorithm with Hessian-informed blockwise weight updates and improved calibration handling

Fridah-nv and others added 30 commits March 25, 2026 17:51
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>
…ntizer, NVFP4MSECalibrator (#849)

**Type of change:** ?

**Overview:** ?

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: Yes/No
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes/No

* **New Features**
  * Added NVFP4StaticQuantizer for improved 4-bit quantization with enhanced precision control
  * Introduced NVFP4MSECalibrator with flexible candidate generation for calibration optimization

* **Improvements**
  * Optimized GPU kernels for Hopper+ graphics cards with better performance
  * Extended Triton support to broader GPU compatibility
  * Enhanced backward compatibility for restoring previously quantized models

* **Tests**
  * Added comprehensive test coverage for new quantizers and calibration methods

---------

Signed-off-by: realAsma <akuriparambi@nvidia.com>
Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>
…FP4QTensor

Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>
@copy-pr-bot

copy-pr-bot bot commented Apr 6, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai bot commented Apr 6, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c4bc4b1b-b500-4183-a7ff-78616e0119b2

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

This PR replaces GPTQ-Lite with a new GPTQ implementation featuring Hessian-informed blockwise weight updates, adds sequential calibration checkpointing and resume capabilities, introduces activation MSE measurement and perplexity evaluation to the example quantization script, and extends dataset utilities and vLLM serving configurations to support these features.
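The Hessian-informed part of the new GPTQ path can be illustrated with a small, self-contained sketch of the running Hessian estimate that GPTQ-style calibrators maintain over calibration activations (H ≈ (2/N) Σᵢ xᵢxᵢᵀ). This is an illustrative numpy version, not the modelopt implementation; `update_hessian` and its signature are hypothetical, and it includes the empty-batch guard that a review comment below asks for.

```python
import numpy as np

def update_hessian(hessian, n_samples, batch):
    """Running estimate of H = (2 / N) * sum_i x_i x_i^T over calibration batches.

    `hessian` is (in_features, in_features); `batch` is (batch_size, in_features).
    Scale the old average down by n / (n + b), then add the scaled outer product
    of the new batch, as in standard GPTQ Hessian accumulation.
    """
    batch_size = batch.shape[0]
    if batch_size == 0:
        # Empty-batch guard: without it the n / (n + b) rescale and the
        # sqrt(2 / n) factor below can divide by zero.
        return hessian, n_samples
    hessian = hessian * (n_samples / (n_samples + batch_size))
    n_samples += batch_size
    scaled = batch.T * np.sqrt(2.0 / n_samples)  # (in_features, batch_size)
    return hessian + scaled @ scaled.T, n_samples
```

Feeding batches one at a time this way yields the same H as a single pass over all samples, which is what makes it usable inside a per-layer forward hook.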

Changes

| Cohort / File(s) | Summary |
|---|---|
| **GPTQ Core Implementation**<br>`modelopt/torch/quantization/config.py`, `modelopt/torch/quantization/mode.py`, `modelopt/torch/quantization/model_calib.py` | Replaced `GPTQLiteConfig` with `GPTQConfig` (method: `"gptq_lite"` → `"gptq"`), removed the `hessian_state_path` field, and added a comprehensive new `gptq()` function with a `GPTQHelper` class for per-layer Hessian collection, inverse preparation, and blockwise weight updates. Updated the mode descriptor and removed the old GPTQ helper functions. |
| **Sequential Calibration Infrastructure**<br>`modelopt/torch/quantization/utils/checkpoint.py`, `modelopt/torch/quantization/utils/activation_collector.py`, `modelopt/torch/quantization/utils/core_utils.py` | Introduced a new checkpoint module with progress tracking, saver registration, and resume detection. Enhanced the activation collector with a `prepare_for_resume()` API, stricter mode typing, and `output_meta` capture control. Added a `disabled_weight_quantizers()` context manager for temporary quantizer disabling. |
| **Supporting Quantization Changes**<br>`modelopt/torch/quantization/model_quant.py` | Removed the hard-coded GPTQ-Lite algorithm override in `get_auto_quantize_config`, allowing the algorithm config from `_get_auto_quantize_config` to propagate unchanged. |
| **Dataset Utilities**<br>`modelopt/torch/utils/dataset_utils.py` | Added a `skip_samples` parameter to `get_dataset_samples()` and `get_dataset_dataloader()` for skipping initial raw dataset entries; refactored sample collection logic to use a separate "collected" counter. |
| **Example: HF Quantization**<br>`examples/llm_ptq/hf_ptq.py` | Added MSE activation measurement via a new `_make_mse_holdout_dataloader()`, optional perplexity evaluation, a conditional vLLM fake-quant export path, dynamic quantization format resolution from the `mtq` namespace, and adjusted preview generation lengths. Introduced CLI flags: `--vllm_fakequant_export`, `--eval_perplexity`, `--eval_perplexity_seq_len`, `--measure_activation_mse`, `--activation_mse_max_samples`, `--activation_mse_save_dir`, `--activation_mse_input_path`, `--fold_weights`. |
| **vLLM Serving**<br>`examples/vllm_serve/fakequant_worker.py`, `examples/vllm_serve/vllm_serve_fakequant.py` | Added a `skip_fold_weight` configuration flag and a conditional weight-folding bypass in the calibration loop with per-module weight-quantizer disabling. Extended Ray environment variables to include `SKIP_FOLD_WEIGHT`. |
| **Tests**<br>`tests/gpu/torch/quantization/test_gptq.py`, `tests/gpu/torch/quantization/test_gptq_vq.py` | Updated `test_gptq.py` for the new GPTQ API with a public `gptq()` function, added export roundtrip validation, and changed the e2e config to use the new GPTQ format. Added `test_gptq_vq.py` comparing GPTQ vs RTN using PSX LUTS VQ presets with NMSE validation. |
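The `skip_samples` plus separate "collected"-counter behavior described for the dataset utilities above can be sketched as follows. `take_samples` is a hypothetical stand-in, not the actual `get_dataset_samples` signature; the filtering rule (dropping empty entries) is an assumption for illustration.

```python
def take_samples(raw_samples, num_samples, skip_samples=0):
    """Skip the first `skip_samples` raw entries, then gather `num_samples`
    usable ones, tracked with a separate "collected" counter."""
    if skip_samples < 0:
        raise ValueError("skip_samples must be >= 0")
    samples = []
    seen = 0
    collected = 0
    for text in raw_samples:
        if seen < skip_samples:
            seen += 1        # raw entries are skipped before any filtering
            continue
        if text:             # empty entries are dropped and do not count
            samples.append(text)
            collected += 1
        if collected == num_samples:
            break
    return samples
```

Keeping "seen" and "collected" separate is what makes resume offsets deterministic: the skip count always refers to raw dataset positions, regardless of how many entries were filtered out.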

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User as User/Script
    participant SeqCal as Sequential<br/>Calibrator
    participant ActivCol as Activation<br/>Collector
    participant CheckMgr as Checkpoint<br/>Manager
    participant Model as Quantized<br/>Model

    User->>SeqCal: Start sequential calibration<br/>(with checkpoint_dir)
    SeqCal->>CheckMgr: detect_sequential_resume_layer()
    alt Checkpoint exists
        CheckMgr-->>SeqCal: resume_layer_idx > 0
        SeqCal->>ActivCol: prepare_for_resume(resume_layer_idx)
        ActivCol->>Model: Warmup forward pass<br/>(set layer modes, capture output_meta)
    else No checkpoint
        CheckMgr-->>SeqCal: resume_layer_idx = 0
    end

    loop For each layer [resume_idx..total_layers)
        SeqCal->>ActivCol: Set layer capture mode
        SeqCal->>Model: Forward pass (collect activations)
        SeqCal->>SeqCal: Run calibration (gptq/rtn/etc)
        SeqCal->>Model: Update layer weights
        SeqCal->>CheckMgr: should_save_seq_calib_checkpoint()?
        alt Save interval reached
            CheckMgr->>CheckMgr: save_sequential_checkpoint()
            CheckMgr->>Model: Attach progress metadata
            CheckMgr->>CheckMgr: Invoke registered saver
        end
    end

    SeqCal-->>User: Calibration complete
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Security Anti-Patterns | ❌ Error | Pull request contains critical security anti-patterns: hardcoded `trust_remote_code=True` enabling arbitrary code execution, and `torch.load()` calls without the `weights_only=True` parameter, creating pickle deserialization vulnerabilities. | Replace hardcoded `trust_remote_code=True` with configurable parameters defaulting to `False`; add explicit `weights_only=True` to `torch.load()` calls; validate that `amax_file_path` sources are trusted and document accordingly. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 76.27%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title "GPTQ test" is vague and generic and does not convey meaningful information about the actual changeset scope. | Use a more descriptive title that captures the main changes, such as: "Refactor GPTQ to sequential calibration with Hessian-informed weight updates and perplexity evaluation". |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
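For context on the Security Anti-Patterns failure: `torch.load(..., weights_only=False)` is flagged because it goes through pickle, and unpickling untrusted bytes can execute arbitrary code. The hazard is easy to reproduce with plain `pickle` from the standard library (no torch needed); the payload below is deliberately benign and only calls `os.getcwd()`.

```python
import json
import pickle

class Payload:
    """Deliberately benign stand-in for a malicious pickle payload."""
    def __reduce__(self):
        # Unpickling calls eval(...) -- this is arbitrary code execution.
        return (eval, ("__import__('os').getcwd()",))

malicious_bytes = pickle.dumps(Payload())
result = pickle.loads(malicious_bytes)  # eval runs here, returning a path string

# A data-only format round-trips values without any code execution:
safe = json.loads(json.dumps({"weights": [1, 2, 3]}))
```

This is why the check asks for `weights_only=True` (which restricts deserialization to tensor data) and for `trust_remote_code` to be an explicit opt-in rather than a hardcoded default.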

Comment @coderabbitai help to get the list of available commands and usage tips.

@sugunav14 changed the title from Svelury/gptq vq f to GPTQ test on Apr 6, 2026
Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>
Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>
Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>
Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>
@sugunav14 marked this pull request as ready for review April 8, 2026 23:42
@sugunav14 requested review from a team as code owners April 8, 2026 23:42
@sugunav14 requested a review from meenchen April 8, 2026 23:42
@sugunav14 marked this pull request as draft April 8, 2026 23:42
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
modelopt/torch/utils/dataset_utils.py (1)

221-253: ⚠️ Potential issue | 🟠 Major

skip_samples is currently ignored for .jsonl datasets (and negative values are not validated).

The early return on Line 252-253 bypasses the new skip_samples behavior, so resume/offset behavior is inconsistent for JSONL inputs.

Proposed fix

```diff
 def get_dataset_samples(
     dataset_name: str,
     num_samples: int,
@@
     skip_samples: int = 0,
 ) -> list[str]:
@@
+    if skip_samples < 0:
+        raise ValueError("skip_samples must be >= 0")
+
     # Local JSONL file path support (each line is a JSON object with a `text` field).
     if dataset_name.endswith(".jsonl"):
-        return get_jsonl_text_samples(dataset_name, num_samples, key="text")
+        return get_jsonl_text_samples(
+            dataset_name, num_samples + skip_samples, key="text"
+        )[skip_samples:]
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/utils/dataset_utils.py` around lines 221 - 253, The .jsonl
early-return ignores skip_samples and lacks validation; update the branch where
dataset_name.endswith(".jsonl") to call get_jsonl_text_samples with skip_samples
applied (i.e., pass the skip/offset so it skips the requested number of records)
and validate skip_samples is non-negative before use (raise or clamp on
negative). Modify the logic in the function that contains the
dataset_name.endswith(".jsonl") check (and any helper call to
get_jsonl_text_samples) so JSONL inputs honor the same skip_samples semantics as
other dataset types.
tests/gpu/torch/quantization/test_gptq.py (1)

237-259: ⚠️ Potential issue | 🟠 Major

This turns the GPU e2e test into a very heavy job.

use_sequential=True makes GPTQ replay full-model forwards per decoder layer, and this test still calibrates TinyLlama with 512 samples while being parametrized over three configs. That is a large jump in runtime and flakiness for a tests/gpu/ check. Please shrink this to a smoke-sized calibration set/model or move the full sequential path behind a slower opt-in test.

As per coding guidelines, "GPU-based unit tests for core ModelOpt library should be placed in tests/gpu/ and run in a few seconds".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/gpu/torch/quantization/test_gptq.py` around lines 237 - 259, The test
config currently sets quant_cfg["algorithm"] = {"method": "gptq",
"use_sequential": True} and builds a calibration dataloader with
get_dataset_dataloader(num_samples=512), which makes the GPU e2e very heavy;
change the test to a smoke-sized run by setting use_sequential to False or
reducing the calibration dataloader sample size (e.g., num_samples -> 8 or
another small value) and/or using a tiny model fixture so the
create_forward_loop(dataloader=calib_dataloader) + mtq.quantize(model,
quant_cfg, forward_loop=calibrate_loop) executes quickly; alternatively move the
full sequential path behind a marked slow/opt-in test. Ensure you update the
references to quant_cfg["algorithm"], get_dataset_dataloader(...,
num_samples=...), and the mtq.quantize(...) invocation accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/llm_ptq/hf_ptq.py`:
- Around line 726-736: The vLLM fakequant export branch omits the MTP
extra-state, causing incomplete checkpoints for models using MTP; update the
branch where args.vllm_fakequant_export triggers
export_hf_vllm_fq_checkpoint(full_model, export_dir=export_path) to also pass
the mtp_state_dict returned by load_mtp_weights (same as the non-fakequant
branch passes extra_state_dict=mtp_state_dict to export_hf_checkpoint), i.e.,
modify the call to include the extra_state_dict/mtp_state_dict so
export_hf_vllm_fq_checkpoint receives and writes the MTP weights alongside
full_model and export_path.
- Around line 1088-1102: The plugin-config export branch that uses
full_model.save_pretrained() / export_hf_vllm_fq_checkpoint (when args.qformat
not in QUANT_CFG_CHOICES and hasattr(mtq, args.qformat)) skips the post-export
cleanup done by export_quantized(): restore the tokenizer's original
padding/token settings and copy any custom model files/configs required for
trust_remote_code models. Fix by invoking the same post-export cleanup steps
used by export_quantized() after saving (or by calling a shared helper): restore
the tokenizer's original padding/pad_token/padding_side state and then copy over
any custom/model config files into export_path so the artifact matches the
standard export path; ensure this runs for both the export_hf_vllm_fq_checkpoint
and full_model.save_pretrained branches before tokenizer.save_pretrained.
- Around line 1075-1086: The MSE logging is run on the unfused fake-quant model
because folding is only applied later for perplexity; when args.fold_weights is
set and activation MSE measurement is requested, call
mtq.fold_weight(language_model) before invoking
mse_logger.finish(language_model, mse_data) (i.e., apply folding to the same
language_model used for mse collection), then proceed to delete
mse_logger/mse_data and empty the CUDA cache; keep the existing
folding+perplexity block but ensure folding happens earlier when
args.measure_activation_mse (or equivalent flag) is true.

In `@modelopt/torch/quantization/config.py`:
- Around line 1324-1335: The ModeloptField for block_size currently allows zero
and negative values; add validation by setting gt=0 on the block_size
ModeloptField so config schema rejects non-positive sizes, and where group_size
is available (e.g., in the GPTQ config validation path or an initializer that
uses block_size and group_size) add a check that block_size % group_size == 0
and raise a clear validation error; update references around the block_size
ModeloptField and any GPTQ config validator that uses group_size to enforce this
multiple constraint.

In `@modelopt/torch/quantization/model_calib.py`:
- Around line 1983-1994: The code currently calls GPTQHelper.setup() for each
handle then runs forward_loop(model) but only calls GPTQHelper.cleanup()
afterwards, so if forward_loop raises the patched GPTQ forwards remain; wrap the
forward_loop call in a try/finally: after creating gptq_handles and calling
handle.setup() on each, enter the disabled_weight_quantizers(model) context and
call forward_loop(model) inside a try block, and in the finally iterate over
gptq_handles.values() and call handle.cleanup() to ensure GPTQHelper.cleanup()
always runs even on exceptions (referencing gptq_handles, GPTQHelper, setup,
cleanup, forward_loop, and disabled_weight_quantizers).
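The try/finally shape that this comment asks for can be sketched with toy stand-ins. `FakeHelper` and `run_calibration` below are hypothetical stubs, not the real modelopt `GPTQHelper` API; the point is only that every handle that was set up gets cleaned up even when the forward loop raises.

```python
class FakeHelper:
    """Stand-in for GPTQHelper: setup() patches a forward, cleanup() restores it."""
    def __init__(self):
        self.patched = False
    def setup(self):
        self.patched = True
    def cleanup(self):
        self.patched = False

def run_calibration(handles, forward_loop):
    # Set up all handles first, then guarantee cleanup with try/finally
    # so patched forwards never leak past a failed calibration run.
    for handle in handles.values():
        handle.setup()
    try:
        forward_loop()
    finally:
        for handle in handles.values():
            handle.cleanup()

handles = {"layer0": FakeHelper(), "layer1": FakeHelper()}

def failing_loop():
    raise RuntimeError("calibration failed")

try:
    run_calibration(handles, failing_loop)
except RuntimeError:
    pass

assert all(not h.patched for h in handles.values())  # patched forwards restored
```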
- Around line 1535-1545: After flattening to input_flat and computing
batch_size, guard the Hessian update against empty batches: if batch_size == 0,
skip the incremental averaging and outer-product steps entirely (do not modify
n_samples or hessian). Specifically, before the lines that scale down hessian
(hessian *= n_samples / (n_samples + batch_size)) and compute scaled_input and
hessian.add_((scaled_input @ scaled_input.t())...), add a check on batch_size
and return/continue from the surrounding function/block so no division or sqrt
by zero occurs and hessian remains unchanged.

In `@modelopt/torch/quantization/utils/activation_collector.py`:
- Around line 369-410: sequential_calibrate currently never checks or uses
resume helpers so resume state (_seq_calib_progress) and layer modes aren't
rebuilt; update sequential_calibrate to call detect_sequential_resume_layer() at
start, and if it returns a resume_layer_idx call
input_getter.prepare_for_resume(resume_layer_idx, forward_loop) before entering
the main layer loop, then set the loop's start index to resume_layer_idx
(instead of 0) so subsequent iterations continue from the resumed layer; ensure
you reference detect_sequential_resume_layer, input_getter.prepare_for_resume,
forward_loop, resume_layer_idx and the existing _seq_calib_progress handling
when adding this logic.

In `@modelopt/torch/quantization/utils/core_utils.py`:
- Around line 713-726: The helper disabled_weight_quantizers only toggles
module.weight_quantizer and misses non-standard weight attributes; update
disabled_weight_quantizers to iterate weight names from
weight_attr_names(module) for each module (in addition to checking
is_quantized_linear(module)), retrieve each attr (e.g., getattr(module,
attr_name, None)), if it has an is_enabled flag call its disable() and record
(module, attr_name) so the finally block can re-enable by calling enable() on
the same attribute's quantizer; ensure you still check existence and is_enabled
before disabling and re-check before enabling to avoid AttributeError.
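The fix described in this comment, disabling every weight-quantizer attribute a module exposes and re-enabling only what was actually disabled, can be sketched as a context manager over toy classes. `Quantizer` and the attribute names are illustrative stand-ins, not the modelopt types.

```python
from contextlib import contextmanager

class Quantizer:
    """Toy quantizer with the enable/disable surface the fix relies on."""
    def __init__(self):
        self.is_enabled = True
    def disable(self):
        self.is_enabled = False
    def enable(self):
        self.is_enabled = True

@contextmanager
def disabled_weight_quantizers(modules, weight_attr_names):
    """Disable each listed weight-quantizer attribute that exists and is
    enabled, recording exactly those so only they are re-enabled on exit."""
    disabled = []
    try:
        for module in modules:
            for attr_name in weight_attr_names:
                quantizer = getattr(module, attr_name, None)
                if quantizer is not None and quantizer.is_enabled:
                    quantizer.disable()
                    disabled.append(quantizer)
        yield
    finally:
        for quantizer in disabled:
            quantizer.enable()

class Module:
    pass

m = Module()
m.weight_quantizer = Quantizer()
with disabled_weight_quantizers([m], ["weight_quantizer", "fused_weight_quantizer"]):
    assert not m.weight_quantizer.is_enabled  # disabled inside the context
assert m.weight_quantizer.is_enabled          # restored on exit
```

Recording the disabled quantizers (rather than blindly re-enabling everything) matters: a quantizer that was already disabled before entering the context stays disabled afterwards.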

In `@tests/gpu/torch/quantization/test_gptq_vq.py`:
- Around line 31-37: The _configs_available() helper currently only checks for
RTN_CFG_NAME and causes a hard failure if GPTQ_CFG_NAME is missing; update
_configs_available() to import modelopt.torch.quantization and return True only
when both getattr(mtq, RTN_CFG_NAME, None) and getattr(mtq, GPTQ_CFG_NAME, None)
are not None so the test is skipped unless both presets (RTN_CFG_NAME and
GPTQ_CFG_NAME) are available; reference the existing function name
_configs_available and the symbols RTN_CFG_NAME and GPTQ_CFG_NAME when making
the change.

---

Outside diff comments:
In `@modelopt/torch/utils/dataset_utils.py`:
- Around line 221-253: The .jsonl early-return ignores skip_samples and lacks
validation; update the branch where dataset_name.endswith(".jsonl") to call
get_jsonl_text_samples with skip_samples applied (i.e., pass the skip/offset so
it skips the requested number of records) and validate skip_samples is
non-negative before use (raise or clamp on negative). Modify the logic in the
function that contains the dataset_name.endswith(".jsonl") check (and any helper
call to get_jsonl_text_samples) so JSONL inputs honor the same skip_samples
semantics as other dataset types.

In `@tests/gpu/torch/quantization/test_gptq.py`:
- Around line 237-259: The test config currently sets quant_cfg["algorithm"] =
{"method": "gptq", "use_sequential": True} and builds a calibration dataloader
with get_dataset_dataloader(num_samples=512), which makes the GPU e2e very
heavy; change the test to a smoke-sized run by setting use_sequential to False
or reducing the calibration dataloader sample size (e.g., num_samples -> 8 or
another small value) and/or using a tiny model fixture so the
create_forward_loop(dataloader=calib_dataloader) + mtq.quantize(model,
quant_cfg, forward_loop=calibrate_loop) executes quickly; alternatively move the
full sequential path behind a marked slow/opt-in test. Ensure you update the
references to quant_cfg["algorithm"], get_dataset_dataloader(...,
num_samples=...), and the mtq.quantize(...) invocation accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1ca32742-d698-44d0-a52b-271964afcc02

📥 Commits

Reviewing files that changed from the base of the PR and between df80a0f and c705c24.

📒 Files selected for processing (13)
  • examples/llm_ptq/hf_ptq.py
  • examples/vllm_serve/fakequant_worker.py
  • examples/vllm_serve/vllm_serve_fakequant.py
  • modelopt/torch/quantization/config.py
  • modelopt/torch/quantization/mode.py
  • modelopt/torch/quantization/model_calib.py
  • modelopt/torch/quantization/model_quant.py
  • modelopt/torch/quantization/utils/activation_collector.py
  • modelopt/torch/quantization/utils/checkpoint.py
  • modelopt/torch/quantization/utils/core_utils.py
  • modelopt/torch/utils/dataset_utils.py
  • tests/gpu/torch/quantization/test_gptq.py
  • tests/gpu/torch/quantization/test_gptq_vq.py
💤 Files with no reviewable changes (1)
  • modelopt/torch/quantization/model_quant.py

Comment on lines +726 to +736
```python
if args.vllm_fakequant_export:
    export_hf_vllm_fq_checkpoint(
        full_model,
        export_dir=export_path,
    )
else:
    export_hf_checkpoint(
        full_model,
        export_dir=export_path,
        extra_state_dict=mtp_state_dict,
    )
```
Contributor

⚠️ Potential issue | 🟠 Major

The vLLM fakequant export path drops the extra MTP weights.

load_mtp_weights() returns mtp_state_dict specifically for weights that are not in the normal model state dict, and the regular HF export passes that through extra_state_dict. This new branch ignores it, so models with MTP layers can export incomplete checkpoints.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/llm_ptq/hf_ptq.py` around lines 726 - 736, The vLLM fakequant export
branch omits the MTP extra-state, causing incomplete checkpoints for models
using MTP; update the branch where args.vllm_fakequant_export triggers
export_hf_vllm_fq_checkpoint(full_model, export_dir=export_path) to also pass
the mtp_state_dict returned by load_mtp_weights (same as the non-fakequant
branch passes extra_state_dict=mtp_state_dict to export_hf_checkpoint), i.e.,
modify the call to include the extra_state_dict/mtp_state_dict so
export_hf_vllm_fq_checkpoint receives and writes the MTP weights alongside
full_model and export_path.

Comment on lines +1075 to +1086
```python
if mse_logger is not None:
    mse_logger.finish(language_model, mse_data)
    del mse_logger, mse_data
    torch.cuda.empty_cache()

if args.eval_perplexity and tokenizer is not None:
    if args.fold_weights:
        print("Folding weights before perplexity evaluation...")
        mtq.fold_weight(language_model)
    eval_data = get_wikitext2(tokenizer, args.eval_perplexity_seq_len)
    ppl = compute_perplexity(full_model, eval_data)
    print(f"Wikitext-2 perplexity: {ppl:.2f}")
```
Contributor

⚠️ Potential issue | 🟠 Major

--fold_weights is wired to perplexity, not activation-MSE collection.

The flag/help text says folding should happen before collecting the quantized activations, but mse_logger.finish() still runs on the unfused fake-quant model. With --measure_activation_mse --fold_weights, the flag currently has no effect on the MSE pass.

Proposed fix

```diff
     if mse_logger is not None:
+        if args.fold_weights:
+            mtq.fold_weight(language_model)
         mse_logger.finish(language_model, mse_data)
         del mse_logger, mse_data
         torch.cuda.empty_cache()

     if args.eval_perplexity and tokenizer is not None:
-        if args.fold_weights:
+        if args.fold_weights:
             print("Folding weights before perplexity evaluation...")
-            mtq.fold_weight(language_model)
         eval_data = get_wikitext2(tokenizer, args.eval_perplexity_seq_len)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/llm_ptq/hf_ptq.py` around lines 1075 - 1086, The MSE logging is run
on the unfused fake-quant model because folding is only applied later for
perplexity; when args.fold_weights is set and activation MSE measurement is
requested, call mtq.fold_weight(language_model) before invoking
mse_logger.finish(language_model, mse_data) (i.e., apply folding to the same
language_model used for mse collection), then proceed to delete
mse_logger/mse_data and empty the CUDA cache; keep the existing
folding+perplexity block but ensure folding happens earlier when
args.measure_activation_mse (or equivalent flag) is true.

Comment on lines +1088 to +1102
```python
# Plugin-registered configs (e.g. PSX LUTS from modelopt-internal) are not exportable
# via the standard TRT-LLM / HF export paths. Fall back to save_pretrained().
if args.qformat not in QUANT_CFG_CHOICES and hasattr(mtq, args.qformat):
    export_path = args.export_path
    if args.vllm_fakequant_export:
        print(f"Exporting vLLM fakequant checkpoint (bf16 weights + amax) to: {export_path}")
        export_hf_vllm_fq_checkpoint(full_model, export_dir=export_path)
    else:
        print(
            f"qformat '{args.qformat}' is a plugin-registered config and is not exportable "
            f"via the standard export pipeline. Saving with save_pretrained() instead."
        )
        full_model.save_pretrained(export_path)
    if tokenizer is not None:
        tokenizer.save_pretrained(export_path)
```
Contributor

⚠️ Potential issue | 🟠 Major

This plugin-config export bypass skips the normal artifact cleanup.

By bypassing export_quantized(), this branch never restores the tokenizer's original padding settings and never copies custom model files/configs. That makes the saved artifact materially different from the standard export path, especially for trust_remote_code models.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/llm_ptq/hf_ptq.py` around lines 1088 - 1102, The plugin-config
export branch that uses full_model.save_pretrained() /
export_hf_vllm_fq_checkpoint (when args.qformat not in QUANT_CFG_CHOICES and
hasattr(mtq, args.qformat)) skips the post-export cleanup done by
export_quantized(): restore the tokenizer's original padding/token settings and
copy any custom model files/configs required for trust_remote_code models. Fix
by invoking the same post-export cleanup steps used by export_quantized() after
saving (or by calling a shared helper): restore the tokenizer's original
padding/pad_token/padding_side state and then copy over any custom/model config
files into export_path so the artifact matches the standard export path; ensure
this runs for both the export_hf_vllm_fq_checkpoint and
full_model.save_pretrained branches before tokenizer.save_pretrained.
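A minimal sketch of the shared cleanup helper this prompt describes. The function names and the exact set of attributes to snapshot are assumptions for illustration, not the repo's actual API; the real `export_quantized()` path may track additional state:

```python
def snapshot_tokenizer_state(tokenizer):
    """Record padding-related attributes that the export path may mutate."""
    return {
        "padding_side": getattr(tokenizer, "padding_side", None),
        "pad_token": getattr(tokenizer, "pad_token", None),
    }


def restore_tokenizer_state(tokenizer, state):
    """Restore the attributes recorded by snapshot_tokenizer_state()."""
    for attr, value in state.items():
        if value is not None:
            setattr(tokenizer, attr, value)
```

Calling `snapshot_tokenizer_state()` before export and `restore_tokenizer_state()` after both the `export_hf_vllm_fq_checkpoint` and `save_pretrained` branches would keep the plugin path consistent with the standard one.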

Comment on lines +1324 to 1335
     percdamp: float = ModeloptField(
         default=0.01,
         gt=0.0,
         le=1.0,
         title="Percentage damping factor.",
         description="The percentage of average Hessian diagonal used for damping.",
     )
-    block_size: int | None = ModeloptField(
+    block_size: int = ModeloptField(
         default=128,
         title="Block size for GPTQ weight update.",
         description="""The block size for GPTQ weight update, which must be a multiple of the
         group_size used in the quantization.""",

⚠️ Potential issue | 🟡 Minor

Validate block_size at the config boundary.

block_size now accepts 0 and negative values, so an invalid GPTQ config can get through schema validation and only fail later inside the blockwise update path. Please add at least gt=0 here, and ideally enforce the documented group-size multiple constraint wherever that information is available.

Proposed fix
     block_size: int = ModeloptField(
         default=128,
+        gt=0,
         title="Block size for GPTQ weight update.",
         description="""The block size for GPTQ weight update, which must be a multiple of the
         group_size used in the quantization.""",
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/quantization/config.py` around lines 1324 - 1335, The
ModeloptField for block_size currently allows zero and negative values; add
validation by setting gt=0 on the block_size ModeloptField so config schema
rejects non-positive sizes, and where group_size is available (e.g., in the GPTQ
config validation path or an initializer that uses block_size and group_size)
add a check that block_size % group_size == 0 and raise a clear validation
error; update references around the block_size ModeloptField and any GPTQ config
validator that uses group_size to enforce this multiple constraint.
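The two constraints the review asks for can be expressed as a plain validation function; this is a stand-in sketch (`validate_gptq_block_config` is a hypothetical name), since the real enforcement would live in the `ModeloptField` bounds and a pydantic validator:

```python
def validate_gptq_block_config(block_size: int, group_size: int) -> None:
    """Reject block sizes the GPTQ blockwise update path cannot handle.

    Mirrors the review's two constraints: block_size must be positive
    (the gt=0 bound) and a multiple of group_size when group_size applies.
    """
    if block_size <= 0:
        raise ValueError(f"block_size must be positive, got {block_size}")
    if group_size > 0 and block_size % group_size != 0:
        raise ValueError(
            f"block_size ({block_size}) must be a multiple of "
            f"group_size ({group_size})"
        )
```

Failing at the config boundary like this surfaces the error at load time rather than deep inside the blockwise update loop.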

Comment on lines +1535 to 1545
     # Flatten to 2D (total_tokens, features) first, so batch_size counts tokens
     input_flat = input.reshape(-1, input.shape[-1]).t().float()
     batch_size = input_flat.shape[1]

     # Incremental averaging: scale down old hessian
     hessian *= n_samples / (n_samples + batch_size)
     n_samples += batch_size

     # Compute outer product: H += (2/n_samples) * X @ X^T
     # where X is the flattened input reshaped to (features, batch*seq)
-    input_flat = input.reshape(-1, input.shape[-1]).t().float()
     scaled_input = math.sqrt(2 / n_samples) * input_flat
     hessian.add_((scaled_input @ scaled_input.t()).to(hessian.device))

⚠️ Potential issue | 🟠 Major

Ignore empty activation batches before updating the Hessian.

After the reshape, batch_size can be 0 for empty expert/token batches. The divisions on Lines 1540 and 1544 then introduce NaNs/inf into hessian, and every later GPTQ update for that module is corrupted.

Proposed fix
     # Flatten to 2D (total_tokens, features) first, so batch_size counts tokens
     input_flat = input.reshape(-1, input.shape[-1]).t().float()
     batch_size = input_flat.shape[1]
+    if batch_size == 0:
+        return hessian, n_samples
 
     # Incremental averaging: scale down old hessian
     hessian *= n_samples / (n_samples + batch_size)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/quantization/model_calib.py` around lines 1535 - 1545, After
flattening to input_flat and computing batch_size, guard the Hessian update
against empty batches: if batch_size == 0, skip the incremental averaging and
outer-product steps entirely (do not modify n_samples or hessian). Specifically,
before the lines that scale down hessian (hessian *= n_samples / (n_samples +
batch_size)) and compute scaled_input and hessian.add_((scaled_input @
scaled_input.t())...), add a check on batch_size and return/continue from the
surrounding function/block so no division or sqrt by zero occurs and hessian
remains unchanged.
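The incremental-averaging scheme (and why an empty batch corrupts it) can be checked with a scalar stand-in for the Hessian, so no tensor library is needed. This is a sketch of the update rule only; the empty-batch guard is the one from the proposed fix:

```python
import math


def update_hessian(hessian, n_samples, batch):
    """Incremental update of H = (2/N) * sum_i x_i * x_i for scalar features.

    `hessian` is a scalar here (a 1x1 Hessian) so the sketch runs without
    torch; the structure matches the reviewed code: scale down the old
    estimate, then add the scaled outer product of the new batch.
    """
    batch_size = len(batch)
    if batch_size == 0:  # empty expert/token batch: leave state untouched
        return hessian, n_samples
    hessian *= n_samples / (n_samples + batch_size)
    n_samples += batch_size
    scale = math.sqrt(2 / n_samples)
    for x in batch:
        hessian += (scale * x) ** 2
    return hessian, n_samples
```

Without the guard, `n_samples == 0` with an empty first batch would divide by zero inside `math.sqrt(2 / n_samples)` and poison every later update, which is exactly the failure mode the review describes.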

Comment on lines +1983 to +1994
     gptq_handles = {name: GPTQHelper(m, name, offload_to_cpu=True) for name, m in quantized_layers}
     for handle in gptq_handles.values():
         handle.setup()

-    calib_func(layer, _layer_forward_loop, **calib_kwargs)
     print_rank_0(f"Computing Hessians for {len(gptq_handles)} linear layers...")

-    del layer_inputs
-    torch.cuda.empty_cache()
-    finally:
-        input_getter._unpatch_all_layers()
     with disabled_weight_quantizers(model):
         forward_loop(model)

     for handle in gptq_handles.values():
         handle.cleanup()


⚠️ Potential issue | 🟠 Major

Always unpatch GPTQ forwards in a finally block.

If forward_loop(model) raises here, every patched linear keeps the Hessian wrapper installed and the model is left in a partially mutated state for the rest of the process.

Proposed fix
     gptq_handles = {name: GPTQHelper(m, name, offload_to_cpu=True) for name, m in quantized_layers}
-    for handle in gptq_handles.values():
-        handle.setup()
-
-    print_rank_0(f"Computing Hessians for {len(gptq_handles)} linear layers...")
-
-    with disabled_weight_quantizers(model):
-        forward_loop(model)
-
-    for handle in gptq_handles.values():
-        handle.cleanup()
+    try:
+        for handle in gptq_handles.values():
+            handle.setup()
+
+        print_rank_0(f"Computing Hessians for {len(gptq_handles)} linear layers...")
+        with disabled_weight_quantizers(model):
+            forward_loop(model)
+    finally:
+        for handle in gptq_handles.values():
+            handle.cleanup()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/quantization/model_calib.py` around lines 1983 - 1994, The
code currently calls GPTQHelper.setup() for each handle then runs
forward_loop(model) but only calls GPTQHelper.cleanup() afterwards, so if
forward_loop raises the patched GPTQ forwards remain; wrap the forward_loop call
in a try/finally: after creating gptq_handles and calling handle.setup() on
each, enter the disabled_weight_quantizers(model) context and call
forward_loop(model) inside a try block, and in the finally iterate over
gptq_handles.values() and call handle.cleanup() to ensure GPTQHelper.cleanup()
always runs even on exceptions (referencing gptq_handles, GPTQHelper, setup,
cleanup, forward_loop, and disabled_weight_quantizers).
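The try/finally pattern the review proposes can also be packaged as a context manager, which additionally handles a failure partway through `setup()` (only the handles that were actually installed get cleaned up). The `setup()`/`cleanup()` interface below mirrors the `GPTQHelper` named in the review and is an assumption, not the verified class:

```python
from contextlib import contextmanager


@contextmanager
def gptq_patched(handles):
    """Install Hessian-collection wrappers, guaranteeing removal on exit."""
    installed = []
    try:
        for h in handles:
            h.setup()
            installed.append(h)  # track so a mid-setup failure unwinds cleanly
        yield
    finally:
        for h in installed:
            h.cleanup()
```

Usage would wrap the calibration pass, e.g. `with gptq_patched(gptq_handles.values()): forward_loop(model)`, so an exception in `forward_loop` can no longer leave patched forwards behind.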

Comment on lines +369 to +410
@torch.no_grad()
def prepare_for_resume(
self,
resume_layer_idx: int,
forward_loop: ForwardLoop,
):
"""Set up layer states for resuming sequential calibration from a checkpoint.

Runs a single warm-up forward pass so that the next call to
:meth:`get_input_activations` for ``resume_layer_idx`` produces the
correct inputs. Layers ``0 .. K-2`` run in *original* mode (with
``capture_output_meta`` enabled so skip-mode metadata is populated),
layer ``K-1`` in *capture* mode. After the pass, ``0 .. K-2`` switch
to *skip* and ``K-1`` retains its ``collected_inputs`` for the
subsequent *run* transition.
"""
if not self._patched:
raise RuntimeError(
"prepare_for_resume() requires _patch_all_layers() to be called first."
)
if resume_layer_idx == 0:
return

k = resume_layer_idx
preceding = range(k - 1)

assert self._decoder_layers is not None
for i in preceding:
self._set_layer_mode(i, "original")
self._decoder_layers[i]._seq_calib.capture_output_meta = True

print_rank_0(
f"Running warm-up forward pass for resume "
f"(layers 0..{k - 2} original, layer {k - 1} capture)"
)
self._run_warmup_capture(k - 1, forward_loop)

for i in preceding:
self._decoder_layers[i]._seq_calib.capture_output_meta = False
self._set_layer_mode(i, "skip")

print_rank_0(f"Warm-up complete. Ready to resume from layer {k}.")

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Resume helper call sites =="
rg -nP --type=py '\b(detect_sequential_resume_layer|prepare_for_resume)\s*\('

echo
echo "== sequential_calibrate implementation =="
sed -n '1545,1615p' modelopt/torch/quantization/model_calib.py

Repository: NVIDIA/Model-Optimizer



🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Full sequential_calibrate function (lines 1545-1700) =="
wc -l modelopt/torch/quantization/model_calib.py
sed -n '1545,1700p' modelopt/torch/quantization/model_calib.py | head -200

echo
echo "== Search for any code calling sequential_calibrate =="
rg -nP --type=py -B2 -A2 'sequential_calibrate\s*\(' | head -50

echo
echo "== Search for imports or references to checkpoint resume in model_calib =="
rg -nP --type=py '(detect_sequential_resume_layer|prepare_for_resume|checkpoint|resume)' modelopt/torch/quantization/model_calib.py | head -20



🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Check if any test explicitly tests resume flow =="
rg -nP --type=py '(detect_sequential_resume_layer|prepare_for_resume)' tests/

echo
echo "== Check the full activation_collector.py context around prepare_for_resume =="
rg -nP --type=py -B5 -A25 'def prepare_for_resume' modelopt/torch/quantization/utils/activation_collector.py

echo
echo "== Check if there's any documentation about resume flow =="
find . -name "*.md" -o -name "*.rst" | xargs grep -l "sequential.*resume\|resume.*sequential" 2>/dev/null | head -5



🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Check activation_collector.py context around prepare_for_resume =="
sed -n '360,420p' modelopt/torch/quantization/utils/activation_collector.py

echo
echo "== Check if prepare_for_resume or detect_sequential_resume_layer are imported anywhere =="
rg -nP --type=py 'from.*activation_collector.*import|from.*checkpoint.*import' | grep -E '(prepare_for_resume|detect_sequential_resume_layer)' | head -20

echo
echo "== Check all imports of checkpoint module =="
rg -nP --type=py 'import.*checkpoint|from.*checkpoint' modelopt/torch/quantization/

echo
echo "== List all public/exported functions in checkpoint.py =="
sed -n '1,100p' modelopt/torch/quantization/utils/checkpoint.py



🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Check all calls to prepare_for_resume method (not just definition) =="
rg -nP --type=py 'prepare_for_resume\s*\(' --no-heading

echo
echo "== Check all calls to detect_sequential_resume_layer =="
rg -nP --type=py 'detect_sequential_resume_layer\s*\(' --no-heading

echo
echo "== Check if LayerActivationCollector is used outside of model_calib.py =="
rg -nP --type=py 'LayerActivationCollector' modelopt/torch/quantization/ | grep -v 'test_' | head -20

echo
echo "== Search for checkpoint module usage in sequential_calibrate context =="
rg -nP --type=py -B10 -A10 'checkpoint' modelopt/torch/quantization/model_calib.py | head -40



prepare_for_resume() is not integrated into sequential calibration.

The new resume helpers (prepare_for_resume() and detect_sequential_resume_layer()) are defined but never called anywhere in the codebase. The main sequential_calibrate() function iterates through all layers from start to end without detecting, loading, or reconstructing resume state. No code path—tests included—exercises resume functionality. This means resuming from a checkpoint will not rebuild the skip/run/capture layer states needed to continue from layer K.

To complete resume support, integrate these helpers into sequential_calibrate():

  • Call detect_sequential_resume_layer() to check for a saved _seq_calib_progress checkpoint.
  • If resuming, call input_getter.prepare_for_resume(resume_layer_idx, forward_loop) before the main layer loop.
  • Adjust the loop to start from resume_layer_idx instead of 0.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/quantization/utils/activation_collector.py` around lines 369 -
410, sequential_calibrate currently never checks or uses resume helpers so
resume state (_seq_calib_progress) and layer modes aren't rebuilt; update
sequential_calibrate to call detect_sequential_resume_layer() at start, and if
it returns a resume_layer_idx call
input_getter.prepare_for_resume(resume_layer_idx, forward_loop) before entering
the main layer loop, then set the loop's start index to resume_layer_idx
(instead of 0) so subsequent iterations continue from the resumed layer; ensure
you reference detect_sequential_resume_layer, input_getter.prepare_for_resume,
forward_loop, resume_layer_idx and the existing _seq_calib_progress handling
when adding this logic.
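The integration shape the review asks for can be sketched as a small driver. All the callables here are stand-ins: `prepare_for_resume` mirrors `LayerActivationCollector.prepare_for_resume`, and `save_progress` represents writing the `_seq_calib_progress` marker; the names follow the review, not a verified API:

```python
def run_sequential_calibration(layers, resume_idx, prepare_for_resume,
                               calibrate_layer, save_progress):
    """Drive per-layer calibration, resuming from a checkpointed layer.

    resume_idx is 0 when no checkpoint exists (fresh run); otherwise the
    warm-up pass rebuilds the skip/capture layer states before resuming.
    """
    if resume_idx > 0:
        prepare_for_resume(resume_idx)
    for idx in range(resume_idx, len(layers)):
        calibrate_layer(idx, layers[idx])
        save_progress(idx + 1)  # next run resumes after this layer
```

The key points are that the warm-up pass runs exactly once, before the loop, and that the loop starts at `resume_idx` rather than 0.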

Comment on lines +713 to +726
@contextmanager
def disabled_weight_quantizers(model: nn.Module):
"""Disable weight quantizers during hessian collection."""
disabled_modules = []
for module in model.modules():
if is_quantized_linear(module) and module.weight_quantizer.is_enabled:
module.weight_quantizer.disable()
disabled_modules.append(module)
try:
yield
finally:
for module in disabled_modules:
module.weight_quantizer.enable()


⚠️ Potential issue | 🟠 Major

Disable every weight quantizer, not just module.weight_quantizer.

This helper ignores modules that use non-standard weight attributes like gate_up_proj / down_proj, even though weight_attr_names() in the same file explicitly supports them. In GPTQ that means Hessian collection can still run with some weight quantizers enabled, which skews the update for those supported module types.

Proposed fix
 @contextmanager
 def disabled_weight_quantizers(model: nn.Module):
     """Disable weight quantizers during hessian collection."""
-    disabled_modules = []
+    disabled_quantizers = []
     for module in model.modules():
-        if is_quantized_linear(module) and module.weight_quantizer.is_enabled:
-            module.weight_quantizer.disable()
-            disabled_modules.append(module)
+        for weight_name in weight_attr_names(module):
+            quantizer_name = quantizer_attr_names(weight_name).weight_quantizer
+            quantizer = getattr(module, quantizer_name, None)
+            if quantizer is not None and quantizer.is_enabled:
+                quantizer.disable()
+                disabled_quantizers.append(quantizer)
     try:
         yield
     finally:
-        for module in disabled_modules:
-            module.weight_quantizer.enable()
+        for quantizer in disabled_quantizers:
+            quantizer.enable()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/quantization/utils/core_utils.py` around lines 713 - 726, The
helper disabled_weight_quantizers only toggles module.weight_quantizer and
misses non-standard weight attributes; update disabled_weight_quantizers to
iterate weight names from weight_attr_names(module) for each module (in addition
to checking is_quantized_linear(module)), retrieve each attr (e.g.,
getattr(module, attr_name, None)), if it has an is_enabled flag call its
disable() and record (module, attr_name) so the finally block can re-enable by
calling enable() on the same attribute's quantizer; ensure you still check
existence and is_enabled before disabling and re-check before enabling to avoid
AttributeError.

Comment on lines +31 to +37
def _configs_available():
try:
import modelopt.torch.quantization as mtq

return getattr(mtq, RTN_CFG_NAME, None) is not None
except Exception:
return False

⚠️ Potential issue | 🟡 Minor

Skip the test only when both presets are unavailable.

_configs_available() only checks RTN_CFG_NAME, but Line 57 also requires GPTQ_CFG_NAME. If RTN is registered and GPTQ is not, this test hard-fails instead of being skipped.

Proposed fix
 def _configs_available():
     try:
         import modelopt.torch.quantization as mtq
 
-        return getattr(mtq, RTN_CFG_NAME, None) is not None
+        return (
+            getattr(mtq, RTN_CFG_NAME, None) is not None
+            and getattr(mtq, GPTQ_CFG_NAME, None) is not None
+        )
     except Exception:
         return False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/gpu/torch/quantization/test_gptq_vq.py` around lines 31 - 37, The
_configs_available() helper currently only checks for RTN_CFG_NAME and causes a
hard failure if GPTQ_CFG_NAME is missing; update _configs_available() to import
modelopt.torch.quantization and return True only when both getattr(mtq,
RTN_CFG_NAME, None) and getattr(mtq, GPTQ_CFG_NAME, None) are not None so the
test is skipped unless both presets (RTN_CFG_NAME and GPTQ_CFG_NAME) are
available; reference the existing function name _configs_available and the
symbols RTN_CFG_NAME and GPTQ_CFG_NAME when making the change.

Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>