[Common] Disabled the tuned NVFP4 kernels #2615
Signed-off-by: Oleg Goncharov <[email protected]>
/te-ci
Greptile Summary
This PR temporarily disables the tuned NVFP4 quantization kernel introduced in PR #2412, because that kernel produces numeric mismatches. The changes are surgical and effective:
The disable mechanism is clean and reversible - the tuned kernel code remains in the codebase (in
Confidence Score: 5/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Test as Test Suite
participant QT as quantize_transpose()
participant Tuned as quantize_transpose_tuned_1D()<br/>(DISABLED)
participant Standard as Standard NVFP4 Kernel
Note over Test: use_fast_math = false (hardcoded)
Test->>QT: Call with use_fast_math=false
Note over QT: Check conditions:<br/>!use_2d_quantization &&<br/>input.dtype() == kBFloat16
Note over QT,Tuned: Conditional block COMMENTED OUT
rect rgb(220, 220, 220)
Note over Tuned: Would call tuned kernel<br/>(NOW DISABLED)
end
QT->>Standard: Falls through to standard kernel
Standard-->>QT: Returns quantized output
QT-->>Test: Returns result
Note over Test,Standard: Tuned kernel disabled to avoid<br/>numeric mismatches until fixed
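The dispatch shown in the diagram can be sketched roughly as follows. This is only an illustration under stated assumptions: `DType`, `QuantConfig`, `kEnableTunedKernel`, `quantize_transpose_standard()` and all signatures are hypothetical stand-ins rather than Transformer Engine's actual code. The PR comments out the conditional block, whereas this sketch uses a compile-time flag to get the same fall-through behavior.

```cpp
#include <string>

// Hypothetical stand-ins for illustration; not the library's real types.
enum class DType { kBFloat16, kFloat32 };

struct QuantConfig {
  bool use_2d_quantization = false;
  bool use_fast_math = false;  // the cpp tests in this PR run with fast math off
};

// Assumed switch standing in for the commented-out block in the PR.
constexpr bool kEnableTunedKernel = false;

// Placeholder kernels that only report which path was taken.
std::string quantize_transpose_tuned_1D() { return "tuned"; }
std::string quantize_transpose_standard() { return "standard"; }

std::string quantize_transpose(DType input_dtype, const QuantConfig& cfg) {
  // The tuned path applies only to 1D quantization of BF16 input, and only
  // while the switch is on. With the switch off, every call falls through.
  if (kEnableTunedKernel && !cfg.use_2d_quantization &&
      input_dtype == DType::kBFloat16) {
    return quantize_transpose_tuned_1D();
  }
  return quantize_transpose_standard();
}
```

With the switch off, even a BF16 input with 1D quantization reaches the standard kernel. Re-enabling the tuned path would then be a one-line change once the mismatch is fixed.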
* Disabled the tuned NVFP4 kernels
Signed-off-by: Oleg Goncharov <[email protected]>
* Disabled fast math in cpp tests
Signed-off-by: Oleg Goncharov <[email protected]>
---------
Signed-off-by: Oleg Goncharov <[email protected]>
Description
This PR disables the previously introduced tuned NVFP4 kernels (PR #2412, "[Common] Tuned NVFP4 cast kernel") because they produce small numeric mismatches. They will be re-enabled once the issue is fixed.
Type of change
Changes
Checklist: