feat(cuda): accept a pre-linked CUBIN in load_module_bytes #7

Open
haixuanTao wants to merge 1 commit into dimforge:main from haixuanTao:upstream/cuda-load-cubin

@haixuanTao

Summary

load_module_bytes assumed the input was PTX text. This makes it also accept a pre-linked CUBIN, detected by the ELF magic (\x7fELF) and loaded via cudarc::nvrtc::Ptx::from_binary. PTX text continues to load exactly as before.

Why

A cubin is required when a module references symbols the CUDA driver JIT cannot resolve on its own — most commonly libdevice __nv_* math (expf, fmaf, …). In that case a toolchain links libdevice into a self-contained cubin ahead of time, and the backend must load that binary rather than re-JIT the PTX. cudarc's load_module already dispatches on the kind, so the change is just selecting from_binary vs from_src.

Small, backwards-compatible (text PTX path is untouched).
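
The detection step described above can be sketched in isolation. This is a minimal illustration, not the code in this PR: `ModuleImage` and `classify_module_bytes` are hypothetical names, the sketch does not call cudarc, and rejecting non-UTF-8 input on the text path is an assumption about how PTX bytes would be handled.

```rust
/// ELF magic that opens a CUBIN image.
const ELF_MAGIC: &[u8; 4] = b"\x7fELF";

/// Hypothetical classification of a module image (not a cudarc type).
#[derive(Debug, PartialEq)]
enum ModuleImage {
    /// Pre-linked binary, to be handed to the driver as-is.
    Cubin(Vec<u8>),
    /// PTX text, to be JIT-compiled by the driver.
    PtxText(String),
}

/// Classify raw module bytes by their leading magic.
/// Anything that does not start with the ELF magic is treated as PTX text
/// and must be valid UTF-8 (an assumption made for this sketch).
fn classify_module_bytes(bytes: &[u8]) -> Result<ModuleImage, std::str::Utf8Error> {
    if bytes.starts_with(ELF_MAGIC) {
        // Binary path: keep the bytes untouched.
        Ok(ModuleImage::Cubin(bytes.to_vec()))
    } else {
        // Text path: same behavior as before, modulo the UTF-8 check.
        let text = std::str::from_utf8(bytes)?;
        Ok(ModuleImage::PtxText(text.to_owned()))
    }
}
```

In a real backend, the `Cubin` arm would presumably feed `Ptx::from_binary` and the `PtxText` arm `Ptx::from_src`, as the summary describes. Checking only the four-byte magic keeps the text path free of any parsing cost.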

🤖 Generated with Claude Code

`load_module_bytes` assumed PTX text. Also accept a CUBIN, detected by the
ELF magic (`\x7fELF`), loaded via `Ptx::from_binary`. This is required for
modules that reference symbols the driver JIT cannot resolve on its own —
e.g. libdevice `__nv_*` math — which a toolchain links into a self-contained
binary ahead of time. PTX text continues to load unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>