Add GPUArraysCore extension: broadcast-based seed! for GPU dual arrays #816
Draft
ChrisRackauckas-Claude wants to merge 1 commit into
The four `seed!` methods in src/apiutils.jl write each dual with a scalar
`setindex!` loop over `structural_eachindex`, which errors with "Scalar
indexing is disallowed" on GPU arrays. ForwardDiff 0.10 seeded with broadcast
and worked on GPU arrays, so jacobians on GPU arrays regressed in the 1.0
rewrite.
Add a GPUArraysCore package extension that overrides `seed!` for
`AbstractGPUArray{<:Dual}` with broadcast-based seeding, restoring the pre-1.0
behavior. GPU arrays are dense, one-based, and isbits-valued, so the
structural-index / unset-element handling of the generic methods is not needed
on this path.
Tested via JLArrays (which emulates the GPU scalar-indexing ban on the CPU) so
the extension is covered in CI without a physical GPU.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Member
Because the broadcasts were purposefully removed for performance reasons, we should at least recover the GPU functionality by handling `AbstractGPUArray`.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #816 +/- ##
==========================================
+ Coverage 90.74% 90.88% +0.13%
==========================================
Files 11 12 +1
Lines 1070 1086 +16
==========================================
+ Hits 971 987 +16
Misses 99 99
Summary
ForwardDiff 1.x rewrote the four `seed!` methods in `src/apiutils.jl` to write each
dual with a scalar `setindex!` loop over `structural_eachindex`.
On GPU arrays this triggers `ERROR: Scalar indexing is disallowed`, so every
`ForwardDiff.jacobian` / `jacobian!` call on a GPU array (e.g. `CuArray`) errors
in `seed!`. ForwardDiff 0.10 seeded with broadcast and worked on GPU arrays, so
this is a regression for GPU usage introduced in the 1.0 rewrite.
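As a minimal sketch of the difference, the two seeding styles can be compared on a plain CPU `Vector`. The helper names `seed_scalar!` and `seed_broadcast!` are hypothetical and are not ForwardDiff functions; `eachindex` stands in for `structural_eachindex`, and the `Nothing` tag is a placeholder. Both helpers fill the array with the same duals, but only the broadcast form avoids the per-element `setindex!` that GPU arrays reject.

```julia
using ForwardDiff: ForwardDiff, Dual, Partials

# Hypothetical helper: per-element writes, the pattern GPU arrays reject.
function seed_scalar!(duals::AbstractArray{Dual{T,V,N}}, x,
                      seed::Partials{N,V}) where {T,V,N}
    for idx in eachindex(duals, x)  # stand-in for structural_eachindex
        duals[idx] = Dual{T,V,N}(x[idx], seed)
    end
    return duals
end

# Hypothetical helper: one broadcast assignment, no scalar indexing.
function seed_broadcast!(duals::AbstractArray{Dual{T,V,N}}, x,
                         seed::Partials{N,V}) where {T,V,N}
    # Ref keeps the seed (itself an AbstractVector) from being iterated.
    duals .= Dual{T,V,N}.(x, Ref(seed))
    return duals
end

x = [1.0, 2.0, 3.0]
seed = zero(Partials{2,Float64})
a = seed_scalar!(similar(x, Dual{Nothing,Float64,2}), x, seed)
b = seed_broadcast!(similar(x, Dual{Nothing,Float64,2}), x, seed)
```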
This was first hit downstream in SciML/ComplementaritySolve.jl#65, where the GPU
solver paths (NonlinearSolve's `AutoForwardDiff` jacobians) broke once the
environment resolved ForwardDiff ≥ 1.
Fix
Add a `GPUArraysCore` package extension (`ForwardDiffGPUArraysCoreExt`) that
overrides the four `seed!` methods for `AbstractGPUArray{<:Dual}` with
broadcast-based seeding, restoring the pre-1.0 behavior. GPU arrays are always
dense, one-based, and carry isbits element types, so the
`structural_eachindex` / unset-element handling of the generic methods is
unnecessary on this path and plain broadcast suffices.
This is a weak dependency, so it adds nothing for users who don't load a GPU
array package.
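A minimal sketch of what one such override might look like, written here as a standalone module rather than a real package extension. This is not the PR's diff: only the whole-array `seed!` method is shown, its signature is assumed to mirror the generic method in `src/apiutils.jl`, and the `Nothing` tag and the `JLArray` input are illustrative choices. The three remaining methods would need the same treatment.

```julia
module ForwardDiffGPUArraysCoreExt  # name taken from this PR; the body is an illustrative sketch

using ForwardDiff: ForwardDiff, Dual, Partials
using GPUArraysCore: AbstractGPUArray

# Assumed signature: same as the generic method, narrowed to GPU arrays of duals.
function ForwardDiff.seed!(duals::AbstractGPUArray{Dual{T,V,N}}, x,
                           seed::Partials{N,V} = zero(Partials{N,V})) where {T,V,N}
    duals .= Dual{T,V,N}.(x, Ref(seed))  # broadcast instead of a setindex! loop
    return duals
end

end # module

using ForwardDiff: ForwardDiff, Dual
using GPUArraysCore: allowscalar
using JLArrays: JLArray

allowscalar(false)  # make any scalar indexing an error
x = JLArray([1.0, 2.0, 3.0])
duals = similar(x, Dual{Nothing,Float64,2})
ForwardDiff.seed!(duals, x)
seeded = Array(duals)  # copy back to the CPU for inspection
```

Because the method dispatches on `AbstractGPUArray`, CPU arrays would keep using the generic scalar-loop methods.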
Scope
This covers the jacobian paths (vector and chunk mode), which is what `seed!`
feeds. `ForwardDiff.gradient` on GPU arrays has a second scalar-indexing site in
the gradient extraction path (`extract_gradient_chunk!`) that this PR does not
touch; fixing `seed!` is the necessary first step and resolves the jacobian
regression.
Tests
`test/GPUArraysCoreTest.jl` exercises all four `seed!` methods through
`jacobian`, `jacobian!`, and chunked configs using `JLArrays`, which emulates
GPU array semantics (including the scalar-indexing ban) on the CPU, so the
extension is covered in CI without a physical GPU.
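To illustrate the emulation, the following sketch (independent of ForwardDiff, with illustrative variable names) shows a `JLArray` rejecting a per-element write while accepting a broadcast once `allowscalar(false)` is set. It is not taken from the PR's test file.

```julia
using GPUArraysCore: allowscalar
using JLArrays: JLArray

allowscalar(false)  # disallow scalar indexing, as on a real GPU array

a = JLArray(zeros(3))

# A per-element write is rejected...
scalar_write_failed = try
    a[1] = 1.0
    false
catch
    true
end

# ...while a broadcast over the whole array is accepted.
a .= 1.0
result = Array(a)  # copy back to a plain Array for comparison
```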
Note
Opened as a draft by an agent on behalf of @ChrisRackauckas. Please ignore
until reviewed by @ChrisRackauckas.
🤖 Generated with Claude Code