Fix diffusion dask OOM: pass scalar diffusivity directly to chunks #1117

Merged
brendancol merged 4 commits into master from issue-1116 on Mar 31, 2026
Conversation

brendancol (Contributor) commented Mar 31, 2026

Summary

  • For scalar diffusivity, the dask chunk function now receives the float value directly, rather than a full-raster numpy array captured in every task closure.
  • For DataArray diffusivity, the dask path passes the dask array as a second argument to map_overlap, so each chunk receives only its own slice.
  • Removes the np.full(agg.shape, ...) allocation (line 261) and the diffusivity.values materialization (line 255), both of which caused OOM on large dask inputs.

Context

Found during performance sweep triage (#1116). On a 512x512 raster, dask used 35 MB versus numpy's 27 MB; the 8 MB gap came from the full-raster alpha_arr. At 30 TB, this allocation alone would OOM before any diffusion work starts.
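
To get a feel for why a dense diffusivity array is costly compared with a float, the pickled size of each can be compared as a rough proxy for what a task would have to carry. This is only a sketch: the shape and dtype are illustrative assumptions (not the benchmark configuration above), and `payload_bytes` is a hypothetical helper.

```python
import pickle

import numpy as np


def payload_bytes(obj):
    """Size of obj once pickled: a rough proxy for per-task payload."""
    return len(pickle.dumps(obj))


shape = (256, 256)  # illustrative shape, not taken from the PR
alpha = 0.1

# Dense array filled with the scalar: size grows with H * W.
array_payload = payload_bytes(np.full(shape, alpha, dtype=np.float64))
# The scalar itself: size is constant regardless of raster shape.
scalar_payload = payload_bytes(alpha)

print(f"dense alpha array: {array_payload} bytes")
print(f"scalar alpha:      {scalar_payload} bytes")
```

The dense payload scales with the raster, while the scalar payload stays at a few bytes, which is the property the scalar path exploits.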

Test plan

  • All 15 existing diffusion tests pass (verified)
  • test_dask_matches_numpy confirms dask scalar path matches numpy output

Parallel subagent triage + ralph-loop workflow for auditing all
xrspatial modules for performance bottlenecks, OOM risk under
30TB dask workloads, and backend-specific anti-patterns.

7 tasks covering command scaffold, module scoring, parallel subagent
dispatch, report merging, ralph-loop generation, and smoke tests.

For scalar diffusivity, the dask chunk function now receives the float
value directly instead of a full-raster numpy array captured in every
task closure. This eliminates the O(H*W) eager allocation and the
per-task serialization overhead.

For DataArray diffusivity, the dask path passes the dask array as a
second argument to map_overlap so each chunk gets only its own slice.
github-actions bot added the performance label (PR touches performance-sensitive code) on Mar 31, 2026
brendancol merged commit 74a6da9 into master on Mar 31, 2026
11 checks passed