[ET Device Support] MethodMeta: expose per-buffer device placement API #18474
Gasoonjia wants to merge 3 commits into gh/gasoonjia/154/base
Add memory_planned_buffer_device(index) to MethodMeta, returning the
Device (type + index) for each planned memory buffer. This reads from
the non_const_buffer_device field in the serialized ExecutionPlan.
For CPU-only programs (or legacy PTE files without non_const_buffer_device),
all buffers default to Device{CPU, 0}. The non_const_buffer_device list is
sparse: it only stores entries for non-CPU buffers, so the lookup scans it
for a matching buffer_idx and falls back to CPU when no entry is found.
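
The lookup described above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in this PR: `DeviceType`, `Device`, `BufferDeviceEntry`, and `lookup_buffer_device` are hypothetical stand-ins, and the real implementation reads the entries from the serialized ExecutionPlan rather than from a `std::vector`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; not the ExecuTorch types.
enum class DeviceType : std::int8_t { CPU = 0, CUDA = 1 };

struct Device {
  DeviceType type;
  std::int8_t index;
};

inline bool operator==(const Device& a, const Device& b) {
  return a.type == b.type && a.index == b.index;
}

// One entry of the sparse list. Only non-CPU buffers are recorded.
struct BufferDeviceEntry {
  std::uint32_t buffer_idx;
  Device device;
};

// Linear scan for a matching buffer_idx. An empty list (CPU-only or legacy
// program) or a missing entry both resolve to Device{CPU, 0}.
inline Device lookup_buffer_device(
    const std::vector<BufferDeviceEntry>& sparse_entries,
    std::uint32_t buffer_idx) {
  for (const auto& entry : sparse_entries) {
    if (entry.buffer_idx == buffer_idx) {
      return entry.device;
    }
  }
  return Device{DeviceType::CPU, 0};
}

// Small self-check of the CPU default for an empty (legacy-style) list.
inline void check_cpu_default() {
  assert(lookup_buffer_device({}, 0) == (Device{DeviceType::CPU, 0}));
}
```

A linear scan keeps the serialized list small for the common case where few or no buffers live off-CPU, at the cost of O(n) lookups per query.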
This API enables Module::load_method() to query each buffer's target device
and allocate accordingly (malloc for CPU, DeviceAllocator for CUDA, etc.).
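
As a rough illustration of how a loader might use the per-buffer device to choose an allocator, here is a standalone sketch. Everything in it is an assumption made for the example: the `DeviceAllocator` interface shown here, `PlannedBuffer`, `Allocation`, `allocate_planned_buffers`, and the host-backed fake allocator are hypothetical and do not reflect the actual Module::load_method() code or the real DeviceAllocator signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical stand-ins for illustration only; not the ExecuTorch types.
enum class DeviceType : std::int8_t { CPU = 0, CUDA = 1 };

struct Device {
  DeviceType type;
  std::int8_t index;
};

// Assumed allocator interface for non-CPU devices.
struct DeviceAllocator {
  virtual ~DeviceAllocator() = default;
  virtual void* allocate(std::size_t nbytes, std::int8_t device_index) = 0;
};

// Size and target device of one planned buffer, as a loader would see them
// after querying the method metadata.
struct PlannedBuffer {
  std::size_t nbytes;
  Device device;
};

struct Allocation {
  void* ptr;
  bool on_host;  // true when the memory came from malloc
};

// CPU buffers use malloc; everything else goes through the device allocator.
inline std::vector<Allocation> allocate_planned_buffers(
    const std::vector<PlannedBuffer>& buffers,
    DeviceAllocator* device_allocator) {
  std::vector<Allocation> out;
  out.reserve(buffers.size());
  for (const auto& buffer : buffers) {
    if (buffer.device.type == DeviceType::CPU) {
      out.push_back({std::malloc(buffer.nbytes), true});
    } else {
      assert(device_allocator != nullptr);
      out.push_back(
          {device_allocator->allocate(buffer.nbytes, buffer.device.index),
           false});
    }
  }
  return out;
}

// Fake allocator for trying the sketch without a GPU: it hands out a fixed
// host array and counts how often it was called.
struct HostBackedFakeAllocator : DeviceAllocator {
  int calls = 0;
  char storage[64] = {};
  void* allocate(std::size_t, std::int8_t) override {
    ++calls;
    return storage;
  }
};
```

The caller remains responsible for releasing each allocation with the matching mechanism (`std::free` for host memory, the device allocator's own release path otherwise).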
Differential Revision: [D97850708](https://our.internmc.facebook.com/intern/diff/D97850708/)
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18474
Note: Links to docs will display an error until the docs builds have been completed.
❌ 4 New Failures, 2 Unrelated Failures
As of commit 793898f with merge base b5ae0b9:
NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.