linstor: add LinstorDataMotionStrategy for live migration + storpool: fix qemu-img copy to NFS #1
Open
DennisKonrad wants to merge 2 commits into LINBIT:linstor-backport-4.17.2.0
Add a new DataMotionStrategy implementation that enables VM live migration
when the destination storage pool is Linstor (DRBD). Without this strategy,
CloudStack's storage migration framework had no code path to handle live
migrations *to* Linstor pools, leaving three scenarios broken:
- Linstor -> Linstor: blocked (no strategy claimed it)
- SMP -> Linstor: broken (KvmNonManagedDMS claimed it but generated the
                  wrong DiskType=FILE/DriverType=QCOW2 and an invalid
                  device path using the resource group name instead of
                  a DRBD block device path)
- StorPool -> Linstor: blocked (no strategy claimed it)
How strategy selection works:
CloudStack iterates all DataMotionStrategy beans and picks the one
returning the highest StrategyPriority from canHandle(). The existing
strategies return:
- StorPoolDMS: HIGHEST only when ALL dest pools are StorPool
- KvmNonManagedDMS: HYPERVISOR only for {NFS, SMP, Filesystem}
- StorageSystemDMS: only for managed (isManaged=true) pools
- AncientDMS: DEFAULT (fallback, copies via secondary storage)
LinstorDataMotionStrategy returns HIGHEST when ALL destination pools are
Linstor, giving it priority over KvmNonManagedDMS (HYPERVISOR=2) and
AncientDMS (DEFAULT=1), while not conflicting with StorPoolDMS.
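The selection mechanism can be sketched as follows. This is a minimal, hypothetical model, not CloudStack's actual code: the StrategyPriority values and class names follow the real ones, but the interface is reduced to a single method and the pool types are plain strings.

```java
import java.util.Comparator;
import java.util.List;

// Simplified stand-ins for CloudStack's StrategyPriority enum and
// DataMotionStrategy interface; enum order encodes priority
// (CANT_HANDLE=0 ... HIGHEST=4), matching the values cited above.
enum StrategyPriority { CANT_HANDLE, DEFAULT, HYPERVISOR, PLUGIN, HIGHEST }

interface DataMotionStrategy {
    StrategyPriority canHandle(List<String> destPoolTypes);
}

public class StrategySelection {
    // Pick the strategy whose canHandle() returns the highest priority,
    // mirroring how the factory iterates all strategy beans.
    static DataMotionStrategy pick(List<DataMotionStrategy> strategies,
                                   List<String> destPoolTypes) {
        return strategies.stream()
                .filter(s -> s.canHandle(destPoolTypes) != StrategyPriority.CANT_HANDLE)
                .max(Comparator.comparing((DataMotionStrategy s) -> s.canHandle(destPoolTypes)))
                .orElse(null);
    }

    public static void main(String[] args) {
        DataMotionStrategy ancient = pools -> StrategyPriority.DEFAULT;
        DataMotionStrategy kvmNonManaged = pools ->
                pools.stream().allMatch(p -> p.equals("NFS") || p.equals("SMP")
                        || p.equals("Filesystem"))
                        ? StrategyPriority.HYPERVISOR : StrategyPriority.CANT_HANDLE;
        DataMotionStrategy linstor = pools ->
                pools.stream().allMatch("Linstor"::equals)
                        ? StrategyPriority.HIGHEST : StrategyPriority.CANT_HANDLE;

        List<DataMotionStrategy> all = List.of(ancient, kvmNonManaged, linstor);
        // All-Linstor destinations select the Linstor strategy over the others.
        System.out.println(pick(all, List.of("Linstor")) == linstor);
        // NFS destinations still go to the KVM non-managed strategy.
        System.out.println(pick(all, List.of("NFS")) == kvmNonManaged);
    }
}
```

Because HIGHEST outranks HYPERVISOR and DEFAULT in the enum order, simply returning it for all-Linstor destinations is enough to win the selection without touching the other strategies.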
canHandle semantics:
Offline (DataObject, DataObject):
Always returns CANT_HANDLE. Offline volume copies continue to use
existing paths (AncientDMS or driver canCopy). A native Linstor
offline copy (e.g. DRBD clone) can be added in a future commit.
Live (Map<VolumeInfo,DataStore>, Host, Host):
Returns HIGHEST when ALL destination DataStores are Linstor pools.
The source pools can be anything (Linstor, StorPool, SMP, NFS, ...),
enabling cross-storage live migration *to* Linstor.
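The live canHandle predicate reduces to "every destination pool is Linstor". A hypothetical, simplified version, with a plain map of volume name to destination provider standing in for the real VolumeInfo/DataStore map:

```java
import java.util.Map;

public class LinstorCanHandle {
    enum StrategyPriority { CANT_HANDLE, HIGHEST }

    // Return HIGHEST only when every destination pool is Linstor.
    // Source pools are never inspected, which is exactly what permits
    // cross-storage live migration *to* Linstor from any provider.
    static StrategyPriority canHandleLive(Map<String, String> volumeToDestProvider) {
        boolean allLinstor = !volumeToDestProvider.isEmpty()
                && volumeToDestProvider.values().stream().allMatch("Linstor"::equals);
        return allLinstor ? StrategyPriority.HIGHEST : StrategyPriority.CANT_HANDLE;
    }

    public static void main(String[] args) {
        // Mixed destinations must not be claimed; a single non-Linstor
        // pool pushes the whole migration back to other strategies.
        System.out.println(canHandleLive(Map.of("ROOT-42", "Linstor", "DATA-7", "Linstor")));
        System.out.println(canHandleLive(Map.of("ROOT-42", "Linstor", "DATA-7", "NFS")));
    }
}
```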
Live migration flow (copyAsync with volumeMap):
For each volume in the migration set:
1. Create a destination VolumeVO record in the database
(duplicateVolumeOnAnotherStorage).
2. If cross-storage (src is not Linstor, or different Linstor controller):
create a new DRBD resource via the Linstor REST API
(resourceGroupSpawn on the destination pool's resource group).
3. Ensure the resource is available on the destination KVM host
(resourceMakeAvailableOnNode). For same-controller Linstor->Linstor,
DRBD already has the data replicated so this is a lightweight diskless
attach or no-op.
4. Set DRBD allow-two-primaries so both source and destination hosts can
have the device open read-write simultaneously during migration.
Uses ResourceDefinition-level properties when both nodes are diskless
(DRBD client topology), or ResourceConnection-level properties when
nodes are hyperconverged (have local disks).
5. Build MigrateDiskInfo with DiskType=BLOCK, DriverType=RAW, Source=DEV,
and destPath=/dev/drbd/by-res/<rscName>/0. This tells libvirt's
replaceStorage() to modify the VM's disk XML for block-copy migration.
6. Send PrepareForMigrationCommand to destination host, then
MigrateCommand (with migrateStorageManaged=true) to source host.
Libvirt performs the actual block copy using VIR_MIGRATE_NON_SHARED_DISK.
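Step 5 can be sketched as below. The MigrateDiskInfo record here is a toy stand-in (the real class lives in the agent API and carries more fields), and the resource name "cs-a1b2" is an invented example; the point is the shape of the DRBD device path handed to libvirt's block-copy migration.

```java
public class MigrateDiskInfoSketch {
    // Simplified value class; field names mirror the fields set in step 5.
    record MigrateDiskInfo(String serial, String diskType, String driverType,
                           String source, String path) {}

    static MigrateDiskInfo forLinstorVolume(String volumeUuid, String rscName) {
        // DRBD exposes one volume (index 0) per CloudStack volume,
        // hence the trailing /0 in the by-res device path.
        String destPath = "/dev/drbd/by-res/" + rscName + "/0";
        return new MigrateDiskInfo(volumeUuid, "BLOCK", "RAW", "DEV", destPath);
    }

    public static void main(String[] args) {
        MigrateDiskInfo info = forLinstorVolume("a1b2", "cs-a1b2");
        System.out.println(info.path()); // /dev/drbd/by-res/cs-a1b2/0
    }
}
```

Using BLOCK/RAW/DEV (instead of the FILE/QCOW2 the KVM non-managed strategy generated) is what makes libvirt treat the destination as a raw block device during the copy.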
Post-migration success:
- Remove allow-two-primaries from all resources
- Swap volume UUIDs between source and destination (updateUuid)
- Destroy and expunge source volumes
- Update snapshot references to point to new volume IDs
Post-migration failure:
- Remove allow-two-primaries
- Delete destination Linstor resources unconditionally (not just diskless)
- Delete resource definitions if no resources remain
- Expunge destination volumes (DB records)
- Rollback PrepareForMigration on destination host
Error handling:
- On early failure (before MigrateCommand), handlePostMigration(false)
is called from the catch block to ensure Linstor resources are cleaned
up and not left orphaned.
- viewResources API errors are logged as warnings and creation is
attempted regardless (rather than silently assuming resource absence).
- applyAuxProps errors are logged but non-fatal.
Spring context:
Register the LinstorDataMotionStrategy bean in
spring-storage-volume-linstor-context.xml so StorageStrategyFactoryImpl
discovers it via auto-wiring.
Files:
- NEW: plugins/storage/volume/linstor/src/main/java/org/apache/
cloudstack/storage/motion/LinstorDataMotionStrategy.java
- MOD: plugins/storage/volume/linstor/src/main/resources/META-INF/
cloudstack/storage-volume-linstor/
spring-storage-volume-linstor-context.xml
Fix qemu-img convert failures when copying StorPool volumes to secondary
storage (NFS) during offline volume migration.
Symptom:
Offline migration of a StorPool volume to another primary storage (e.g.
Linstor) fails with:
qemu-img: error while writing sector 4202495: Invalid argument
This happens in StorPoolCopyVolumeToSecondaryCommand which creates a
temporary StorPool snapshot, attaches it as a block device, and copies
it via qemu-img convert to a file on NFS secondary storage.
Root cause:
The basic QemuImg(timeout) constructor creates qemu-img convert commands
without any I/O mode flags. On certain NFS configurations, buffered I/O
can cause EINVAL errors at specific sector boundaries when writing large
volumes from a raw block device source.
Fix:
Use the 3-parameter constructor QemuImg(timeout, skipZero=false, noCache=true):
- skipZero=false: Do NOT enable --target-is-zero. This flag is only safe
when the target device is guaranteed pre-zeroed (e.g. thin-provisioned
block devices like LVM_THIN or ZFS_THIN). NFS files are NOT pre-zeroed,
so enabling this flag would cause silent data corruption by skipping
zero-filled sectors that the target still contains stale data for.
The Linstor storage adaptor handles this correctly by checking
LinstorUtil.resourceSupportZeroBlocks() before enabling skipZero.
- noCache=true: Enable direct I/O (-t none) which bypasses the kernel
page cache. This ensures writes are flushed directly to the NFS server,
avoiding cache-related EINVAL errors at sector boundaries and improving
reliability for large volume copies.
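How the two flags map onto qemu-img convert arguments can be illustrated with a toy builder. This is not CloudStack's QemuImg class (which assembles the command differently); it only models the flag semantics described above.

```java
import java.util.ArrayList;
import java.util.List;

public class QemuConvertFlags {
    // Illustrative only: translate the skipZero/noCache booleans into
    // the qemu-img convert arguments they correspond to.
    static List<String> convertArgs(boolean skipZero, boolean noCache,
                                    String src, String dst) {
        List<String> args = new ArrayList<>(List.of("qemu-img", "convert"));
        if (noCache) {
            args.add("-t");   // target cache mode:
            args.add("none"); // direct I/O, bypassing the kernel page cache
        }
        if (skipZero) {
            // Only safe when the target is guaranteed pre-zeroed;
            // never true for a fresh file on NFS secondary storage.
            args.add("--target-is-zero");
        }
        args.add(src);
        args.add(dst);
        return args;
    }

    public static void main(String[] args) {
        // The fixed call path: skipZero=false, noCache=true
        System.out.println(convertArgs(false, true,
                "/dev/storpool-snap", "/mnt/secondary/vol.raw"));
    }
}
```

With skipZero=false and noCache=true the resulting command carries "-t none" and omits "--target-is-zero", which is exactly the combination the fix selects for the NFS target.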
Impact:
Only affects the StorPoolCopyVolumeToSecondaryCommandWrapper code path,
which is used during offline volume migration when StorPool is the source
and the copy goes through secondary (NFS) storage. StorPool-to-StorPool
copies use native StorPool cloning and are not affected.
Files:
- MOD: plugins/storage/volume/storpool/src/main/java/com/cloud/hypervisor/
kvm/resource/wrapper/StorPoolCopyVolumeToSecondaryCommandWrapper.java
Summary
Two fixes for VM migration with Linstor and StorPool storage:
Commit 1: LinstorDataMotionStrategy
A new DataMotionStrategy enabling VM live migration to Linstor pools.

Problem
CloudStack had no DataMotionStrategy claiming live migrations when the
destination is a Linstor pool. Three scenarios were broken:
- Linstor -> Linstor: blocked (no strategy claimed it)
- SMP -> Linstor: KvmNonManagedDMS claimed it but generated wrong
  DiskType=FILE, DriverType=QCOW2, and used the resource group name as
  device path instead of /dev/drbd/by-res/...
- StorPool -> Linstor: blocked (no strategy claimed it)

Solution
New LinstorDataMotionStrategy bean that returns HIGHEST priority when all
destination pools are Linstor. Works for any source storage type (Linstor,
StorPool, NFS, SMP, ...).

Live migration flow per volume:
1. Create the destination VolumeVO in the DB
2. resourceGroupSpawn (cross-storage case)
3. resourceMakeAvailableOnNode
4. Set allow-two-primaries (ResourceDefinition or ResourceConnection
   level, depending on topology)
5. Build MigrateDiskInfo with DiskType=BLOCK, DriverType=RAW,
   destPath=/dev/drbd/by-res/<rscName>/0
6. PrepareForMigrationCommand + MigrateCommand (with
   migrateStorageManaged=true)

Post-migration: UUID swap, source cleanup, allow-two-primaries removal.
On failure: full rollback of Linstor resources and DB records.

Offline migration: returns CANT_HANDLE; existing paths (AncientDMS)
continue to work. Native Linstor offline copy can be added later.

Files
- NEW: plugins/storage/volume/linstor/.../motion/LinstorDataMotionStrategy.java
- MOD: plugins/storage/volume/linstor/.../spring-storage-volume-linstor-context.xml (bean registration)

Commit 2: StorPool qemu-img direct I/O

Problem
Offline migration of StorPool volumes to other storage (e.g. Linstor) fails:

    qemu-img: error while writing sector 4202495: Invalid argument

This happens in StorPoolCopyVolumeToSecondaryCommandWrapper when copying a
StorPool snapshot to NFS secondary storage via qemu-img convert.

Solution
Changed new QemuImg(timeout) to new QemuImg(timeout, false, true):
- skipZero=false: NFS target files are NOT pre-zeroed. Enabling
  --target-is-zero would cause silent data corruption by skipping
  zero-filled sectors.
- noCache=true: enables direct I/O (-t none), bypassing the kernel page
  cache. Fixes EINVAL errors on certain NFS configurations.

Files
- MOD: plugins/storage/volume/storpool/.../wrapper/StorPoolCopyVolumeToSecondaryCommandWrapper.java