HDDS-15300. Handle zero-capacity volumes in DiskBalancer policy.#10294
HDDS-15300. Handle zero-capacity volumes in DiskBalancer policy.#10294slfan1989 wants to merge 2 commits into
Conversation
Gargi-jais11
left a comment
There was a problem hiding this comment.
Thanks @slfan1989 for the fix.
Please do changes for calculateVolumeDataDensity — zero-capacity volumes
Filtering to getUsage().getCapacity() > 0 before calling getIdealUsage / getUtilization() is the right fix: VolumeFixedUsage leaves utilization == null when capacity is 0, so the previous loop could only “handle” that path by throwing and falling through to catch, which turned a common edge case into -1.0 and hid the real issue.
public static double calculateVolumeDataDensity(List<VolumeFixedUsage> volumeSet) {
if (volumeSet == null) {
LOG.warn("VolumeSet is null, returning 0.0 for VolumeDataDensity");
return 0.0;
}
try {
final List<VolumeFixedUsage> usableVolumes = volumeSet.stream()
.filter(v -> v.getUsage().getCapacity() > 0)
.collect(Collectors.toList());
if (usableVolumes.size() <= 1) {
return 0.0;
}
// Calculate ideal usage using the same immutable volume snapshot
final double idealUsage = getIdealUsage(usableVolumes);
double volumeDensitySum = 0.0;
// Calculate density for each volume using the same snapshot
for (VolumeFixedUsage volumeUsage : usableVolumes) {
final double currentUsage = volumeUsage.getUtilization();
// Calculate density as absolute difference from ideal usage
double volumeDensity = Math.abs(currentUsage - idealUsage);
volumeDensitySum += volumeDensity;
}
return volumeDensitySum;
} catch (Exception e) {
LOG.error("Error calculating VolumeDataDensity", e);
return -1.0;
}
}
With this change everywhere for diskbalancer if a volume is of 0 capacity that will be excluded.
@Gargi-jais11 Thanks for the suggestion! Updated the test to compare the density with and without the zero-capacity volume, which makes the expected behavior clearer and avoids the hard-coded density value. |
Gargi-jais11
left a comment
There was a problem hiding this comment.
+1, Last patch LGTM!
|
@ChenSammi Do you wish to take a look? |
What changes were proposed in this pull request?
This PR makes DiskBalancer skip zero-capacity volumes during volume candidate selection.
DefaultContainerChoosingPolicysorts volumes byVolumeFixedUsage#getUtilization. However,VolumeFixedUsagestores utilization asnullwhen the volume capacity is 0. If such a volume is included in the volume set, sorting can trigger aNullPointerExceptionbefore DiskBalancer chooses source/destination volumes.This change filters out volumes whose capacity is less than or equal to 0 before sorting by utilization. If fewer than two usable volumes remain after filtering, the policy returns no candidate instead of failing.
VolumeFixedUsagesets utilization to null.ozone/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/diskbalancer/DiskBalancerVolumeCalculation.java
Lines 168 to 172 in 7b82918
This means:
capacity > 0, utilization is calculated;capacity == 0, utilization is not calculated and remains null.ozone/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/diskbalancer/DiskBalancerVolumeCalculation.java
Lines 187 to 189 in 7b82918
So if utilization is null, calling this method throws:
ozone/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/diskbalancer/policy/DefaultContainerChoosingPolicy.java
Lines 84 to 88 in 7b82918
During sorting, Java calls
getUtilization()for eachVolumeFixedUsage. If any volume hascapacity == 0, itsutilizationis null, and sorting fails with an NPE before DiskBalancer can choose source/destination volumes.What is the link to the Apache JIRA
JIRA: HDDS-15300. Handle zero-capacity volumes in DiskBalancer policy.
How was this patch tested?
Add Junit Test
CI: https://github.com/slfan1989/ozone/actions/runs/25983890878