HDDS-15290. DiskBalancer should write info file atomically.#10284
HDDS-15290. DiskBalancer should write info file atomically.#10284slfan1989 wants to merge 5 commits into
Conversation
adoroszlai
left a comment
There was a problem hiding this comment.
Thanks @slfan1989 for the patch.
| final Path tempFile; | ||
| try { | ||
| tempFile = Files.createTempFile(parent, path.getName(), ".tmp"); | ||
| } catch (IOException e) { | ||
| throw new IOException( | ||
| "Unable to create temporary DiskBalancerInfo file under: " | ||
| + parent, e); | ||
| } | ||
| boolean moved = false; | ||
| try { | ||
| writer.write(diskBalancerInfo, tempFile.toFile()); | ||
| try { | ||
| Files.move(tempFile, target, StandardCopyOption.ATOMIC_MOVE, | ||
| StandardCopyOption.REPLACE_EXISTING); | ||
| } catch (IOException e) { | ||
| throw new IOException( | ||
| "Unable to overwrite the DiskBalancerInfo file: " + target, e); | ||
| } |
There was a problem hiding this comment.
Temp file and atomic move to final location is not needed, because YamlUtils uses AtomicFileOutputStream internally, which performs the same.
There was a problem hiding this comment.
Updated. Removed the redundant temp-file and atomic move wrapper. DiskBalancerService now only creates the parent directory and relies on DiskBalancerYaml/YamlUtils, which already uses AtomicFileOutputStream for atomic writes.
| assertTrue(exception.getMessage() | ||
| .startsWith("Unable to overwrite the DiskBalancerInfo file: ")); |
There was a problem hiding this comment.
nit: please replace with assertThat(exception).hasMessageStartingWith(...) (see HDDS-9951)
There was a problem hiding this comment.
Updated. Replaced the message prefix assertions with assertThat(exception).hasMessageStartingWith(...).
| } | ||
|
|
||
| @VisibleForTesting | ||
| static void writeDiskBalancerInfoFile(DiskBalancerInfo diskBalancerInfo, |
There was a problem hiding this comment.
- write the new DiskBalancerInfo to a temporary file in the same directory;
- atomically move the temporary file to diskBalancer.info;
Curious where the two atomic write steps are implemented in this change?
There was a problem hiding this comment.
The two steps are implemented by AtomicFileOutputStream used in YamlUtils.dump():
DiskBalancerService.writeDiskBalancerInfoFile()
-> DiskBalancerYaml.createDiskBalancerInfoFile()
-> YamlUtils.dump()
-> new AtomicFileOutputStream(file)
So DiskBalancerService no longer creates the temporary file or moves it directly. It relies on the existing YamlUtils atomic write path and avoids the previous non-atomic delete/createNewFile before writing.
There was a problem hiding this comment.
so this pr is only for improving the exception handling for disk balancer info file atomic write?
There was a problem hiding this comment.
Not only exception handling. The main purpose is to remove the previous non-atomic delete/createNewFile step before writing diskBalancer.info.
The actual atomic write is already handled by DiskBalancerYaml -> YamlUtils.dump() -> AtomicFileOutputStream. This PR keeps that path, creates the parent directory when needed, and wraps directory/write failures with clearer error messages.
There was a problem hiding this comment.
I think the important part is removal of explicit delete and createNewFile from here:
There was a problem hiding this comment.
oh, I got it, then how about just remove that branch and add some exception handling around those called functions that might throw exception? looks like we dont need to add an additional static func for this?
There was a problem hiding this comment.
That makes sense.
Since YamlUtils already provides the atomic write behavior, a separate static helper is not strictly needed.
I will simplify the change by keeping the logic inside writeDiskBalancerInfoTo(), remove the extra static method, and only add parent directory creation plus clearer exception wrapping around the existing DiskBalancerYaml.createDiskBalancerInfoFile() call.
…ozone/container/diskbalancer/DiskBalancerService.java Co-authored-by: Doroszlai, Attila <6454655+adoroszlai@users.noreply.github.com>
|
@Gargi-jais11 would you like to take a look? |
@adoroszlai Yes will take a look. Give me sometime. |
What changes were proposed in this pull request?
This PR updates DiskBalancer to persist
diskBalancer.infoatomically.Previously, DiskBalancer overwrote
diskBalancer.infoby deleting the existing file and creating a new one. If the write failed between these steps, the existing configuration could be lost or left incomplete.This patch changes the persistence flow to:
DiskBalancerInfoto a temporary file in the same directory;diskBalancer.info;IOExceptionmessages for directory creation, temporary file creation, and overwrite failures.This makes DiskBalancer info persistence more robust and avoids losing the previous valid configuration on write failure.
What is the link to the Apache JIRA
JIRA: HDDS-15290. DiskBalancer should write info file atomically.
How was this patch tested?
Added unit tests.