[fix](load job) rebuild broker load storage properties after Gson replay #63094
Open
sollhui wants to merge 1 commit into apache:master
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
run buildall
TPC-H: Total hot run time: 29598 ms
TPC-DS: Total hot run time: 171632 ms
FE UT Coverage Report
Increment line coverage
/review
No blocking issues found in this review.
Critical checkpoint conclusions:
- Goal and proof: The PR targets rebuilding `StorageDesc.storageProperties` after Gson replay for broker load descriptors. The added `GsonPostProcessable` hook plus lazy getter initialization addresses the replay path, and the new unit test covers direct `BrokerDesc` replay and `BrokerLoadJob` replay with S3-backed properties.
- Scope: The change is small and focused on reconstructing derived storage state from already persisted fields.
- Concurrency/lifecycle: No new concurrency or locking behavior is introduced. Lifecycle impact is limited to Gson deserialization and lazy access after replay.
- Persistence/edit log: No new persisted field is added. The reconstruction derives state from the existing persisted `name`, `storageType`, and `properties`, so image/edit-log compatibility is preserved.
- Compatibility: Existing serialized jobs should still deserialize because the new logic tolerates null properties and preserves broker-vs-refactored storage behavior for the reviewed paths.
- Error handling: Reconstruction uses existing `StorageProperties`/`BrokerProperties` validation behavior; no swallowed failure path was added.
- Performance: Only one-time lazy reconstruction is added, guarded by `storageProperties != null`; no hot-path regression is apparent.
- Observability: No additional logging appears necessary for this narrow replay fix.
- Test coverage: The new test is relevant to the bug. I attempted `./run-fe-ut.sh --run org.apache.doris.analysis.StorageDescPersistTest`, but the runner failed during generated-code setup because `thirdparty/installed/bin/protoc` could not be executed by `make`, before the test ran.
User review focus: No additional user-provided review focus was supplied.
What problem does this PR solve?
Problem Summary:
Broker load jobs may fail after FE restart, image load, or edit log replay in cloud mode.
`StorageDesc.storageProperties` is not persisted by Doris's Gson configuration, so after deserialization the `BrokerDesc` inside `BrokerLoadJob` can keep `storageType` and raw `properties` but lose the derived `storageProperties` object. When the pending task later tries to create a filesystem and list source files, FE cannot reconstruct the expected storage backend state and the load may fail with errors such as `Unknown storage type`.

This PR fixes the issue by rebuilding `storageProperties` from `storageType`, `name`, and `properties` after Gson replay, and also lazily reinitializing it on access as a safety net.
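The rebuild-on-replay pattern described above can be illustrated with a minimal, dependency-free sketch. It is not the code in this PR: `StorageDescSketch`, `simulateReplay`, `postProcess`, `hasDerivedState`, and the `resolved.*` keys are hypothetical names, and replay is simulated by copying only the persisted fields rather than by running Gson. The point is only to show a post-deserialization hook combined with a lazy getter as a fallback.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative stand-in for a storage descriptor; NOT the Doris StorageDesc class. */
final class StorageDescSketch {
    // Persisted fields: assumed to survive serialization.
    private final String name;
    private final String storageType;
    private final Map<String, String> properties;

    // Derived state: assumed to be skipped by the serializer, so it is null after replay.
    private Map<String, String> storageProperties;

    StorageDescSketch(String name, String storageType, Map<String, String> properties) {
        this(name, storageType, properties, true);
    }

    private StorageDescSketch(String name, String storageType,
                              Map<String, String> properties, boolean buildDerived) {
        this.name = name;
        this.storageType = storageType;
        this.properties = properties;
        if (buildDerived) {
            this.storageProperties = rebuild();
        }
    }

    /** Simulates replay: only persisted fields are copied, derived state stays null. */
    static StorageDescSketch simulateReplay(StorageDescSketch source) {
        Map<String, String> copied =
                source.properties == null ? null : new HashMap<>(source.properties);
        return new StorageDescSketch(source.name, source.storageType, copied, false);
    }

    /** Hypothetical post-deserialization hook: rebuild derived state eagerly. */
    void postProcess() {
        storageProperties = rebuild();
    }

    /** Lazy getter: safety net for paths where the hook was never invoked. */
    Map<String, String> getStorageProperties() {
        if (storageProperties == null) {
            storageProperties = rebuild();
        }
        return storageProperties;
    }

    boolean hasDerivedState() {
        return storageProperties != null;
    }

    /** Placeholder derivation from the persisted fields; tolerates null properties. */
    private Map<String, String> rebuild() {
        Map<String, String> derived = new HashMap<>();
        if (properties != null) {
            derived.putAll(properties);
        }
        derived.put("resolved.name", name);
        derived.put("resolved.type", storageType);
        return derived;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("endpoint", "example");
        StorageDescSketch replayed =
                simulateReplay(new StorageDescSketch("my_s3", "S3", props));
        System.out.println("derived state right after replay: " + replayed.hasDerivedState());
        replayed.postProcess();
        System.out.println("derived state after hook: " + replayed.hasDerivedState());
    }
}
```

Keeping both the hook and the lazy getter means the derived object is available whether or not the deserialization path calls the hook, while the persisted format itself stays unchanged.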
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)