[refine](column) enforce nullable nested types for array #63088
[refine](column) enforce nullable nested types for array #63088Mryange wants to merge 3 commits into
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
There was a problem hiding this comment.
Automated code review summary:
No blocking issues found in this PR.
Critical checkpoint conclusions:
- Goal and tests: The PR makes array element types explicitly nullable in BE and aligns FE array type modeling with that invariant. The changed BE serde unit test covers one Arrow nested-array path, and FE tests were adjusted for array nullability matching. I did not run the test suite in this review runner.
- Scope: The change is focused on array nested nullability and related FE/BE type conversion call sites.
- Concurrency and lifecycle: No new concurrent state, locks, threads, or non-trivial lifecycle/static initialization concerns were introduced.
- Configuration and compatibility: No new configuration items. Existing serialized FE metadata keeps the
containsNullfield for Gson compatibility while treating it as true at runtime. BE thrift/protobuf type descriptors continue to emit nullable array elements. - Parallel paths: I checked FE parser/type conversion/function signatures, BE DataTypeArray/DataTypeArraySerDe, and representative BE factory/consumer paths; the nullable nested type invariant is applied consistently in the modified paths.
- Error handling and invariants: The new
DORIS_CHECKassertions are used for the intended array nested-nullable invariant rather than silently accepting inconsistent columns/serdes. - Data correctness and persistence: No transaction/version/delete-bitmap/write path changes were introduced.
- Memory/performance: The change adds no significant allocations beyond normal nullable type wrapping and does not alter hot data loops except invariant checks on deserialize entry points.
- Observability: No additional observability appears necessary for this type-system refinement.
- User focus: No additional user-provided review focus was present.
|
run buildall |
TPC-H: Total hot run time: 29557 ms |
TPC-DS: Total hot run time: 170984 ms |
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29429 ms |
TPC-DS: Total hot run time: 171726 ms |
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29622 ms |
TPC-DS: Total hot run time: 172004 ms |
|
run buildall |
TPC-H: Total hot run time: 29687 ms |
TPC-DS: Total hot run time: 171519 ms |
|
run beut |
FE UT Coverage ReportIncrement line coverage |
1c1bdc1 to
88ca38a
Compare
|
run buildall |
TPC-H: Total hot run time: 29725 ms |
TPC-DS: Total hot run time: 170615 ms |
FE UT Coverage ReportIncrement line coverage |
|
run beut |
|
run buildall |
TPC-H: Total hot run time: 29817 ms |
TPC-DS: Total hot run time: 170923 ms |
|
run buildall |
TPC-H: Total hot run time: 29652 ms |
TPC-DS: Total hot run time: 172120 ms |
FE UT Coverage ReportIncrement line coverage |
|
run p0 |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
What problem does this PR solve?
This PR makes the nested types inside Array explicitly nullable in BE type implementations, instead of relying on implicit caller-side conventions.
DataTypeArray now always stores nullable nested element type
DataTypeArraySerDe updated to follow the same invariant
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)