[fix](fe) Prune empty and cascading CTE plans#62828
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
|
OpenCode automated review failed and did not complete. Error: Review step was failure (possibly timeout or cancelled) Please inspect the workflow logs and rerun the review after the underlying issue is resolved. |
FE Regression Coverage ReportIncrement line coverage |
|
/review |
There was a problem hiding this comment.
I found 2 blocking issues.
Critical checkpoint conclusions:
- Goal of this PR: partially accomplished. The change does address empty/cascading CTE pruning in the optimizer path, and the new regression suite covers two representative plan shapes, but one fixpoint bug in the new normalization loop means cascading cleanup can still stop too early.
- Scope/minimality: mostly focused, but the
SessionVariableedit also removes a legacy variable alias unrelated to the optimizer fix. - Concurrency: no concurrency or locking concerns are introduced in these touched FE optimizer/session-variable paths.
- Lifecycle/static-init: no special lifecycle or static initialization issues are involved here.
- Configuration items: no new config was added, but an existing session variable declaration was changed in a user-visible way.
- Compatibility: not preserved. Removing
cbo_cte_inline_modebreaks existingSET/SHOW VARIABLESclients that still use the old alias. - Parallel code paths: the new optimizer-side normalization path is distinct from the main rewrite path, and its fixpoint detection currently relies on shallow
equals()semantics that do not reflect subtree changes. - Special conditions/checks: the new
currentPlan.equals(normalizedPlan)stop condition is not strong enough for Nereids plans. - Test coverage: improved for empty/cascade pruning, but still missing coverage for the shallow-equality fixpoint case and for preserving the legacy session-variable alias.
- Test result updates: one new regression test file was added; I did not run the test suite locally in this runner.
- Observability: no additional observability appears necessary for this change.
- Transaction/persistence/data-write/FE-BE variable passing: not applicable for this PR.
- Performance: no blocking performance issue identified beyond the correctness problems above.
- Other issues: none beyond the two findings below.
User focus: /tmp/review_focus.txt contains no additional focus items, so there were no extra focus-specific findings to report.
### What problem does this PR solve? Issue Number: None Related PR: None Problem Summary: Fix optimizer-side CTE pruning so empty-relation producers, zero-consumer anchors, and cascading inline opportunities are normalized to a fixpoint before memoization, and add regression coverage for empty and cascading CTE elimination. ### Release note None ### Check List (For Author) - Test: No need to test (user requested to skip compile and test) - Behavior changed: Yes (empty and cascade-pruned CTEs are normalized before memoization) - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary: CTE normalization used Plan.equals() as the fixpoint check. Some logical plan nodes do not compare children in equals(), so CTE rewrites under a kept parent node could be missed and cascading inline/prune normalization could stop too early.
### Release note
Fix cascading CTE pruning/inline normalization when child CTE changes occur under logical nodes whose equals() does not compare children.
### Check List (For Author)
- Test: Regression test / FE checkstyle
- ./run-regression-test.sh --run -d query_p0/cte -s test_cbo_cte_inline_prune
- cd fe && mvn checkstyle:check -pl fe-core
- Behavior changed: Yes. CTE normalization now uses explicit rewrite state instead of shallow Plan.equals() for fixpoint detection.
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
/review |
|
run buildall |
There was a problem hiding this comment.
I did not add duplicate inline comments because the remaining blocking issue is already covered by the existing review thread on SessionVariable.java: removing the cbo_cte_inline_mode alias is a user-visible compatibility break and should be restored or intentionally documented with a migration plan. The earlier Plan.equals() fixpoint issue appears addressed by the explicit changed flag in the current head.
Critical checkpoint conclusions:
- Goal/test: The PR targets CTE pruning and cascading inline normalization, with regression coverage added for empty and cascading CTE cases.
- Scope: The optimizer changes are focused, but the alias removal is unrelated to the optimizer fix and should not be included.
- Concurrency/lifecycle: No new concurrent mutable state, locks, or special lifecycle hazards identified.
- Configuration/compatibility: Compatibility issue remains for the removed session-variable alias.
- Parallel code paths: The new normalization is applied to the base rewrite plan and CBO inline alternatives; no distinct missing path found.
- Testing: Regression coverage exists for the main CTE pruning behavior; no additional user focus points were provided.
- Observability/persistence/data writes: Not applicable for this optimizer-only change.
- Performance: No blocking performance issue found in the changed code; the normalization loop is bounded by CTE removal/inlining progress.
Focus response: review_focus.txt had no additional focus points.
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary: CTE normalization now inlines the single-consumer CTE in the repeat output slot shape case, so the expected shape output needed to be updated while query results stay unchanged.
### Release note
None
### Check List (For Author)
- Test: Regression test
- ./run-regression-test.sh --run -d query_p0/repeat -s test_repeat_output_slot
- Behavior changed: No. Test expectation only.
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
run buildall |
1 similar comment
|
run buildall |
TPC-H: Total hot run time: 29234 ms |
TPC-H: Total hot run time: 29546 ms |
TPC-DS: Total hot run time: 170284 ms |
TPC-DS: Total hot run time: 170993 ms |
FE UT Coverage ReportIncrement line coverage |
1 similar comment
FE UT Coverage ReportIncrement line coverage |
|
run buildall |
TPC-H: Total hot run time: 29539 ms |
TPC-DS: Total hot run time: 171356 ms |
FE UT Coverage ReportIncrement line coverage |
|
run buildall |
FE Regression Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29596 ms |
TPC-DS: Total hot run time: 170170 ms |
FE UT Coverage ReportIncrement line coverage |
|
run buildall |
TPC-H: Total hot run time: 29589 ms |
TPC-DS: Total hot run time: 170848 ms |
FE Regression Coverage ReportIncrement line coverage |
What problem does this PR solve?
Issue Number: None
Related PR: #60601
Problem Summary: Fix optimizer-side CTE pruning so empty-relation producers, zero-consumer anchors, and cascading inline opportunities are normalized to a fixpoint before memoization, and add regression coverage for empty and cascading CTE elimination.
Release note
None
Check List (For Author)
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)