Support full outer join #10947

Open: windtalker wants to merge 2 commits into pingcap:master from windtalker:support_full_outer_join

Conversation

windtalker (Contributor) commented Jul 2, 2026

What problem does this PR solve?

Issue Number: close #10777

Problem Summary:

Support FULL OUTER JOIN pushdown to TiFlash for equi-join cases.

What is changed and how it works?

Support full outer join
Translate full outer join note to English

This PR adds TiFlash support for FULL OUTER JOIN in the hash join path with non-empty equi join keys. It wires the DAG join type mapping, nullable schema handling, condition validation, and execution paths needed by FULL OUTER JOIN.

Key changes:

  • Add FULL OUTER JOIN mapping from TiDB DAG join type to ASTTableJoin::Kind::Full.
  • Make both left-side and right-side output columns nullable for FULL OUTER JOIN.
  • Allow FULL OUTER JOIN to carry left/right conditions.
  • Fix the correctness of FULL OUTER JOIN with other conditions by using the row-flagged hash map, so that a build-side row is marked as used only after the other condition passes.
  • Add targeted gtests and an English implementation note.
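
The row-flag bullet above can be illustrated with a minimal, standalone sketch. It is not the TiFlash implementation: the `Row` type, the `fullOuterHashJoin` function, and the choice of the left side as the probe side are all assumptions made for illustration, and NULL join keys are not modeled. The point it shows is that a build row is flagged as used only when the other condition passes, so a build row whose key matches but whose other condition fails is still emitted null-padded by the post-probe scan.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical row type for illustration; not a TiFlash structure.
struct Row
{
    int key;
    int value;
};

// One output row; either side may be NULL (std::nullopt) in a full outer join.
using JoinedRow = std::pair<std::optional<Row>, std::optional<Row>>;

// Assumption: probe = left side, build = right side.
// other_cond is the non-equi condition evaluated after the key match.
std::vector<JoinedRow> fullOuterHashJoin(
    const std::vector<Row> & probe,
    const std::vector<Row> & build,
    const std::function<bool(const Row &, const Row &)> & other_cond)
{
    // Build phase: key -> indexes of build rows, plus one "used" flag per build row.
    std::unordered_map<int, std::vector<std::size_t>> table;
    for (std::size_t i = 0; i < build.size(); ++i)
        table[build[i].key].push_back(i);
    std::vector<bool> used(build.size(), false);
    assert(used.size() == build.size());

    std::vector<JoinedRow> out;
    // Probe phase.
    for (const Row & l : probe)
    {
        bool matched = false;
        auto it = table.find(l.key);
        if (it != table.end())
        {
            for (std::size_t idx : it->second)
            {
                // Flag the build row only after the other condition passes.
                if (other_cond(l, build[idx]))
                {
                    used[idx] = true;
                    matched = true;
                    out.emplace_back(l, build[idx]);
                }
            }
        }
        if (!matched)
            out.emplace_back(l, std::nullopt); // unmatched probe row, right side NULL
    }
    // Post-probe scan: emit build rows that were never flagged, left side NULL.
    for (std::size_t i = 0; i < build.size(); ++i)
        if (!used[i])
            out.emplace_back(std::nullopt, build[i]);
    return out;
}
```

If the flag were set on key match alone, a build row that fails the other condition for every probe row would be dropped from the result instead of appearing with a NULL left side.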

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Support FULL OUTER JOIN pushdown to TiFlash for equi-join cases.

Summary by CodeRabbit

  • New Features

    • Added support for full outer joins in query processing, including correct handling of unmatched rows and join-side nullability.
    • Full outer joins now work with additional join conditions and are rejected when used without join keys.
  • Bug Fixes

    • Fixed full outer join behavior so rows are preserved correctly during probe and post-processing.
    • Improved join output and schema handling for nullable columns across full outer join paths.
  • Tests

    • Added coverage for full outer join execution, spill behavior, schema nullability, and validation cases.

close pingcap#10777

Support full outer join

- support full outer join protocol / join kind plumbing
- guard unsupported cartesian full outer join cases
- make full join output schemas nullable where needed
- support full join with non-equal other conditions
- fix full join other-condition execution path
- add targeted tests and design notes

Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
ti-chi-bot (bot) added the release-note and do-not-merge/needs-triage-completed labels Jul 2, 2026
ti-chi-bot (bot, Contributor) commented Jul 2, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign searise for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot (bot) added the size/XXL label Jul 2, 2026
coderabbitai (bot) commented Jul 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0935459a-5ae4-4720-8c22-f91f47d34ebf

📥 Commits

Reviewing files that changed from the base of the PR and between 63cff9a and f51e263.

📒 Files selected for processing (14)
  • dbms/src/DataStreams/ScanHashMapAfterProbeBlockInputStream.cpp
  • dbms/src/Debug/MockExecutor/JoinBinder.cpp
  • dbms/src/Flash/Coprocessor/DAGUtils.cpp
  • dbms/src/Flash/Coprocessor/JoinInterpreterHelper.cpp
  • dbms/src/Flash/Coprocessor/JoinInterpreterHelper.h
  • dbms/src/Flash/Coprocessor/collectOutputFieldTypes.cpp
  • dbms/src/Flash/Coprocessor/tests/gtest_join_get_kind_and_build_index.cpp
  • dbms/src/Flash/tests/gtest_join_executor.cpp
  • dbms/src/Flash/tests/gtest_spill_join.cpp
  • dbms/src/Interpreters/Join.cpp
  • dbms/src/Interpreters/JoinPartition.cpp
  • dbms/src/Interpreters/JoinUtils.h
  • dbms/src/TestUtils/tests/gtest_mock_executors.cpp
  • docs/note/fullouter_join.md

📝 Walkthrough

This PR adds full outer join support to TiFlash: protocol/join-kind mapping for TypeFullOuterJoin, nullable schema propagation for output and other-condition columns, validation changes allowing left/right non-equal conditions under full joins, row-flagged hash map execution correctness for full joins with other conditions, plus tests and a design note.
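
As a rough illustration of the nullable schema propagation mentioned in the walkthrough, the sketch below derives output column nullability from the join kind. The helper names `makeLeftJoinSideNullable` and `makeRightJoinSideNullable` appear in the PR summary, but their signatures here, the `JoinKind` enum, `ColumnInfo`, and `buildJoinOutputSchema` are assumptions for illustration only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for ASTTableJoin::Kind; only four kinds are modeled.
enum class JoinKind { Inner, Left, Right, Full };

// Hypothetical column description: a name plus a nullability flag.
struct ColumnInfo
{
    std::string name;
    bool nullable;
};

// Helper names come from the PR summary; these signatures are assumptions.
bool makeLeftJoinSideNullable(JoinKind kind)
{
    return kind == JoinKind::Right || kind == JoinKind::Full;
}

bool makeRightJoinSideNullable(JoinKind kind)
{
    return kind == JoinKind::Left || kind == JoinKind::Full;
}

// Concatenate left and right columns, widening nullability per join kind.
std::vector<ColumnInfo> buildJoinOutputSchema(
    JoinKind kind,
    const std::vector<ColumnInfo> & left,
    const std::vector<ColumnInfo> & right)
{
    std::vector<ColumnInfo> output;
    output.reserve(left.size() + right.size());
    for (ColumnInfo col : left)
    {
        col.nullable = col.nullable || makeLeftJoinSideNullable(kind);
        output.push_back(col);
    }
    for (ColumnInfo col : right)
    {
        col.nullable = col.nullable || makeRightJoinSideNullable(kind);
        output.push_back(col);
    }
    assert(output.size() == left.size() + right.size());
    return output;
}
```

Routing every schema-building site through the same two predicates is what lets a new kind such as Full be handled in one place rather than in scattered join-type checks.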

Changes

Full Outer Join Support

| Layer | File(s) | Summary |
| --- | --- | --- |
| Join type mapping and validation guards | dbms/src/Flash/Coprocessor/DAGUtils.cpp, dbms/src/Flash/Coprocessor/JoinInterpreterHelper.cpp, dbms/src/Flash/Coprocessor/JoinInterpreterHelper.h | Maps TypeFullOuterJoin to ASTTableJoin::Kind::Full, rejects cartesian full outer join (empty join keys), returns "FullOuterJoin" name, and allows Full kind in left/right filter validation. |
| Nullable schema propagation | dbms/src/Flash/Coprocessor/JoinInterpreterHelper.cpp, .h, dbms/src/Flash/Coprocessor/collectOutputFieldTypes.cpp, dbms/src/Debug/MockExecutor/JoinBinder.cpp | Introduces makeLeftJoinSideNullable/makeRightJoinSideNullable helpers and uses them consistently in output column generation, other-condition column generation, field-type collection, and mock join schema building, replacing scattered join-type checks. |
| Row-flagged map execution correctness | dbms/src/Interpreters/JoinUtils.h, dbms/src/Interpreters/JoinPartition.cpp, dbms/src/Interpreters/Join.cpp, dbms/src/DataStreams/ScanHashMapAfterProbeBlockInputStream.cpp | Treats Full as requiring row-flagged hash maps, adds addNotFoundForFull handling for unmatched probe rows, adjusts handleOtherConditions to keep unmatched rows and null-pad for Full, fixes used-entry marking to skip null pointers, and dispatches Full in the post-probe scan. |
| Tests and design note | dbms/src/Flash/Coprocessor/tests/gtest_join_get_kind_and_build_index.cpp, dbms/src/Flash/tests/gtest_join_executor.cpp, dbms/src/Flash/tests/gtest_spill_join.cpp, dbms/src/TestUtils/tests/gtest_mock_executors.cpp, docs/note/fullouter_join.md | Adds tests covering join-kind mapping, nullability, validation, executor correctness (with/without other conditions, spill), mock schema construction, plus a design/checklist document. |
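
The first row of the table (join type mapping plus the cartesian guard) might look roughly like the sketch below. Only the names `TypeFullOuterJoin` and `Kind::Full` come from the PR text; the `DagJoinType` enum, its other enumerators, the `toJoinKind` function, and the error message are hypothetical. The sketch also ignores how the real code treats empty join keys for the other join kinds.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical mirror of the DAG protocol join type; only TypeFullOuterJoin
// is named in the PR, the other enumerators are assumed for illustration.
enum class DagJoinType { TypeInnerJoin, TypeLeftOuterJoin, TypeRightOuterJoin, TypeFullOuterJoin };

// Hypothetical mirror of ASTTableJoin::Kind.
enum class JoinKind { Inner, Left, Right, Full };

// Map the protocol join type to a join kind, rejecting a full outer join
// that has no equi-join keys (a cartesian full outer join).
JoinKind toJoinKind(DagJoinType type, std::size_t join_key_count)
{
    switch (type)
    {
    case DagJoinType::TypeInnerJoin:
        return JoinKind::Inner;
    case DagJoinType::TypeLeftOuterJoin:
        return JoinKind::Left;
    case DagJoinType::TypeRightOuterJoin:
        return JoinKind::Right;
    case DagJoinType::TypeFullOuterJoin:
        if (join_key_count == 0)
            throw std::invalid_argument("cartesian full outer join is not supported");
        return JoinKind::Full;
    }
    throw std::logic_error("unknown join type");
}
```

Rejecting the empty-key case at mapping time keeps the hash join execution path free of a case it cannot serve, since the PR scopes support to joins with non-empty equi-join keys.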

Estimated code review effort: 4 (Complex) | ~75 minutes

Suggested labels: approved, lgtm

Suggested reviewers: JaySon-Huang, Lloyd-Pottiger

Poem

A rabbit hops through joins both left and right,
Now FULL OUTER blooms in TiFlash's light 🐇
Null-padded rows, no row left behind,
Row-flagged maps keep every kind aligned.
With tests and docs to guide the way—
Hop, hop, hooray for join day! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title is concise and accurately summarizes the main change: adding FULL OUTER JOIN support. |
| Description check | ✅ Passed | The description follows the template well, with problem summary, changes, checklist, and release note filled in. |
| Linked Issues check | ✅ Passed | The PR implements the #10777 scope: full outer join plumbing, cartesian guard, nullable schemas, other conditions, execution fixes, and tests. |
| Out of Scope Changes check | ✅ Passed | The code changes appear focused on full outer join support and related tests/docs, with no clear unrelated additions. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

windtalker (Contributor, Author) commented:

/run-check-issue-triage-complete

ti-chi-bot (bot, Contributor) commented Jul 2, 2026

@windtalker: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-sanitizer-tsan | f51e263 | link | false | /test pull-sanitizer-tsan |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

windtalker (Contributor, Author) commented:

/run pull-integration-test

windtalker (Contributor, Author) commented:

/test pull-unit-test

windtalker (Contributor, Author) commented:

/test pull-integration-test

Labels

release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support full outer join

1 participant