[SPARK-57056][SQL] Add `SupportsBranching` DSv2 interface and branching DDL by viirya · Pull Request #56102 · apache/spark

viirya · 2026-05-25T17:58:39Z

What changes were proposed in this pull request?

Add a standard DSv2 mix-in interface SupportsBranching, plus the SQL surface to manage branches:

public interface SupportsBranching extends Table {
    TableBranch createBranch(String name, OptionalLong sourceSnapshotId);
    default TableBranch replaceBranch(String name, OptionalLong sourceSnapshotId);
    boolean dropBranch(String name);
    TableBranch fastForward(String branch, String targetBranch);
    default TableBranch[] listBranches();
}

With companion value type TableBranch(name, snapshotId, creationTimeMs) and SupportsBranching.BranchAlreadyExistsException for the duplicate-create case.

New DDL:

ALTER TABLE t CREATE [OR REPLACE] BRANCH [IF NOT EXISTS] name [AS OF VERSION <integer>]
ALTER TABLE t DROP BRANCH [IF EXISTS] name
ALTER TABLE t FAST FORWARD branch TO target
SHOW BRANCHES (FROM | IN) t

Implementation:

Define the interface and TableBranch value type under sql/catalyst/.../connector/catalog/.
Extend the ANTLR grammar with the four DDL forms; register BRANCH, BRANCHES, FAST, FORWARD as non-reserved keywords; update docs/sql-ref-ansi-compliance.md.
Add logical plans (CreateBranch / DropBranch / FastForwardBranch / ShowBranches) and exec nodes; dispatch through ResolvedTable in DataSourceV2Strategy.
Add DataSourceV2Implicits.asBranchable and QueryCompilationErrors.tableDoesNotSupportBranchingError so non-branching tables fail with a clear message.
Add error condition CREATE_BRANCH_WITH_IF_NOT_EXISTS_AND_REPLACE for the conflicting clauses.
Implement SupportsBranching on InMemoryTable for testing.

Reads and writes against a specific branch (SELECT ... FOR BRANCH 'x', INSERT ... FOR BRANCH 'x') are out of scope here and are added in the follow-up SPARK-57057.

Why are the changes needed?

Apache Iceberg and similar table formats support named branches as a first-class concept, but Spark today only exposes branching through connector-specific SQL extensions (e.g. IcebergSparkSessionExtensions). A standard DSv2 interface lets any data source expose branching through built-in Spark SQL, the same way SupportsDelete / TruncatableTable / SupportsRowLevelOperations standardize their respective capabilities.

Does this PR introduce any user-facing change?

Yes. New SQL DDL is recognized:

ALTER TABLE t CREATE [OR REPLACE] BRANCH [IF NOT EXISTS] name [AS OF VERSION <integer>]
ALTER TABLE t DROP BRANCH [IF EXISTS] name
ALTER TABLE t FAST FORWARD branch TO target
SHOW BRANCHES (FROM | IN) t

Data sources that do not implement SupportsBranching are unaffected — running the new DDL against them raises AnalysisException ("does not support branching"). Four new non-reserved keywords (BRANCH, BRANCHES, FAST, FORWARD) are added; they remain usable as identifiers in non-DDL contexts.

How was this patch tested?

DDLParserSuite: 5 new parser tests covering all four DDL forms and the IF NOT EXISTS / OR REPLACE conflict error.
SupportsBranchingSuite: 12 new integration tests exercising the new DDL end-to-end against InMemoryTable, including positive cases for each operation and negative cases for duplicates, missing branches, fast-forward direction, and non-branching tables.
All 152 existing tests in DDLParserSuite still pass; TableIdentifierParserSuite (7 tests) confirms the new keywords don't regress identifier handling.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.7)

dongjoon-hyun

Could you address two scalastyle check failures?

Scalastyle checks failed at following occurrences:
[error] /__w/spark-1/spark-1/sql/core/src/test/scala/org/apache/spark/sql/connector/SupportsBranchingSuite.scala:154:24: 
[error]       Are you sure that you want to use toUpperCase or toLowerCase without the root locale? In most cases, you
[error]       should use toUpperCase(Locale.ROOT) or toLowerCase(Locale.ROOT) instead.
[error]       If you must use toUpperCase or toLowerCase without the root locale, wrap the code block with
[error]       // scalastyle:off caselocale
[error]       .toUpperCase
[error]       .toLowerCase
[error]       // scalastyle:on caselocale
[error]     
[error] /__w/spark-1/spark-1/sql/core/src/test/scala/org/apache/spark/sql/connector/SupportsBranchingSuite.scala:149:68: Use java.util.Set.of() instead.
[error] Total time: 76 s (0:01:16.0), completed May 25, 2026, 6:42:24 PM

dongjoon-hyun · 2026-05-26T01:50:47Z

+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.NoSuchElementException;


Does this interface need to rely on NoSuchElementException?

Good catch — relying on the generic JDK NoSuchElementException makes the interface semantically weaker and inconsistent with the rest of the DSv2 API (e.g. NoSuchTableException, NoSuchPartitionException).

Fixed in the latest push: dedicated typed exceptions BranchNotFoundException and InvalidFastForwardException are now defined alongside BranchAlreadyExistsException, and the java.util.NoSuchElementException import is gone.

dongjoon-hyun · 2026-05-26T01:52:05Z

+  /**
+   * Thrown when a branch with the requested name already exists.
+   */
+  class BranchAlreadyExistsException extends RuntimeException {


What about something like BranchNotFound or InvalidFastForward?

Done. The latest push adds both BranchNotFoundException and InvalidFastForwardException as nested classes alongside BranchAlreadyExistsException, following the same naming pattern (Exception suffix). Each method's @throws javadoc now references the new typed exception.

dongjoon-hyun · 2026-05-26T01:52:41Z

+  override def fastForward(branchName: String, targetBranchName: String): TableBranch = {
+    val current = branches.get(branchName)
+    if (current == null) {
+      throw new java.util.NoSuchElementException(s"Branch not found: $branchName")


This looks wired to me.

Agreed — switched to the typed BranchNotFoundException introduced in SupportsBranching (per your comment on the interface).

dongjoon-hyun · 2026-05-26T01:53:03Z

+    }
+    val target = branches.get(targetBranchName)
+    if (target == null) {
+      throw new java.util.NoSuchElementException(s"Branch not found: $targetBranchName")


ditto.

Please correct me if this is the standard of DSv2 and Apache Iceberg.

Same fix — switching to BranchNotFoundException. The IllegalArgumentException for the fast-forward direction check is also now InvalidFastForwardException for the same reason.

On the DSv2 / Iceberg convention question:

Spark's DSv2 surface uses typed NoSuch*Exception classes (NoSuchTableException, NoSuchPartitionException, NoSuchNamespaceException) for missing-entity errors; the plain JDK NoSuchElementException is not used.

Iceberg doesn't have a standardized DSv2 interface for branching at all — internally it uses ValidationException / IllegalArgumentException in ManageSnapshots / branching procedures, but that's an Iceberg-internal choice rather than a DSv2 convention. Since this PR is proposing the DSv2 interface, I think we should follow Spark's DSv2 naming here.

… DDL Add a SupportsBranching mix-in interface for DSv2 tables so data sources can expose table branching through standard Spark SQL DDL: ALTER TABLE t CREATE [OR REPLACE] BRANCH [IF NOT EXISTS] name [AS OF VERSION <id>] ALTER TABLE t DROP BRANCH [IF EXISTS] name ALTER TABLE t FAST FORWARD branch TO target SHOW BRANCHES (FROM|IN) t The interface defines createBranch / dropBranch / fastForward / listBranches operations and a TableBranch value type. Reads and writes against a specific branch are not part of this change. Includes parser grammar, logical plans, analyzer dispatch through ResolvedTable, exec nodes, an in-memory implementation on InMemoryTable for testing, and unit + integration tests. Co-authored-by: Claude Code

viirya · 2026-05-26T04:32:35Z

Thanks for the review @dongjoon-hyun!

Force-pushed a13783190d4 addressing all four review comments:

Removed the dependency on java.util.NoSuchElementException; introduced typed BranchNotFoundException and InvalidFastForwardException alongside BranchAlreadyExistsException, following the existing DSv2 NoSuch*Exception naming pattern.
Updated InMemoryTable.fastForward and DropBranchExec to throw the new typed exceptions.
Fixed the two scalastyle errors (toLowerCase without Locale.ROOT and util.Collections.emptySet()) in SupportsBranchingSuite.
Updated the suite to intercept the new typed exceptions.

sql/Test/scalastyle is clean locally and the full SupportsBranchingSuite still passes. PTAL when you get a chance.

dongjoon-hyun

Thank you for updating.

Like TableCatalog interface, please make all newly-added exceptions (e.g. BranchNotFoundException) extend AnalysisException and define them in org.apache.spark.sql.catalyst.analysis package.

spark/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableCatalog.java

Lines 22 to 24 in 5477eea

    
           import org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException; 
        
           import org.apache.spark.sql.catalyst.analysis.NoSuchTableException; 
        
           import org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException;

spark/sql/api/src/main/scala/org/apache/spark/sql/catalyst/analysis/noSuchItemsExceptions.scala

Lines 100 to 105 in 5477eea

    
           class NoSuchTableException private ( 
        
               message: String, 
        
               cause: Option[Throwable], 
        
               errorClass: Option[String], 
        
               messageParameters: Map[String, String]) 
        
               extends AnalysisException(

dongjoon-hyun · 2026-05-26T16:51:35Z

+    | ALTER TABLE identifierReference
+        DROP BRANCH (IF EXISTS)? branch=identifier                     #dropBranch
+    | ALTER TABLE identifierReference
+        FAST FORWARD branch=identifier TO target=identifier            #fastForwardBranch


For SQL syntax:

When we use this as verb, this should be one-word, fast-forward or fastforward.

To be consistent with CREATE BRANCH and DROP BRNCH, we may need to define FASTFORWARD BRANCH like the following.

CREATE BRANCH X DROP BRNCH X FASTFORWARD BRANCH X

viirya mentioned this pull request May 25, 2026

[SPARK-57057][SQL] Target a specific branch on SELECT / INSERT via temporal clause and session config #56103

Open

viirya force-pushed the SPARK-57056 branch from 42d537e to a41eada Compare May 25, 2026 18:22

dongjoon-hyun reviewed May 26, 2026

View reviewed changes

viirya force-pushed the SPARK-57056 branch from a41eada to 9043974 Compare May 26, 2026 00:44

dongjoon-hyun changed the title ~~[SPARK-57056][SQL] Add SupportsBranching DSv2 interface and branching DDL~~ [SPARK-57056][SQL] Add SupportsBranching DSv2 interface and branching DDL May 26, 2026

dongjoon-hyun reviewed May 26, 2026

View reviewed changes

viirya force-pushed the SPARK-57056 branch from 9043974 to d4a1263 Compare May 26, 2026 03:50

viirya force-pushed the SPARK-57056 branch from d4a1263 to a137831 Compare May 26, 2026 04:31

dongjoon-hyun reviewed May 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SPARK-57056][SQL] Add `SupportsBranching` DSv2 interface and branching DDL#56102

[SPARK-57056][SQL] Add `SupportsBranching` DSv2 interface and branching DDL#56102
viirya wants to merge 1 commit into
apache:masterfrom
viirya:SPARK-57056

viirya commented May 25, 2026

Uh oh!

dongjoon-hyun left a comment

Uh oh!

dongjoon-hyun May 26, 2026

Uh oh!

viirya May 26, 2026

Uh oh!

dongjoon-hyun May 26, 2026

Uh oh!

viirya May 26, 2026

Uh oh!

dongjoon-hyun May 26, 2026

Uh oh!

viirya May 26, 2026

Uh oh!

dongjoon-hyun May 26, 2026

Uh oh!

viirya May 26, 2026

Uh oh!

viirya commented May 26, 2026

Uh oh!

dongjoon-hyun left a comment

Uh oh!

dongjoon-hyun May 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants


		package org.apache.spark.sql.connector.catalog;

		import java.util.NoSuchElementException;

	import org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException;
	import org.apache.spark.sql.catalyst.analysis.NoSuchTableException;
	import org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException;

	class NoSuchTableException private (
	message: String,
	cause: Option[Throwable],
	errorClass: Option[String],
	messageParameters: Map[String, String])
	extends AnalysisException(

Conversation

viirya commented May 25, 2026

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

Uh oh!

dongjoon-hyun left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

viirya commented May 26, 2026

Uh oh!

dongjoon-hyun left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants