
feat: support extension planner for TableScan #20548

Open
linhr wants to merge 2 commits into apache:main from lakehq:table-scan-planner

@linhr
Contributor

@linhr linhr commented Feb 25, 2026

Which issue does this PR close?

Rationale for this change

Please refer to the issue for context. This PR serves as a proof-of-concept and we can consider merging it if we reach consensus on the design discussed in the issue.

What changes are included in this PR?

The trait method ExtensionPlanner::plan_table_scan() is added so that the user can define physical planning logic for custom table sources.

Are these changes tested?

The changes are accompanied by unit tests.

Are there any user-facing changes?

Yes, a new trait method is added to ExtensionPlanner. This is not a breaking change since the trait method has a default implementation.
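
To illustrate why a defaulted trait method is non-breaking, here is a minimal standalone sketch. The types below (`PhysicalPlan`, `TableScan`, the synchronous signatures, and the `LegacyPlanner` / `CustomScanPlanner` implementors) are simplified, hypothetical stand-ins and not DataFusion's actual definitions; the real method is async and takes the planner and session state as well.

```rust
// Simplified stand-ins for illustration only; NOT DataFusion's real types.
#[derive(Debug, PartialEq)]
struct PhysicalPlan(String);

struct TableScan {
    table_name: String,
}

trait ExtensionPlanner {
    // Stand-in for the pre-existing required method.
    fn plan_extension(&self, node: &str) -> Option<PhysicalPlan>;

    // Newly added method with a default body: implementors written before
    // it existed still compile and simply decline to plan table scans.
    fn plan_table_scan(&self, _scan: &TableScan) -> Option<PhysicalPlan> {
        None
    }
}

// Hypothetical implementor that predates the new method.
struct LegacyPlanner;

impl ExtensionPlanner for LegacyPlanner {
    fn plan_extension(&self, node: &str) -> Option<PhysicalPlan> {
        Some(PhysicalPlan(format!("legacy:{node}")))
    }
}

// Hypothetical implementor that opts in to planning custom table scans.
struct CustomScanPlanner;

impl ExtensionPlanner for CustomScanPlanner {
    fn plan_extension(&self, _node: &str) -> Option<PhysicalPlan> {
        None
    }

    fn plan_table_scan(&self, scan: &TableScan) -> Option<PhysicalPlan> {
        Some(PhysicalPlan(format!("custom_scan:{}", scan.table_name)))
    }
}
```

In this sketch, `LegacyPlanner` needs no source changes after the method is added, while `CustomScanPlanner` overrides it to claim the scan.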

@github-actions github-actions bot added the core Core DataFusion crate label Feb 25, 2026
@goldmedal
Contributor

Thanks @linhr for working on this! The overall approach looks clean and well-scoped.

One design question: would it be better to try the standard source_as_provider path first, and only fall back to extension planners when the TableSource is not a DefaultTableSource? Something like:

LogicalPlan::TableScan(scan) => {
    if let Some(default_source) = scan.source
        .as_any()
        .downcast_ref::<DefaultTableSource>()
    {
        // existing TableProvider scan logic
    } else {
        // try extension planners for custom TableSource
        for planner in &self.extension_planners {
            if let Some(plan) = planner
                .plan_table_scan(self, scan, session_state)
                .await?
            {
                return Ok(plan);
            }
        }
        plan_err!("No installed planner was able to plan TableScan for custom TableSource: {:?}", scan.table_name)
    }
}

Rationale:
- It matches the motivation in #20547 more precisely: extension planners handle custom TableSources that are not TableProviders.
- It avoids the overhead of iterating over extension planners for the common DefaultTableSource case.
- It prevents extension planners from accidentally intercepting normal TableProvider scans.
The tradeoff is that this wouldn't allow overriding planning for TableProvider-backed sources, but that seems like a separate use case with different considerations.
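
For a concrete picture of the ordering, the following standalone sketch models the dispatch with simplified stand-ins. Everything here is hypothetical and synchronous (`plan_scan`, `CustomSource`, `GreedyPlanner`, string-valued plans and errors); it is not DataFusion's API, only a model of "default source first, extension planners as fallback".

```rust
use std::any::Any;

// Simplified stand-ins for illustration only; NOT DataFusion's real API.
trait TableSource {
    fn as_any(&self) -> &dyn Any;
}

struct DefaultTableSource;
struct CustomSource;

impl TableSource for DefaultTableSource {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl TableSource for CustomSource {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

trait ExtensionPlanner {
    // Returns Ok(None) when this planner declines the scan.
    fn plan_table_scan(&self, table_name: &str) -> Result<Option<String>, String>;
}

// Hypothetical planner that claims every scan it is offered.
struct GreedyPlanner;

impl ExtensionPlanner for GreedyPlanner {
    fn plan_table_scan(&self, table_name: &str) -> Result<Option<String>, String> {
        Ok(Some(format!("greedy:{table_name}")))
    }
}

fn plan_scan(
    source: &dyn TableSource,
    table_name: &str,
    planners: &[Box<dyn ExtensionPlanner>],
) -> Result<String, String> {
    // Standard path first: extension planners are never consulted here.
    if source.as_any().downcast_ref::<DefaultTableSource>().is_some() {
        return Ok(format!("provider_scan:{table_name}"));
    }
    // Fallback: the first planner that returns a plan wins.
    for planner in planners {
        if let Some(plan) = planner.plan_table_scan(table_name)? {
            return Ok(plan);
        }
    }
    Err(format!(
        "No installed planner was able to plan TableScan for custom TableSource: {table_name}"
    ))
}
```

Even with a planner that accepts everything, the `DefaultTableSource` case in this model is planned by the standard path, which is the "no accidental interception" property described above.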

Also, a minor note: the existing plan_extension path validates that the returned ExecutionPlan schema matches the LogicalPlan schema (around line 1649). It might be worth adding a similar check here to catch mismatches early.
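
A rough standalone sketch of such a check, assuming strict field-by-field equality (the existing plan_extension check may compare schemas differently, and `Field`, `field`, and `check_scan_schema` are hypothetical names, not DataFusion types):

```rust
// Simplified stand-in for a schema field; NOT Arrow's or DataFusion's type.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
}

// Hypothetical helper to build a field.
fn field(name: &str, data_type: &str) -> Field {
    Field {
        name: name.to_string(),
        data_type: data_type.to_string(),
    }
}

// Compare the logical scan schema with the schema of the plan returned by
// an extension planner, reporting the first difference found.
fn check_scan_schema(logical: &[Field], physical: &[Field]) -> Result<(), String> {
    if logical.len() != physical.len() {
        return Err(format!(
            "schema mismatch: expected {} fields, got {}",
            logical.len(),
            physical.len()
        ));
    }
    for (i, (l, p)) in logical.iter().zip(physical).enumerate() {
        if l != p {
            return Err(format!(
                "schema mismatch at field {i}: expected {l:?}, got {p:?}"
            ));
        }
    }
    Ok(())
}
```

Running a check like this right after an extension planner returns a plan would surface a mismatch at planning time rather than at execution time.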

What do you think?

Development

Successfully merging this pull request may close these issues.

Support extension planners for TableScan
