fix(EnsureRequirements): remap sort requirement through ProjectionExec on pushdown#23199
fix(EnsureRequirements): remap sort requirement through ProjectionExec on pushdown#23199Jeadie wants to merge 2 commits into
EnsureRequirements): remap sort requirement through ProjectionExec on pushdown#23199Conversation
…n pushdown
When EnsureRequirements relocates a fetch-bearing SortExec below an
order-preserving ProjectionExec, sort_pushdown forwarded the parent
ordering requirement to the child unchanged. A ProjectionExec reports a
fetch() (forwarded from its input) and can renumber/reorder columns, so
the requirement -- expressed in the projection's output schema -- was
pushed into the child schema verbatim. For a projection that reorders the
sort key (e.g. output [id, score, value] over input [id, value, score]),
score@1 (valid above) is pushed down as score@1, which is a different
column below. The relocated SortExec then advertises an ordering its child
cannot provide and SanityCheckPlan rejects the plan:
... does not satisfy order requirements: [score@1 DESC NULLS LAST,
id@0 ASC]. Child-0 order: []
Remap the ordering requirement through the projection's column mapping
before pushing it down (new remap_requirement_through_projection helper).
Each required column at output index i is rewritten to the column the
projection produces at that index (projection.expr()[i]). If a required
column maps to a computed (non-Column) expression, the ordering cannot be
expressed below the projection, so pushdown is declined and the sort is
kept above it. For non-reordering projections the remap is the identity,
so existing plans are unchanged.
Adds unit tests for the remap helper (reorder, softness, multiple
alternatives, computed-column decline) and an end-to-end regression test
asserting the optimized plan passes SanityCheckPlan.
Signed-off-by: jeadie <jack@spice.ai>
Worked exampleBefore
|
EnsureRequirements): remap sort requirement through ProjectionExec on pushdown
Signed-off-by: jeadie <jack@spice.ai>
|
@zhuqi-lucas could you help review this PR? |
zhuqi-lucas
left a comment
There was a problem hiding this comment.
LGTM
Nit: this fix completes the TODO for the fetch-bearing branch but the non-fetch case (|| plan.is::() short-circuit in the second pushdown check) is still declining pushdown. The new comment frames that as intentional, but worth either (a) keeping the TODO marker for the non-fetch case, or (b) explicitly noting why non-fetch projection pushdown is not pursued. Otherwise a future reader could mistake "TODO removed" for "feature complete."
alamb
left a comment
There was a problem hiding this comment.
Thanks @Jeadie and @zhuqi-lucas
@Jeadie could you please help address @zhuqi-lucas 's comments and then I'll plan to merge this PR?
Which issue does this PR close?
None (Happy to open a tracking issue if preferred).
Rationale for this change
EnforceSorting/EnsureRequirementscan produce a physically invalid plan thatSanityCheckPlanrejects:Root cause. In
sort_pushdown.rs::pushdown_requirement_to_children, the fetch-forwarding branchforwards the parent ordering requirement to the child unchanged. A
ProjectionExecreports afetch()(forwarded from its input) and can renumber/reorder columns, so the requirement, expressed in the projection's output schema, is pushed into the child schema verbatim. For a projection that reorders the sort key (e.g. output[id, score, value]over input[id, value, score]),score@1(valid above) is pushed down asscore@1(a different column below). The relocatedSortExecthen advertises an ordering its child cannot provide, andSanityCheckPlanfails.This reproduces whenever a fetch-bearing global sort sits above a
CoalescePartitionsExecover a column-reordering, order-preservingProjectionExecwhose input is already ordered,parallelize_sortssinks the per-partitionSortExecbelow the projection without remapping the key index.What changes are included in this PR?
Remap the ordering requirement through the projection's column mapping before pushing it down (new
remap_requirement_through_projectionhelper):iis rewritten to the column the projection produces at that index (projection.expr()[i]).Column) expression, the ordering cannot be expressed below the projection, so pushdown is declined and the sort is kept above the projection.OrderingRequirementsare preserved; unsatisfiable alternatives are dropped.For non-reordering projections the remap is the identity, so existing plans are unchanged.
Are these changes tested?
Yes.
test_parallelize_sorts_remaps_index_through_reordering_projection) builds the multi-partition reordering-projection plan, runsEnsureRequirementswithrepartition_sortsenabled, and asserts the result passesSanityCheckPlan. It fails without this fix and passes with it.Are there any user-facing changes?
No API changes. Plans that previously failed
SanityCheckPlan(or produced incorrect orderings) for this shape are now valid; all other plans are unchanged.