Gene.bordegaray/2026/02/partition index dynamic filters #20331
gene-bordegaray wants to merge 17 commits into apache:main from …02/dyn_filter_partition_indexed
Conversation
- HashJoinExec: mode=Partitioned, join_type=Inner, on=[(b@0, a@0)]
- RepartitionExec: partitioning=Hash([b@0], 1), input_partitions=1
- HashJoinExec: mode=Partitioned, join_type=Inner, on=[(c@1, d@0)]
- DataSourceExec: file_groups={1 group: [[test.parquet]]}, projection=[b, c, y], file_type=test, pushdown_supported=true
no dynamic filter because it's the build side of a build side... took me a second 😂
// its own filter.
predicate = predicate
    .map(|p| snapshot_physical_expr_for_partition(p, partition_index))
    .transpose()?;
Decided to only do this in the Parquet opener; doing it for all file sources by default would just do nothing, since predicates aren't passed to other openers. This does mean that users will have to implement this for their own data sources.
Given this is a large PR, I didn't want to include logic for a fallback, and doing nothing seemed out of place; could still reconsider if others have an opinion.
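To make the opener-side behavior concrete, here is a minimal, self-contained sketch of the idea. The `Expr` model and all names are illustrative stand-ins, not DataFusion's actual `PhysicalExpr`/`DynamicFilterPhysicalExpr` types: walk the predicate tree and replace each dynamic node with that partition's filter.

```rust
// Simplified stand-in for a physical expression tree; the real DataFusion
// type is `Arc<dyn PhysicalExpr>` and the dynamic node is
// `DynamicFilterPhysicalExpr`. All names here are illustrative.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    // A plain (already concrete) predicate, e.g. "b > 5".
    Literal(String),
    // A dynamic filter: one optional expression per partition.
    Dynamic(Vec<Option<Expr>>),
    And(Box<Expr>, Box<Expr>),
}

// Replace every Dynamic node with the filter for `partition`,
// falling back to `true` when no per-partition data exists
// (mirrors the lit(true) fallback discussed in this thread).
fn snapshot_for_partition(expr: &Expr, partition: usize) -> Expr {
    match expr {
        Expr::Literal(s) => Expr::Literal(s.clone()),
        Expr::Dynamic(parts) => match parts.get(partition).cloned().flatten() {
            Some(filter) => filter,
            None => Expr::Literal("true".to_string()),
        },
        Expr::And(l, r) => Expr::And(
            Box::new(snapshot_for_partition(l, partition)),
            Box::new(snapshot_for_partition(r, partition)),
        ),
    }
}

fn main() {
    let predicate = Expr::And(
        Box::new(Expr::Literal("b > 5".into())),
        Box::new(Expr::Dynamic(vec![
            Some(Expr::Literal("a >= 1 AND a <= 3".into())),
            Some(Expr::Literal("a >= 7 AND a <= 9".into())),
        ])),
    );
    // Partition 1 only sees its own bounds.
    println!("{:?}", snapshot_for_partition(&predicate, 1));
}
```

A custom data source would do the analogous replacement in its own opener before evaluating the predicate.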
LiaCastaneda
left a comment
This makes sense to me and will be very helpful for use cases where we want to avoid repartitioning data. My only concern is that API users would need to align the probe and build side partitions, but this seems like a reasonable tradeoff. Let's see what other contributors think. (This is a partial review; I will finish later today or early next week.) But so far it's looking good to me :)
// One side starts with multiple partitions while target is 1. EnforceDistribution inserts a
// hash repartition on the left child. The partitioning schemes are now misaligned:
// - Left: hash-repartitioned (repartitioned=true)
// - Right: file-grouped (repartitioned=false)
// This is a correctness bug, so we expect an error.
Can we have the other way around as well? Having a join of type Partitioned with the left preserving file partitioning and the right having a RepartitionExec.
let optimized = ensure_distribution_helper_transform_up(join, 1)?;
assert_plan!(optimized, @r"
HashJoinExec: mode=Partitioned, join_type=Inner, on=[(a@0, a@1)]
DataSourceExec: file_groups={1 group: [[x]]}, projection=[a, b, c, d, e], file_type=parquet
Would it make sense to display whether DataSourceExec is preserving partitioning? Something like preserve_partitioning=[bool]? This may be useful for users to know why there is no RepartitionExec in the plan even if the mode is Partitioned.
I think this can be a separate PR, but can create an issue 😄
💯 thank you for the reviews
LiaCastaneda
left a comment
👍 I think I'm done with my review; overall looks good, just some minor comments.
We need to make sure that in the future it's easy to revert or migrate users away from index-based routing to their custom Partitioning implementation. Since this does not introduce a new API, I don't think it should be a problem. This was previously a bug, and with this PR dynamic filtering works, but it's something to keep in mind.
NGA-TRAN
left a comment
The approach looks great, Gene. Nice work!
I do have some suggestions on comments and test data to make things clearer for reviewers and future maintenance.
datafusion/core/tests/physical_optimizer/enforce_distribution.rs
config.optimizer.enable_round_robin_repartition = false;
config.optimizer.repartition_file_scans = false;
config.optimizer.repartition_file_min_size = 1024;
config.optimizer.prefer_existing_sort = false;
It will be clearer if you add comments explaining why you need these settings and for which tests.
Right, usually when users decide to do these custom partitions, they must have a mechanism to enforce it. Thus, I do not think we need to worry about this at this dynamic filtering stage. We only need to provide a way to use dynamic filtering correctly, which is the purpose of this PR.
/// Per-partition filter expressions indexed by partition number.
type PartitionedFilters = Vec<Option<Arc<dyn PhysicalExpr>>>;
Just one more question -- how would evaluation be done for PartitionedFilters?
My understanding is that each partition would need to first access its corresponding PhysicalExpr and then call evaluate(), right? However, the evaluate() method of the PhysicalExpr trait has no partition number in its args, so evaluate() can't directly integrate PartitionedFilters.
The current evaluate() function remains the same and evaluates inner.expr, which, when we preserve file partitioning, holds nothing (just a lit(true) placeholder).
Snapshotting happens before evaluation.
The full path in chronological order is:
- ParquetOpener::open() is called
- snapshot_physical_expr_for_partition(predicate, partition_index) is called -> important to note that we pass the index
- snapshot_physical_expr_for_partition replaces the DynamicFilterPhysicalExpr with the filter for the partition at that index (this is a physical expr)
- evaluate() is called, which uses the snapshot expr (not the DynamicFilterPhysicalExpr); we don't need to know the partition parameter because it was already dealt with earlier
For the lit(true) concern: if has_partitioned_filters() returns false during snapshotting, then we fall back to lit(true), in which case we won't filter anything. But this is OK behavior; it's better to do this than to error.
This shouldn't happen, though, because we wait until the build side is complete before we snapshot, so we should always resolve to the real filter.
Maybe to be safe it would be good to add a debug statement if has_partitioned_filters() returns false.
Lmk if this makes sense 🙂
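The chronological path above can be sketched with a toy model (all types and names here are illustrative, not the real DataFusion API): the snapshot resolves the partition index up front, so evaluation afterwards needs no partition argument.

```rust
// Minimal model of "snapshot happens before evaluation": the opener resolves
// the per-partition filter once, and evaluation never sees a partition index.
#[derive(Clone)]
enum Filter {
    True,                       // lit(true) placeholder
    Range { lo: i64, hi: i64 }, // a partition-specific bound
}

struct DynamicFilter {
    partitioned: Vec<Filter>, // one filter per build partition
}

impl DynamicFilter {
    fn has_partitioned_filters(&self) -> bool {
        !self.partitioned.is_empty()
    }
    // Steps 2-3: snapshot for this partition index (fallback: lit(true)).
    fn snapshot_for_partition(&self, partition: usize) -> Filter {
        if self.has_partitioned_filters() {
            self.partitioned[partition].clone()
        } else {
            Filter::True
        }
    }
}

// Step 4: evaluate() on the snapshot; no partition argument is needed.
fn evaluate(filter: &Filter, value: i64) -> bool {
    match filter {
        Filter::True => true,
        Filter::Range { lo, hi } => *lo <= value && value <= *hi,
    }
}

fn main() {
    let df = DynamicFilter {
        partitioned: vec![
            Filter::Range { lo: 0, hi: 9 },
            Filter::Range { lo: 10, hi: 19 },
        ],
    };
    let snap = df.snapshot_for_partition(1); // the opener passes the index here
    println!("{}", evaluate(&snap, 12)); // 12 falls inside partition 1's bounds
}
```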
Ah got it, I missed the fact that this was already integrated into the parquet-source opener and thought evaluation was different. It's clear now and makes sense to me, thanks for explaining!
LiaCastaneda
left a comment
This lgtm 👍 it’s just missing a committer’s review.
/// CASE-hash routing, but this assumes build/probe partition indices stay aligned for
/// partition hash join / dynamic filter consumers.
///
/// Misaligned Partitioned Hash Join Example:
Is this something DataFusion users have to think about? I.e. is this something a user can mess up or would it only happen if there was a bug in DataFusion?
Ya, this can be user-triggered, not only a DF bug. If preserve_file_partitions is enabled and the two join sides are not partition-aligned by value/index, partition-index routing is unsafe.
A user can declare that their file partitioning should be preserved, but if their data isn't actually partitioned that way, this will be a bug on the part of the user.
We now error in EnforceDistribution for incompatible schemes instead of silently allowing incorrect results.
fn should_use_partition_index(&self) -> bool {
    matches!(self.mode, PartitionMode::Partitioned)
        && self.dynamic_filter_routing_mode
            == DynamicFilterRoutingMode::PartitionIndex
Would these two ever not match?
Yes: if there is a partitioned hash join and both sides contain a repartition -> it would use CaseHash.
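A minimal sketch of that decision, using stand-in enums rather than the real DataFusion types, showing the case where the two conditions do not match:

```rust
// Illustrative model of the routing decision quoted above: partition-index
// routing needs BOTH a Partitioned join and a routing mode of PartitionIndex.
// A RepartitionExec on either side would flip the mode to CaseHash.
#[derive(PartialEq)]
enum PartitionMode {
    Partitioned,
    CollectLeft,
}

#[derive(PartialEq)]
enum DynamicFilterRoutingMode {
    PartitionIndex,
    CaseHash,
}

fn should_use_partition_index(
    mode: &PartitionMode,
    routing: &DynamicFilterRoutingMode,
) -> bool {
    matches!(mode, PartitionMode::Partitioned)
        && *routing == DynamicFilterRoutingMode::PartitionIndex
}

fn main() {
    // Partitioned join, but a repartition forced CaseHash: the two don't match.
    println!(
        "{}",
        should_use_partition_index(
            &PartitionMode::Partitioned,
            &DynamicFilterRoutingMode::CaseHash
        )
    ); // false
}
```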
/// Snapshot a `PhysicalExpr` tree, replacing any [`DynamicFilterPhysicalExpr`] that
/// has per-partition data with its partition-specific filter expression.
/// If a `DynamicFilterPhysicalExpr` does not have partitioned data, it is left unchanged.
pub fn snapshot_physical_expr_for_partition(
    expr: Arc<dyn PhysicalExpr>,
    partition: usize,
) -> Result<Arc<dyn PhysicalExpr>> {
This is an interesting way to go about it. It does kind of feel to me like we are trying to shoehorn the concept of partitions into PhysicalExpr through the snapshot() API which was never intended for this use case. I feel there is probably a much cleaner way to introduce the concept of a partition to PhysicalExpr
This is an interesting way to go about it. It does kind of feel to me like we are trying to shoehorn the concept of partitions into PhysicalExpr through the snapshot() API, which was never intended for this use case. I feel there is probably a much cleaner way to introduce the concept of a partition to PhysicalExpr.

I agree that overloading snapshot() is a bit odd. Here are two ways this can be improved upon:
1. Transmit Partition Info in the Data:
This is the suggestion @adriangb provided. I think it is an interesting approach but has large implications: injecting a partition id or something similar would mean changing projections, filter handling, etc. With the objective of this PR being to enable dynamic filtering for file-preserved partitioning, this seems like too large of a change.
2. Runtime Partition Bind:
Introduce a runtime wrapper, PartitionBoundDynamicFilterPhysicalExpr. The parquet opener binds the dynamic filters to partition i by inserting this new wrapper rather than a static snapshot. Then, this wrapper is resolved at evaluation time. This largely avoids shoehorning partitioning into PhysicalExpr through snapshot().
With this said, I am aware this approach is not the cleanest solution for this issue, but it does set us up for follow-up work. This can become a generic runtime bind API that hooks onto PhysicalExpr; by default it will be a no-op. Then for DynamicFilterPhysicalExpr we will implement the hook and eliminate our wrapper. This will provide support for any future runtime-dependent expressions without the ad-hoc plumbing we did before, and with no schema or data changes.
I decided to leave the full generic solution out of this PR to keep these changes as local as I can, but tried to lay a path forward that is maintainable. Let me know thoughts on this approach.
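A rough, hypothetical sketch of option 2 (the names are invented here, and the real design would live behind DataFusion's `PhysicalExpr` trait): the wrapper carries the bound partition index and resolves the filter lazily at evaluation time, so bounds published after binding are still seen.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical sketch of a partition-bound wrapper: instead of a static
// snapshot, the opener wraps the dynamic filter together with the partition
// it was opened for, and resolution happens lazily at evaluate() time.
struct DynamicFilter {
    // Per-partition (lo, hi) bounds, filled in once the build side finishes.
    partitioned_bounds: Mutex<Vec<Option<(i64, i64)>>>,
}

struct PartitionBound {
    inner: Arc<DynamicFilter>,
    partition: usize, // bound at open() time
}

impl PartitionBound {
    // Resolved at evaluation time, so updates made after binding are seen.
    fn evaluate(&self, value: i64) -> bool {
        match self.inner.partitioned_bounds.lock().unwrap().get(self.partition) {
            Some(Some((lo, hi))) => *lo <= value && value <= *hi,
            _ => true, // no bounds yet: keep everything (lit(true) behavior)
        }
    }
}

fn main() {
    let shared = Arc::new(DynamicFilter {
        partitioned_bounds: Mutex::new(vec![None, None]),
    });
    let bound = PartitionBound { inner: Arc::clone(&shared), partition: 1 };
    println!("{}", bound.evaluate(42)); // true: no bounds published yet
    shared.partitioned_bounds.lock().unwrap()[1] = Some((10, 19));
    println!("{}", bound.evaluate(42)); // false: lazily sees the new bounds
}
```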
I'm not familiar with this code, so I'm ramping up on it; you can expect me to take some time until I start contributing more useful comments.
For now, the big comment I left is mostly intended to trigger some discussion rather than actually jumping into code.
One way of getting your PRs in faster is to slice them into smaller chunks, for example maybe shipping the failing tests for the issue this is intended to solve first. I have to say that, even if that's generally good advice, it might not be good for some PRs; it might not be for this one, for example.
/// Note for partitioned hash join dynamic filtering:
/// preserving file partitions can allow partition-index routing (`i -> i`) instead of
/// CASE-hash routing, but this assumes build/probe partition indices stay aligned for
/// partition hash join / dynamic filter consumers.
///
This assumption is not specific to dynamic filters. If partition indices are not aligned, data returned by the join would be wrong whether dynamic filters are there or not.
As I don't think this advice is specific to dynamic filters, I'd try to keep this doc comment more minimal. Note that this is supposed to be rendered not only as docs, but also as part of a SHOW ALL query, which might not render nicely, so ASCII graphics are probably not the most render-friendly thing for this specific case.
// - File-grouped: partition 0 = all rows where column="A" (value-based)
// - Hash-repartitioned: partition 0 = all rows where hash(column) % N == 0 (hash-based)
// These are incompatible, so the join will miss matching rows.
plan = if let Some(hash_join) = plan.as_any().downcast_ref::<HashJoinExec>()
Pretty smart and clean way of propagating whether data is getting repartitioned across steps 👍
!matches!(
    repartition.partitioning(),
    Partitioning::UnknownPartitioning(_)
)
} else if child_context.plan.children().is_empty() {
It might make sense to not use a matches! macro here and just manually match each possibility. The reason is that if, in the future, a new enum variant is added to Partitioning, the compiler will force the dev to choose the right value here manually, forcing them to think about it.
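The suggestion can be illustrated with a simplified stand-in for the `Partitioning` enum (the real enum lives in datafusion-physical-expr and differs from this sketch): with an explicit match, adding a new variant later becomes a compile error at this site.

```rust
// Simplified stand-in for DataFusion's Partitioning enum.
enum Partitioning {
    RoundRobinBatch(usize),
    Hash(Vec<String>, usize),
    UnknownPartitioning(usize),
}

// Instead of `!matches!(p, Partitioning::UnknownPartitioning(_))`, match
// every variant explicitly: a new variant added to the enum later will fail
// to compile here, forcing a deliberate choice instead of a silent default.
fn is_known_repartitioning(p: &Partitioning) -> bool {
    match p {
        Partitioning::RoundRobinBatch(_) => true,
        Partitioning::Hash(_, _) => true,
        Partitioning::UnknownPartitioning(_) => false,
    }
}

fn main() {
    println!("{}", is_known_repartitioning(&Partitioning::Hash(vec![], 4)));
}
```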
/// Per-partition filter expressions for partition-index routing.
/// When both sides of a hash join preserve their file partitioning (no RepartitionExec(Hash)),
/// build-partition i corresponds to probe-partition i. This allows storing per-partition
/// filters so that each partition only sees its own bounds, giving tighter filtering.
partitioned_exprs: PartitionedFilters,
}
I wonder if there is a better alternative than leaking the concept of plan partitioning into dynamic filters.
I do agree with @adriangb about leaking the concept of partitioning into expressions. I'd say:
- If we want expressions to be aware of partitions, let's do a proper API design that is as clean and future-proof as possible, and use it here.
- If we want expressions to be agnostic of partitions, let's try to find a way to implement this so that DynamicFilterExpressions are still agnostic of partitions.
It would be nice to avoid middle-ground solutions that exist just for the sake of shipping PRs faster.
Some ideas that come to mind for keeping dynamic filters agnostic of plan partitioning in their structs:
Introduce a new __partition_index REE column or something similar
In the record batch, right before calling evaluate() on the dynamic filter expression, we could introduce a new column with the partition index. I think you already explored this without success, but I would like to understand better what blockers specifically you faced; I do see this approach yielding a potentially good result.
Use Arrow's Schema metadata for threading the partition index
Arrow's schemas have the concept of metadata that can be used for threading arbitrary information. I think this should also be considered "leaking partitioning details", but I wonder if this brings an opportunity to make it cleaner?
Letting the hash join hold a Vec instead of a single HashJoinExecDynamicFilter
That way, there are as many HashJoinExecDynamicFilters as partitions, and they behave exactly the same as before, but as if there was only 1 partition.
In the record batch, right before calling evaluate() on the dynamic filter expression, we could introduce a new column with the partition index. I think you already explored this without success, but I would like to understand better what blockers specifically you faced; I do see this approach yielding a potentially good result.
I also was thinking this approach should work.
Just throwing darts at the wall in terms of helping w/ implementation: you could create a marker expression, e.g. impl PhysicalExpr for PartitionLiteral, and then modify the machinery that replaces Hive partition values in scans with the actual literals to also replace that marker with the DataFusion partition integer.
datafusion/datafusion/physical-expr-adapter/src/schema_rewriter.rs
Lines 46 to 83 in e937cad
datafusion/datafusion/datasource-parquet/src/opener.rs
Lines 217 to 261 in e937cad
You'd have to contend with "what if this expression is evaluated without replacement". I'm not sure when / why that would happen, or if it's okay that the expression only works in the context of a scan. Maybe just returning null by default is reasonable behavior.
Is the generic runtime binding for physical expressions not a good path forward? I have changed this some since @adriangb's last review and think it will provide a clean API once the follow-up work is done.
It will eliminate the wrapper and create an API over any physical expression to have runtime properties. This feels like a good approach since it's easy to comprehend, requires no changes to the schema or injecting columns, and introduces what I think will be a useful property of physical expressions as cost-based and adaptive query execution is adopted.
Cc: @gabotechs
I don't see the approach leaking the idea of partitioning into dynamic filters; rather, its goal will be to make physical expressions aware of partitioning, which I think is feasible.
Maybe I am misunderstanding what is good practice in exposing properties to parts of the plan, but it seems like a good approach to have expressions aware of runtime features.
Here is a fleshed-out PR that is my idea for the generic runtime binder: #20549
This may make my vision more clear 👍
cc: @adriangb @gabotechs
Would you be fine with investigating how other engines handle this? Looking at successful implementations of this in other engines might produce good arguments for a path forward.
Which issue does this PR close?
Closes #20195
Rationale for this change
Dynamic filter pushdown was completely disabled when preserve_file_partitions is on, due to a correctness bug.
The Problem
When preserve_file_partitions is enabled, DataFusion treats file groups as pre-partitioned data. The existing dynamic filtering used hash-based routing, which is incompatible with the value-based partitioning that file groups are kept in:
Example:
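(The original example block did not survive here.) As a hedged illustration of the mismatch, the following toy program contrasts value-based file grouping with hash-based placement; the partition-assignment functions are invented for this demo:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Value-based (file-grouped) partitioning: the partition is determined by
// the column value itself, e.g. all "A" rows live in partition 0.
fn file_group_partition(value: &str) -> usize {
    match value {
        "A" => 0,
        "B" => 1,
        _ => 2,
    }
}

// Hash-based partitioning: the partition is hash(value) % N, the scheme
// used by RepartitionExec(Hash).
fn hash_partition(value: &str, n: usize) -> usize {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    (h.finish() as usize) % n
}

fn main() {
    // The two schemes generally disagree, so build-partition i and
    // probe-partition i no longer hold the same key values.
    for v in ["A", "B", "C"] {
        println!(
            "{v}: file-group={} hash={}",
            file_group_partition(v),
            hash_partition(v, 3)
        );
    }
}
```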
For this reason it was disabled; this PR re-enables it via PartitionIndex routing for dynamic filters.
What changes are included in this PR?
Partition-Indexed dynamic filtering
New routing mode that uses direct partition-to-partition mapping:
Example:
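(The original example block did not survive here.) A hypothetical sketch of the i -> i mapping, with invented function names: probe partition i simply reads the filter produced by build partition i, with no hashing involved.

```rust
// Partition-index routing: build partition i produces a filter that probe
// partition i consumes directly. Illustrative only; the real per-partition
// filters are physical expressions, not strings.
fn route_partition_index(filters: &[&str], probe_partition: usize) -> Option<String> {
    filters.get(probe_partition).map(|f| f.to_string())
}

fn main() {
    let per_partition_filters = ["a BETWEEN 1 AND 3", "a BETWEEN 7 AND 9"];
    // Probe partition 1 sees only the bounds its matching build partition produced.
    println!("{:?}", route_partition_index(&per_partition_filters, 1));
}
```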
Alignment Detection
Detects compatible partitioning to enable safe optimization:
match (left.repartitioned, right.repartitioned) {
In the case there is a RepartitionExec in the path leading from the DataSourceExec to either the build or probe side of a Partitioned Hash Join, this falls back to CaseHash.
The reason is that RepartitionExec uses hash(value) % N to distribute rows, breaking the value-based partition alignment. When hash-partitioned, partition 0 no longer contains 'A' exclusively, breaking the partition-index assumptions. With hash partitioning, use:
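A hedged sketch of the CaseHash-style fallback (simplified; in DataFusion the routing is built as a CASE expression over per-partition filters, not a plain function): the filter for a row is selected by hash(key) % N, matching how RepartitionExec(Hash) placed rows, so it stays correct even when partition indices are not value-aligned.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// CaseHash-style routing (simplified): choose the filter for a probe row by
// hashing its join key, i.e. CASE hash(key) % N WHEN 0 THEN f0 ... END.
// This matches how RepartitionExec(Hash) assigned rows to partitions.
fn case_hash_route<'a>(filters: &'a [&'a str], key: &str) -> &'a str {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    filters[(h.finish() as usize) % filters.len()]
}

fn main() {
    let filters = ["a BETWEEN 1 AND 3", "a BETWEEN 7 AND 9"];
    // The bucket depends on the hash of the key, not on any partition index.
    println!("{}", case_hash_route(&filters, "A"));
}
```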
Are these changes tested?
sqllogictests: test_files/preserve_file_partitioning.slt
Integration tests: datafusion/core/tests/physical_optimizer/filter_pushdown.rs
Unit tests: in affected files
Are there any user-facing changes?
Yes: a new error message can appear if partitioned hash joins are not aligned properly, and the dynamic filtering display for partition-index routing is a bit different than for CASE routing.
cc: @NGA-TRAN @LiaCastaneda @adriangb @gabotechs