refactor: make file-statistics cache keys schema-aware #23201

Open
Phoenix500526 wants to merge 5 commits into apache:main from Phoenix500526:issue/23072
@Phoenix500526

Contributor

Which issue does this PR close?

Rationale for this change

File statistics are computed against a specific file_schema (their
column_statistics are positional, one per column), but the file-statistics
cache was keyed only by table and path. Reading the same path under a different
schema could therefore reuse statistics whose columns no longer line up,
causing a panic during statistics projection.
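
The mismatch can be illustrated with a minimal sketch. It does not use DataFusion's types: `FileStats`, `compute_stats`, and both lookup helpers are hypothetical stand-ins, and a schema is reduced to a list of column names. It only shows that a path-only key hands a wider read the statistics computed for a narrower one, while a key that also carries the schema does not.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for file statistics: one entry per column, in schema order.
#[derive(Clone, Debug, PartialEq)]
struct FileStats {
    /// Positional per-column null counts.
    null_counts: Vec<usize>,
}

/// Hypothetical statistics computation: one positional entry per schema column.
fn compute_stats(schema: &[&str]) -> FileStats {
    FileStats { null_counts: vec![0; schema.len()] }
}

/// Lookup in a cache keyed by path only (the schema is not part of the key).
fn stats_by_path(
    cache: &mut HashMap<String, FileStats>,
    path: &str,
    schema: &[&str],
) -> FileStats {
    cache
        .entry(path.to_string())
        .or_insert_with(|| compute_stats(schema))
        .clone()
}

/// Lookup in a cache keyed by path and the ordered column names.
fn stats_by_path_and_schema(
    cache: &mut HashMap<(String, Vec<String>), FileStats>,
    path: &str,
    schema: &[&str],
) -> FileStats {
    let columns: Vec<String> = schema.iter().map(|c| c.to_string()).collect();
    cache
        .entry((path.to_string(), columns))
        .or_insert_with(|| compute_stats(schema))
        .clone()
}

fn main() {
    let narrow = ["a", "b"];
    let wide = ["a", "b", "c"];

    // Path-only key: the wide read reuses the narrow entry.
    let mut by_path = HashMap::new();
    stats_by_path(&mut by_path, "data.parquet", &narrow);
    let stale = stats_by_path(&mut by_path, "data.parquet", &wide);
    assert_eq!(stale.null_counts.len(), 2); // 2 statistics for 3 columns

    // Schema-aware key: each schema gets its own entry.
    let mut by_schema = HashMap::new();
    stats_by_path_and_schema(&mut by_schema, "data.parquet", &narrow);
    let fresh = stats_by_path_and_schema(&mut by_schema, "data.parquet", &wide);
    assert_eq!(fresh.null_counts.len(), 3);
    assert_eq!(by_schema.len(), 2);
}
```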

#22950 worked around this by bypassing the file-statistics cache entirely for
anonymous explicit-schema reads — correct, but it gave up cache reuse for them
(every such read recomputes statistics). #23072 asks to make the cache itself
schema-aware so those reads can reuse the cache safely instead of skipping it.

What changes are included in this PR?

  • Add SchemaFingerprint — the per-column (name, data_type, nullable) of a
    file_schema, in order — and FileStatisticsCacheKey { table, path, schema },
    and key the file-statistics cache on it (FileStatisticsCache is now
    dyn Cache<FileStatisticsCacheKey, CachedFileMetadata>).
  • ListingTable::do_collect_statistics_and_ordering builds the key with the
    file_schema fingerprint and uses the shared cache directly. The #22950
    bypass (statistics_cache helper / schema_source-based skip) is removed:
    different schemas now land in distinct entries (no stale cross-schema reuse),
    while a repeated read of the same schema reuses its entry.
  • The fingerprint deliberately excludes field/schema metadata (it cannot
    affect statistics, and including it would needlessly fragment the cache) and
    partition columns (partition statistics are computed separately, outside this
    cache).
  • Table-drop invalidation is unchanged: drop_table_entries matches on
    CacheKey::table_ref(), which still returns the table, so all schema variants
    for a dropped table are removed together.
  • The list-files cache continues to key on TableScopedPath.
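
A rough sketch of the keying and invalidation behaviour described in the list above, under simplifying assumptions: the struct names mirror the PR, but the field types, the `StatsCache` wrapper, the `key` helper, and the use of `Option<String>` for the table (with `None` modelling an anonymous read) are hypothetical and not the PR's actual definitions. The cached value is reduced to a column count.

```rust
use std::collections::HashMap;

/// One column as (name, data_type, nullable); `data_type` is a plain string
/// here rather than a real Arrow type.
type Column = (String, String, bool);

/// Ordered per-column description of a file schema (hypothetical layout).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct SchemaFingerprint(Vec<Column>);

/// Cache key that includes the schema the statistics were computed against.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct FileStatisticsCacheKey {
    /// `None` models an anonymous read (an assumption of this sketch).
    table: Option<String>,
    path: String,
    schema: SchemaFingerprint,
}

/// Toy cache; the value is just the number of column statistics.
#[derive(Default)]
struct StatsCache {
    entries: HashMap<FileStatisticsCacheKey, usize>,
}

impl StatsCache {
    /// Returns the cached value and whether the lookup was a hit.
    fn get_or_compute(&mut self, key: FileStatisticsCacheKey) -> (usize, bool) {
        if let Some(v) = self.entries.get(&key) {
            return (*v, true);
        }
        let computed = key.schema.0.len();
        self.entries.insert(key, computed);
        (computed, false)
    }

    /// Removes every schema variant cached for `table`.
    fn drop_table_entries(&mut self, table: &str) {
        self.entries.retain(|k, _| k.table.as_deref() != Some(table));
    }
}

/// Hypothetical convenience constructor for the key.
fn key(table: Option<&str>, path: &str, cols: &[(&str, &str, bool)]) -> FileStatisticsCacheKey {
    FileStatisticsCacheKey {
        table: table.map(str::to_string),
        path: path.to_string(),
        schema: SchemaFingerprint(
            cols.iter()
                .map(|(n, t, nullable)| (n.to_string(), t.to_string(), *nullable))
                .collect(),
        ),
    }
}

fn main() {
    let narrow = [("a", "Int64", true)];
    let wide = [("a", "Int64", true), ("b", "Utf8", true)];
    let mut cache = StatsCache::default();

    // Different schemas for the same table and path land in distinct entries.
    assert_eq!(cache.get_or_compute(key(Some("t"), "f.parquet", &narrow)), (1, false));
    assert_eq!(cache.get_or_compute(key(Some("t"), "f.parquet", &wide)), (2, false));
    // A repeated read under the same schema is a hit and adds no entry.
    assert_eq!(cache.get_or_compute(key(Some("t"), "f.parquet", &wide)), (2, true));
    assert_eq!(cache.entries.len(), 2);

    // Dropping the table removes all of its schema variants together.
    cache.drop_table_entries("t");
    assert!(cache.entries.is_empty());
}
```

Matching on the table alone in `drop_table_entries` is what keeps invalidation independent of how many schema variants a table has accumulated.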

Are these changes tested?

Yes.

  • Updated the #22950 regression test
    (anonymous_parquet_stats_cache_with_explicit_wider_schema): the wider
    explicit-schema read now lands in its own cache entry (2 entries, was 1 under
    the bypass) with correct statistics and no panic, and a repeated read of that
    schema is served from the cache (a cache hit, no new entry).
  • Added unit tests for SchemaFingerprint: it distinguishes nullability and
    field order, and ignores field/schema metadata.
  • cargo test for the file_statistics integration module and the
    datafusion-execution cache tests (including drop_table_entries) pass, along
    with cargo fmt --all and cargo clippy --all-targets --all-features -- -D warnings for the touched crates.
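
The fingerprint properties exercised by the unit tests can be sketched roughly as follows. This is not the PR's test code: `Field`, `field`, and `fingerprint` are hypothetical stand-ins for the Arrow types and the `SchemaFingerprint` constructor, with the data type held as a plain string.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an Arrow field.
#[derive(Clone, Debug)]
struct Field {
    name: String,
    data_type: String,
    nullable: bool,
    /// Field metadata; deliberately left out of the fingerprint.
    metadata: BTreeMap<String, String>,
}

/// Hypothetical helper that builds a field with empty metadata.
fn field(name: &str, data_type: &str, nullable: bool) -> Field {
    Field {
        name: name.to_string(),
        data_type: data_type.to_string(),
        nullable,
        metadata: BTreeMap::new(),
    }
}

/// Ordered (name, data_type, nullable) per column; metadata is dropped.
fn fingerprint(fields: &[Field]) -> Vec<(String, String, bool)> {
    fields
        .iter()
        .map(|f| (f.name.clone(), f.data_type.clone(), f.nullable))
        .collect()
}

fn main() {
    let a = field("a", "Int64", true);
    let b = field("b", "Utf8", false);
    let base = fingerprint(&[a.clone(), b.clone()]);

    // Nullability is part of the fingerprint.
    let mut a_non_null = a.clone();
    a_non_null.nullable = false;
    assert_ne!(base, fingerprint(&[a_non_null, b.clone()]));

    // Field order is part of the fingerprint.
    assert_ne!(base, fingerprint(&[b.clone(), a.clone()]));

    // Field metadata is ignored.
    let mut a_meta = a.clone();
    a_meta.metadata.insert("origin".to_string(), "test".to_string());
    assert_eq!(base, fingerprint(&[a_meta, b]));
}
```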

Are there any user-facing changes?

No change to query results, physical plans, or the serialized (proto) wire
format; file statistics are computed exactly as before.

One public API change (please add the api change label): the
FileStatisticsCache type alias now uses FileStatisticsCacheKey instead of
TableScopedPath as its key. Code that constructed keys for this cache directly
must switch to FileStatisticsCacheKey. SchemaFingerprint and
FileStatisticsCacheKey are newly public; TableScopedPath remains (still used
by the list-files cache). cargo-semver-checks will flag the key-type change,
which is expected.

@github-actions (bot) added the core, catalog, and execution labels on Jun 26, 2026

@mkleen mkleen left a comment

Contributor

Thank you for working on this. I left a few comments.

Comment thread datafusion/execution/src/cache/mod.rs
Comment thread datafusion/execution/src/cache/mod.rs
@adriangb

Contributor

I am going to run the wide_schema benchmarks here. I am afraid that any change touching schemas is liable to introduce O(num_columns^X) operations.

@adriangb

Contributor

run bechmark wide_schema

env:
  DATAFUSION_RUNTIME_FILE_STATISTICS_CACHE_LIMIT: 0

@adriangb

Contributor

run bechmark wide_schema

@adriangb

Contributor

run bechmark wide_schema

baseline:
  env:
    DATAFUSION_RUNTIME_FILE_STATISTICS_CACHE_LIMIT: 0

@mkleen

mkleen commented Jun 26, 2026

Contributor

run bechmark wide_schema

baseline:
  env:
    DATAFUSION_RUNTIME_FILE_STATISTICS_CACHE_LIMIT: 0

This is cool, I did not know this.

@adriangb

Contributor

run benchmark wide_schema

env:
  DATAFUSION_RUNTIME_FILE_STATISTICS_CACHE_LIMIT: 0

@adriangb

Contributor

run benchmark wide_schema

baseline:
  env:
    DATAFUSION_RUNTIME_FILE_STATISTICS_CACHE_LIMIT: 0

@adriangb

Contributor

run benchmark wide_schema

@adriangb

Contributor

run bechmark wide_schema

baseline:
  env:
    DATAFUSION_RUNTIME_FILE_STATISTICS_CACHE_LIMIT: 0

This is cool, I did not know this.

Yep, the idea here is to run the baseline without the cache and this branch with the cache. It is orthogonal to this PR, but I want to see what it looks like.

Unfortunately I had a typo in "benchmark"; sorry for the noise.

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4808656966-702-b4kn8 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing issue/23072 (a13e269) to ff677c4 (merge-base) diff using: wide_schema
Results will be posted here when complete


File an issue against this benchmark runner

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4808667999-703-92qlv 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

Comparing issue/23072 (a13e269) to ff677c4 (merge-base) diff using: wide_schema
Results will be posted here when complete



@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4808669794-704-whlqh 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

Comparing issue/23072 (a13e269) to ff677c4 (merge-base) diff using: wide_schema
Results will be posted here when complete



@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                     HEAD                                   issue_23072
-----                     ----                                   -----------
wide_schema/Q01_narrow    1.00     79.8±0.27ms        ? ?/sec    1.00     79.8±0.18ms        ? ?/sec
wide_schema/Q01_wide      1.00   1025.8±5.54ms        ? ?/sec    1.06   1087.0±3.82ms        ? ?/sec
wide_schema/Q02_narrow    1.00      5.9±0.09ms        ? ?/sec    1.05      6.2±0.05ms        ? ?/sec
wide_schema/Q02_wide      1.00    899.4±3.46ms        ? ?/sec    1.08    975.5±2.01ms        ? ?/sec
wide_schema/Q03_narrow    1.00     14.8±0.23ms        ? ?/sec    1.02     15.0±0.24ms        ? ?/sec
wide_schema/Q03_wide      1.00    912.9±6.88ms        ? ?/sec    1.07    977.0±3.51ms        ? ?/sec
wide_schema/Q04_narrow    1.00     37.2±0.24ms        ? ?/sec    1.01     37.6±0.20ms        ? ?/sec
wide_schema/Q04_wide      1.00    990.8±6.47ms        ? ?/sec    1.08   1068.2±6.72ms        ? ?/sec

Resource Usage

wide_schema — base (merge-base)

Metric Value
Wall time 980.2s
Peak memory 1.2 GiB
Avg memory 98.4 MiB
CPU user 399.6s
CPU sys 58.7s
Peak spill 0 B

wide_schema — branch

Metric Value
Wall time 975.2s
Peak memory 1.2 GiB
Avg memory 108.3 MiB
CPU user 386.8s
CPU sys 50.4s
Peak spill 0 B


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                     HEAD                                   issue_23072
-----                     ----                                   -----------
wide_schema/Q01_narrow    1.00     80.3±0.44ms        ? ?/sec    1.00     80.5±0.49ms        ? ?/sec
wide_schema/Q01_wide      1.00   1016.6±5.73ms        ? ?/sec    1.07   1085.9±8.10ms        ? ?/sec
wide_schema/Q02_narrow    1.00      5.8±0.05ms        ? ?/sec    1.05      6.1±0.06ms        ? ?/sec
wide_schema/Q02_wide      1.00    893.2±5.93ms        ? ?/sec    1.10    984.1±7.66ms        ? ?/sec
wide_schema/Q03_narrow    1.00     14.5±0.26ms        ? ?/sec    1.02     14.8±0.23ms        ? ?/sec
wide_schema/Q03_wide      1.00    900.3±5.99ms        ? ?/sec    1.11   997.2±11.39ms        ? ?/sec
wide_schema/Q04_narrow    1.00     37.0±0.20ms        ? ?/sec    1.02     37.8±0.23ms        ? ?/sec
wide_schema/Q04_wide      1.00    983.9±3.82ms        ? ?/sec    1.08   1060.0±7.98ms        ? ?/sec

Resource Usage

wide_schema — base (merge-base)

Metric Value
Wall time 980.2s
Peak memory 1.1 GiB
Avg memory 107.6 MiB
CPU user 398.4s
CPU sys 58.8s
Peak spill 0 B

wide_schema — branch

Metric Value
Wall time 980.2s
Peak memory 1.2 GiB
Avg memory 104.2 MiB
CPU user 383.6s
CPU sys 51.3s
Peak spill 0 B


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                     HEAD                                   issue_23072
-----                     ----                                   -----------
wide_schema/Q01_narrow    1.03     84.0±1.81ms        ? ?/sec    1.00     81.3±1.46ms        ? ?/sec
wide_schema/Q01_wide      1.00  1041.9±15.17ms        ? ?/sec    1.07  1112.7±14.96ms        ? ?/sec
wide_schema/Q02_narrow    1.00      6.2±0.11ms        ? ?/sec    1.03      6.4±0.14ms        ? ?/sec
wide_schema/Q02_wide      1.00   921.0±21.09ms        ? ?/sec    1.09  1007.2±20.47ms        ? ?/sec
wide_schema/Q03_narrow    1.00     14.8±0.35ms        ? ?/sec    1.05     15.6±0.17ms        ? ?/sec
wide_schema/Q03_wide      1.00   938.0±23.35ms        ? ?/sec    1.17  1097.5±26.47ms        ? ?/sec
wide_schema/Q04_narrow    1.00     37.8±0.62ms        ? ?/sec    1.04     39.1±0.90ms        ? ?/sec
wide_schema/Q04_wide      1.00    994.7±8.18ms        ? ?/sec    1.09  1080.9±23.55ms        ? ?/sec

Resource Usage

wide_schema — base (merge-base)

Metric Value
Wall time 970.2s
Peak memory 1.1 GiB
Avg memory 97.1 MiB
CPU user 381.4s
CPU sys 52.1s
Peak spill 0 B

wide_schema — branch

Metric Value
Wall time 995.2s
Peak memory 1.1 GiB
Avg memory 109.4 MiB
CPU user 384.8s
CPU sys 51.6s
Peak spill 0 B


@github-actions (bot) added the common label on Jun 26, 2026

@mkleen

mkleen commented Jun 26, 2026

Contributor

Unfortunately we have regressions in the benchmarks:

group                     HEAD                                   issue_23072
-----                     ----                                   -----------
wide_schema/Q01_narrow    1.00     80.3±0.44ms        ? ?/sec    1.00     80.5±0.49ms        ? ?/sec
wide_schema/Q01_wide      1.00   1016.6±5.73ms        ? ?/sec    1.07   1085.9±8.10ms        ? ?/sec
wide_schema/Q02_narrow    1.00      5.8±0.05ms        ? ?/sec    1.05      6.1±0.06ms        ? ?/sec
wide_schema/Q02_wide      1.00    893.2±5.93ms        ? ?/sec    1.10    984.1±7.66ms        ? ?/sec
wide_schema/Q03_narrow    1.00     14.5±0.26ms        ? ?/sec    1.02     14.8±0.23ms        ? ?/sec
wide_schema/Q03_wide      1.00    900.3±5.99ms        ? ?/sec    1.11   997.2±11.39ms        ? ?/sec
wide_schema/Q04_narrow    1.00     37.0±0.20ms        ? ?/sec    1.02     37.8±0.23ms        ? ?/sec
wide_schema/Q04_wide      1.00    983.9±3.82ms        ? ?/sec    1.08   1060.0±7.98ms        ? ?/sec

@mkleen

mkleen commented Jun 26, 2026

Contributor

run benchmark wide_schema

baseline:
  env:
    DATAFUSION_RUNTIME_FILE_STATISTICS_CACHE_LIMIT: 0

@mkleen

mkleen commented Jun 26, 2026

Contributor

run benchmark wide_schema

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4810951539-710-d6x86 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

Comparing issue/23072 (9618f88) to ff677c4 (merge-base) diff using: wide_schema
Results will be posted here when complete



@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4810954479-711-k7mss 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

Comparing issue/23072 (9618f88) to ff677c4 (merge-base) diff using: wide_schema
Results will be posted here when complete


File an issue against this benchmark runner

@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                     HEAD                                   issue_23072
-----                     ----                                   -----------
wide_schema/Q01_narrow    1.00     79.8±0.36ms        ? ?/sec    1.02     81.4±0.53ms        ? ?/sec
wide_schema/Q01_wide      1.00   1017.6±3.63ms        ? ?/sec    1.02   1042.3±4.84ms        ? ?/sec
wide_schema/Q02_narrow    1.00      6.2±0.10ms        ? ?/sec    1.00      6.2±0.08ms        ? ?/sec
wide_schema/Q02_wide      1.00    901.5±3.36ms        ? ?/sec    1.02    916.6±6.25ms        ? ?/sec
wide_schema/Q03_narrow    1.00     15.5±0.24ms        ? ?/sec    1.00     15.5±0.26ms        ? ?/sec
wide_schema/Q03_wide      1.00    912.1±5.79ms        ? ?/sec    1.02    933.7±4.82ms        ? ?/sec
wide_schema/Q04_narrow    1.00     37.1±0.12ms        ? ?/sec    1.00     37.1±0.13ms        ? ?/sec
wide_schema/Q04_wide      1.00   1008.7±5.74ms        ? ?/sec    1.02   1029.2±9.65ms        ? ?/sec

Resource Usage

wide_schema — base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 640.1s |
| Peak memory | 1.2 GiB |
| Avg memory | 152.6 MiB |
| CPU user | 382.9s |
| CPU sys | 52.9s |
| Peak spill | 0 B |

wide_schema — branch

| Metric | Value |
| --- | --- |
| Wall time | 975.2s |
| Peak memory | 1.1 GiB |
| Avg memory | 104.0 MiB |
| CPU user | 381.7s |
| CPU sys | 52.0s |
| Peak spill | 0 B |

@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                     HEAD                                   issue_23072
-----                     ----                                   -----------
wide_schema/Q01_narrow    1.01     81.3±0.44ms        ? ?/sec    1.00     80.1±0.38ms        ? ?/sec
wide_schema/Q01_wide      1.00   1033.7±7.03ms        ? ?/sec    1.02   1054.8±6.46ms        ? ?/sec
wide_schema/Q02_narrow    1.01      6.0±0.07ms        ? ?/sec    1.00      6.0±0.08ms        ? ?/sec
wide_schema/Q02_wide      1.00    911.3±4.97ms        ? ?/sec    1.02    933.8±4.85ms        ? ?/sec
wide_schema/Q03_narrow    1.00     15.0±0.30ms        ? ?/sec    1.00     15.1±0.09ms        ? ?/sec
wide_schema/Q03_wide      1.00    923.9±5.68ms        ? ?/sec    1.02    942.4±5.85ms        ? ?/sec
wide_schema/Q04_narrow    1.01     38.0±0.23ms        ? ?/sec    1.00     37.5±0.28ms        ? ?/sec
wide_schema/Q04_wide      1.00   1004.1±5.02ms        ? ?/sec    1.01   1018.3±6.65ms        ? ?/sec

Resource Usage

wide_schema — base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 645.1s |
| Peak memory | 1.1 GiB |
| Avg memory | 147.8 MiB |
| CPU user | 379.2s |
| CPU sys | 53.5s |
| Peak spill | 0 B |

wide_schema — branch

| Metric | Value |
| --- | --- |
| Wall time | 960.2s |
| Peak memory | 1.1 GiB |
| Avg memory | 100.0 MiB |
| CPU user | 384.2s |
| CPU sys | 52.7s |
| Peak spill | 0 B |

File statistics are computed against a specific `file_schema`, but the
file-statistics cache was keyed only by table and path. Reading the same path
under a different schema could reuse statistics whose `column_statistics` no
longer line up, panicking during statistics projection. apache#22950 worked around
this by bypassing the cache entirely for anonymous explicit-schema reads, at
the cost of losing cache reuse for them.

Introduce a `SchemaFingerprint` (per-column name, type and nullability, derived
from `file_schema`) and a `FileStatisticsCacheKey { table, path, schema }`, and
key the file-statistics cache on it. Different schemas now get distinct entries
(no stale cross-schema reuse) while a repeated read of the same schema reuses
its entry, so the apache#22950 bypass is removed and anonymous explicit-schema reads
cache safely again.
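
A minimal sketch of the keying idea described above, not the PR's actual code. `ColumnType` is a hypothetical stand-in for Arrow's `DataType`, `table` is simplified to `Option<String>`, and the cached value is a toy vector of per-column null counts. It illustrates how the same path read under two schemas lands in two distinct entries.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for Arrow's `DataType` (assumption for this sketch).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum ColumnType {
    Int64,
    Utf8,
}

/// Per-column (name, type, nullable), in schema order.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct SchemaFingerprint(Vec<(String, ColumnType, bool)>);

/// Cache key: the same path under a different schema is a different key.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct FileStatisticsCacheKey {
    table: Option<String>, // simplified; an anonymous read has no table
    path: String,
    schema: SchemaFingerprint,
}

/// Toy cached value: one null count per column, positionally.
type ColumnNullCounts = Vec<usize>;

/// Hypothetical helper that builds a key for an anonymous read.
fn key(path: &str, columns: &[(&str, ColumnType, bool)]) -> FileStatisticsCacheKey {
    FileStatisticsCacheKey {
        table: None,
        path: path.to_string(),
        schema: SchemaFingerprint(
            columns
                .iter()
                .map(|(name, ty, nullable)| (name.to_string(), ty.clone(), *nullable))
                .collect(),
        ),
    }
}

fn main() {
    let mut cache: HashMap<FileStatisticsCacheKey, ColumnNullCounts> = HashMap::new();

    let narrow = key("data/a.parquet", &[("id", ColumnType::Int64, false)]);
    let wide = key(
        "data/a.parquet",
        &[("id", ColumnType::Int64, false), ("name", ColumnType::Utf8, true)],
    );

    // Statistics computed under the one-column schema.
    cache.insert(narrow.clone(), vec![0]);

    // A read under the two-column schema misses instead of reusing
    // statistics whose columns would not line up.
    assert!(cache.get(&wide).is_none());
    // A repeated read under the same schema hits.
    assert_eq!(cache.get(&narrow), Some(&vec![0]));
}
```

Because nullability is part of the fingerprint, two schemas that differ only in a column's `nullable` flag also get separate entries in this sketch.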

- The fingerprint excludes field/schema metadata (which cannot affect
  statistics) and partition columns (whose statistics are computed separately).
- Table-drop invalidation is unchanged: `drop_table_entries` matches on
  `CacheKey::table_ref()`, which still returns the table, so all schema variants
  for a table are removed together.
- The list-files cache continues to key on `TableScopedPath`.
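
The table-drop behaviour in the list above can be sketched roughly as follows. This is an illustration under assumptions, not the real cache: `Key`, `make_key`, and the `u64` value are hypothetical, and the schema fingerprint is reduced to a list of column names. Only the names `drop_table_entries` and `table_ref` come from the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified cache key for this sketch.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Key {
    table: Option<String>,
    path: String,
    /// Stand-in for the schema fingerprint: column names only.
    schema: Vec<String>,
}

impl Key {
    /// Returns the owning table, ignoring path and schema.
    fn table_ref(&self) -> Option<&str> {
        self.table.as_deref()
    }
}

/// Hypothetical helper for building keys in the example.
fn make_key(table: Option<&str>, path: &str, columns: &[&str]) -> Key {
    Key {
        table: table.map(str::to_string),
        path: path.to_string(),
        schema: columns.iter().map(|c| c.to_string()).collect(),
    }
}

/// Removes every entry that belongs to `table`, whatever its schema.
fn drop_table_entries(cache: &mut HashMap<Key, u64>, table: &str) {
    cache.retain(|key, _| key.table_ref() != Some(table));
}

fn main() {
    let mut cache = HashMap::new();
    // Two schema variants of the same file under table "t1".
    cache.insert(make_key(Some("t1"), "a.parquet", &["id"]), 1);
    cache.insert(make_key(Some("t1"), "a.parquet", &["id", "name"]), 2);
    // An unrelated table.
    cache.insert(make_key(Some("t2"), "b.parquet", &["id"]), 3);

    drop_table_entries(&mut cache, "t1");

    // Both "t1" variants are gone; "t2" is untouched.
    assert_eq!(cache.len(), 1);
}
```

Matching on the table alone is what lets the schema component stay out of the invalidation path in this sketch.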

Closes apache#23072.
Signed-off-by: Jiawei Zhao <Phoenix500526@163.com>

…apSize

Add a `DFHeapSize` impl for 3-tuples (mirroring the existing 2-tuple one) so
`Vec<(String, DataType, bool)>` accounts for its heap automatically, letting
`SchemaFingerprint::heap_size` delegate to it instead of computing the size by
hand. Also update the `test_statistics_cache` unit test to key on
`FileStatisticsCacheKey` so it matches the real file-statistics cache.
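
A rough sketch of the delegation idea, assuming a simplified `HeapSize` trait as a stand-in for `DFHeapSize` (the real trait and its impls may differ). The data type is modelled as a `String` here purely for illustration. With a 3-tuple impl in place, the `Vec` impl covers the fingerprint's storage and the fingerprint only has to forward the call.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for `DFHeapSize`: bytes owned on the heap.
trait HeapSize {
    fn heap_size(&self) -> usize;
}

impl HeapSize for bool {
    // A bool owns no heap memory.
    fn heap_size(&self) -> usize {
        0
    }
}

impl HeapSize for String {
    // The string's heap buffer is its capacity in bytes.
    fn heap_size(&self) -> usize {
        self.capacity()
    }
}

impl<A: HeapSize, B: HeapSize> HeapSize for (A, B) {
    fn heap_size(&self) -> usize {
        self.0.heap_size() + self.1.heap_size()
    }
}

// The 3-tuple impl mirrors the 2-tuple one.
impl<A: HeapSize, B: HeapSize, C: HeapSize> HeapSize for (A, B, C) {
    fn heap_size(&self) -> usize {
        self.0.heap_size() + self.1.heap_size() + self.2.heap_size()
    }
}

impl<T: HeapSize> HeapSize for Vec<T> {
    // The vector's own buffer plus whatever each element owns.
    fn heap_size(&self) -> usize {
        let buffer = self.capacity() * size_of::<T>();
        let nested: usize = self.iter().map(|item| item.heap_size()).sum();
        buffer + nested
    }
}

/// (name, type name, nullable); the type is a `String` only in this sketch.
struct SchemaFingerprint(Vec<(String, String, bool)>);

impl HeapSize for SchemaFingerprint {
    // Delegates to the Vec impl instead of summing by hand.
    fn heap_size(&self) -> usize {
        self.0.heap_size()
    }
}

fn main() {
    let fp = SchemaFingerprint(vec![("id".to_string(), "Int64".to_string(), false)]);
    println!("fingerprint heap bytes: {}", fp.heap_size());
}
```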

Signed-off-by: Jiawei Zhao <Phoenix500526@163.com>

`do_collect_statistics_and_ordering` rebuilt the `SchemaFingerprint` for every
file, deep-cloning all column names and types: O(files × schema width) of
redundant work, since `file_schema` is constant for a table.

Compute the fingerprint once in `ListingTable::try_new` and store it as
`Arc<SchemaFingerprint>`; `FileStatisticsCacheKey.schema` now holds the `Arc`, so
building a key per file is an O(1) refcount bump instead of a deep clone. `Arc`'s
`Eq`/`Hash` compare the inner value, so cache keying remains by schema contents.
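
The `Arc` behaviour relied on here can be checked with a small sketch. The types below are simplified, hypothetical versions of the ones named in the commit (the data type is a `String`, and the key has no `table` field). Two keys built from separately allocated but equal fingerprints still compare equal and hash to the same cache entry.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Simplified fingerprint: (name, type name, nullable) per column.
#[derive(Debug, PartialEq, Eq, Hash)]
struct SchemaFingerprint(Vec<(String, String, bool)>);

/// Simplified key holding the fingerprint behind an `Arc`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct FileStatisticsCacheKey {
    path: String,
    schema: Arc<SchemaFingerprint>,
}

/// Hypothetical helper: a one-column fingerprint.
fn fingerprint() -> SchemaFingerprint {
    SchemaFingerprint(vec![("id".to_string(), "Int64".to_string(), false)])
}

/// Building a key per file only bumps the refcount of the shared fingerprint.
fn key_for(path: &str, schema: &Arc<SchemaFingerprint>) -> FileStatisticsCacheKey {
    FileStatisticsCacheKey {
        path: path.to_string(),
        schema: Arc::clone(schema),
    }
}

fn main() {
    // Computed once per table, then shared by every per-file key.
    let shared = Arc::new(fingerprint());
    let k1 = key_for("a.parquet", &shared);

    // A different allocation with the same contents, as a second table
    // reading the same path under the same schema would produce.
    let other = Arc::new(fingerprint());
    let k2 = key_for("a.parquet", &other);

    assert!(!Arc::ptr_eq(&k1.schema, &k2.schema)); // distinct allocations
    assert_eq!(k1, k2); // equal by contents

    let mut cache = HashMap::new();
    cache.insert(k1, 42u64);
    assert_eq!(cache.get(&k2), Some(&42)); // same entry via content hash
}
```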

Signed-off-by: Jiawei Zhao <Phoenix500526@163.com>
@mkleen

mkleen commented Jun 27, 2026

Contributor

run benchmark clickbench_partitioned

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4815639024-719-hqn46 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

Comparing issue/23072 (9618f88) to ff677c4 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete


@mkleen

mkleen commented Jun 27, 2026

Contributor

Since this is an API change, I think it makes sense to add an entry to the upgrade guide at https://github.com/apache/datafusion/blob/main/docs/source/library-user-guide/upgrading/55.0.0.md.

@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and issue_23072
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                       HEAD ┃                                issue_23072 ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │               1.27 / 4.21 ±5.73 / 15.66 ms │               2.09 / 5.10 ±5.82 / 16.74 ms │  1.21x slower │
│ QQuery 1  │             12.83 / 13.19 ±0.20 / 13.46 ms │             13.60 / 14.08 ±0.29 / 14.37 ms │  1.07x slower │
│ QQuery 2  │             37.39 / 38.20 ±0.62 / 39.26 ms │             37.28 / 37.65 ±0.24 / 37.96 ms │     no change │
│ QQuery 3  │             31.26 / 32.27 ±0.89 / 33.32 ms │             32.03 / 32.76 ±0.53 / 33.65 ms │     no change │
│ QQuery 4  │      2396.78 / 2446.78 ±34.33 / 2501.06 ms │      2511.52 / 2557.86 ±45.96 / 2613.87 ms │     no change │
│ QQuery 5  │     2579.67 / 2716.83 ±108.26 / 2879.29 ms │      2677.81 / 2756.10 ±51.52 / 2816.00 ms │     no change │
│ QQuery 6  │                1.27 / 1.42 ±0.23 / 1.87 ms │                2.26 / 2.38 ±0.18 / 2.74 ms │  1.68x slower │
│ QQuery 7  │             14.33 / 14.66 ±0.31 / 15.23 ms │             15.15 / 15.35 ±0.18 / 15.58 ms │     no change │
│ QQuery 8  │      2868.43 / 2918.46 ±42.03 / 2984.91 ms │      3040.29 / 3134.83 ±73.56 / 3254.73 ms │  1.07x slower │
│ QQuery 9  │         524.57 / 553.89 ±26.30 / 588.86 ms │         552.10 / 578.26 ±26.04 / 622.81 ms │     no change │
│ QQuery 10 │           83.93 / 90.82 ±11.14 / 113.00 ms │            86.24 / 92.32 ±8.48 / 108.84 ms │     no change │
│ QQuery 11 │            97.95 / 99.32 ±1.22 / 101.54 ms │           98.04 / 100.04 ±1.34 / 102.25 ms │     no change │
│ QQuery 12 │      2594.65 / 2682.75 ±75.99 / 2793.64 ms │     2683.30 / 2973.24 ±244.86 / 3365.90 ms │  1.11x slower │
│ QQuery 13 │      1757.08 / 1893.89 ±81.54 / 1986.39 ms │     1819.91 / 1980.12 ±160.08 / 2273.46 ms │     no change │
│ QQuery 14 │         735.09 / 766.41 ±27.14 / 801.26 ms │         763.91 / 814.80 ±42.49 / 870.49 ms │  1.06x slower │
│ QQuery 15 │      2740.04 / 2823.13 ±57.52 / 2916.29 ms │      2859.97 / 2943.72 ±55.42 / 3025.79 ms │     no change │
│ QQuery 16 │     7103.71 / 7252.21 ±110.77 / 7385.29 ms │     7302.71 / 7488.49 ±134.84 / 7673.51 ms │     no change │
│ QQuery 17 │     4175.88 / 4439.28 ±199.03 / 4760.40 ms │     4379.11 / 4564.36 ±172.58 / 4791.38 ms │     no change │
│ QQuery 18 │  32949.27 / 33510.29 ±449.55 / 34078.78 ms │  33627.51 / 34463.96 ±507.99 / 35168.97 ms │     no change │
│ QQuery 19 │             29.03 / 30.37 ±1.51 / 33.25 ms │             30.17 / 33.41 ±3.71 / 40.27 ms │  1.10x slower │
│ QQuery 20 │         519.38 / 530.58 ±12.84 / 552.12 ms │          519.54 / 530.94 ±9.64 / 543.39 ms │     no change │
│ QQuery 21 │          520.74 / 526.77 ±5.19 / 536.11 ms │          531.18 / 540.01 ±6.56 / 550.65 ms │     no change │
│ QQuery 22 │         996.40 / 999.70 ±2.58 / 1003.89 ms │       1014.53 / 1020.94 ±3.94 / 1025.94 ms │     no change │
│ QQuery 23 │      3129.71 / 3170.77 ±44.12 / 3254.80 ms │      3196.79 / 3245.63 ±32.98 / 3299.78 ms │     no change │
│ QQuery 24 │             41.74 / 42.28 ±0.62 / 43.46 ms │           43.57 / 59.05 ±24.64 / 108.08 ms │  1.40x slower │
│ QQuery 25 │          112.16 / 113.18 ±0.93 / 114.77 ms │          115.25 / 116.85 ±2.04 / 120.56 ms │     no change │
│ QQuery 26 │             43.05 / 47.61 ±6.14 / 59.49 ms │             44.09 / 45.34 ±1.89 / 49.09 ms │     no change │
│ QQuery 27 │          674.40 / 680.94 ±3.65 / 685.43 ms │          685.56 / 692.80 ±6.96 / 704.64 ms │     no change │
│ QQuery 28 │      3796.72 / 3870.50 ±91.72 / 4049.17 ms │     3820.93 / 4001.39 ±191.44 / 4343.94 ms │     no change │
│ QQuery 29 │           41.42 / 74.51 ±44.29 / 159.69 ms │           42.34 / 62.78 ±40.28 / 143.35 ms │ +1.19x faster │
│ QQuery 30 │         732.09 / 743.54 ±10.57 / 760.43 ms │         749.68 / 772.33 ±21.47 / 806.91 ms │     no change │
│ QQuery 31 │      1066.94 / 1085.74 ±12.65 / 1097.73 ms │       1115.07 / 1123.55 ±7.19 / 1135.75 ms │     no change │
│ QQuery 32 │  42863.32 / 42973.05 ±129.72 / 43223.45 ms │  44836.73 / 45143.94 ±220.10 / 45462.45 ms │  1.05x slower │
│ QQuery 33 │ 39609.76 / 42185.00 ±1931.20 / 45570.71 ms │ 45208.73 / 46353.21 ±1851.26 / 50039.97 ms │  1.10x slower │
│ QQuery 34 │ 42296.12 / 45147.79 ±2975.64 / 50712.18 ms │ 43949.49 / 45575.42 ±1672.03 / 48402.63 ms │     no change │
│ QQuery 35 │      1277.61 / 1334.39 ±39.40 / 1393.17 ms │      1263.36 / 1295.74 ±26.56 / 1340.65 ms │     no change │
│ QQuery 36 │          189.99 / 195.15 ±4.15 / 202.72 ms │          178.12 / 189.18 ±6.57 / 198.10 ms │     no change │
│ QQuery 37 │             40.76 / 46.15 ±3.81 / 51.19 ms │            38.57 / 47.86 ±12.04 / 70.19 ms │     no change │
│ QQuery 38 │             42.87 / 45.93 ±2.22 / 48.54 ms │             45.61 / 47.78 ±3.53 / 54.83 ms │     no change │
│ QQuery 39 │          193.66 / 211.45 ±8.98 / 217.28 ms │          184.67 / 197.79 ±7.99 / 209.55 ms │ +1.07x faster │
│ QQuery 40 │             15.71 / 18.37 ±4.80 / 27.96 ms │             15.54 / 15.67 ±0.10 / 15.82 ms │ +1.17x faster │
│ QQuery 41 │             14.72 / 15.02 ±0.40 / 15.81 ms │             15.35 / 17.17 ±3.53 / 24.23 ms │  1.14x slower │
│ QQuery 42 │             14.33 / 14.56 ±0.18 / 14.83 ms │             14.57 / 14.87 ±0.41 / 15.68 ms │     no change │
└───────────┴────────────────────────────────────────────┴────────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary          ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)          │ 206401.36ms │
│ Total Time (issue_23072)   │ 215699.09ms │
│ Average Time (HEAD)        │   4800.03ms │
│ Average Time (issue_23072) │   5016.26ms │
│ Queries Faster             │           3 │
│ Queries Slower             │          11 │
│ Queries with No Change     │          29 │
│ Queries with Failure       │           0 │
└────────────────────────────┴─────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 1035.2s |
| Peak memory | 12.1 GiB |
| Avg memory | 6.6 GiB |
| CPU user | 10526.3s |
| CPU sys | 497.5s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
| --- | --- |
| Wall time | 1080.2s |
| Peak memory | 12.5 GiB |
| Avg memory | 6.5 GiB |
| CPU user | 10925.6s |
| CPU sys | 523.7s |
| Peak spill | 0 B |

@mkleen

mkleen commented Jun 27, 2026

Contributor

run benchmark clickbench_partitioned

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4815807773-720-pdt48 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

Comparing issue/23072 (9618f88) to ff677c4 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and issue_23072
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                       HEAD ┃                                issue_23072 ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │               1.23 / 3.91 ±5.32 / 14.55 ms │               2.06 / 4.84 ±5.53 / 15.91 ms │  1.24x slower │
│ QQuery 1  │             12.46 / 12.70 ±0.20 / 13.05 ms │             13.84 / 13.97 ±0.13 / 14.21 ms │  1.10x slower │
│ QQuery 2  │             35.39 / 35.77 ±0.28 / 36.09 ms │             36.73 / 36.95 ±0.26 / 37.43 ms │     no change │
│ QQuery 3  │             30.25 / 31.28 ±1.09 / 32.98 ms │             31.15 / 31.41 ±0.19 / 31.68 ms │     no change │
│ QQuery 4  │      2321.80 / 2382.35 ±45.86 / 2443.98 ms │      2375.47 / 2416.65 ±25.83 / 2445.94 ms │     no change │
│ QQuery 5  │      2405.37 / 2510.16 ±88.33 / 2667.59 ms │      2456.23 / 2571.81 ±92.71 / 2706.94 ms │     no change │
│ QQuery 6  │                1.28 / 1.44 ±0.24 / 1.91 ms │                2.10 / 2.28 ±0.21 / 2.68 ms │  1.59x slower │
│ QQuery 7  │             13.55 / 13.73 ±0.12 / 13.89 ms │             15.04 / 15.13 ±0.07 / 15.22 ms │  1.10x slower │
│ QQuery 8  │      2898.72 / 2945.38 ±29.51 / 2984.70 ms │      2872.18 / 2924.64 ±47.76 / 2990.30 ms │     no change │
│ QQuery 9  │          477.33 / 486.63 ±6.30 / 494.91 ms │         469.56 / 505.59 ±33.43 / 564.13 ms │     no change │
│ QQuery 10 │           79.58 / 85.09 ±10.00 / 105.08 ms │             82.33 / 84.21 ±1.81 / 87.50 ms │     no change │
│ QQuery 11 │             92.91 / 93.99 ±0.97 / 95.35 ms │          94.36 / 108.86 ±24.16 / 157.09 ms │  1.16x slower │
│ QQuery 12 │      2484.33 / 2584.11 ±67.78 / 2691.02 ms │      2533.80 / 2592.58 ±48.37 / 2664.00 ms │     no change │
│ QQuery 13 │     1682.99 / 1896.59 ±145.53 / 2062.33 ms │     1810.25 / 1915.01 ±101.64 / 2106.05 ms │     no change │
│ QQuery 14 │         711.17 / 733.92 ±25.67 / 783.57 ms │         718.86 / 746.27 ±21.38 / 781.23 ms │     no change │
│ QQuery 15 │      2698.82 / 2774.11 ±72.62 / 2893.63 ms │      2638.15 / 2708.72 ±52.77 / 2771.11 ms │     no change │
│ QQuery 16 │     7005.19 / 7175.60 ±145.30 / 7431.25 ms │     6991.35 / 7154.71 ±134.41 / 7341.34 ms │     no change │
│ QQuery 17 │     4231.24 / 4362.66 ±105.91 / 4539.88 ms │     4189.45 / 4420.81 ±175.20 / 4648.30 ms │     no change │
│ QQuery 18 │  31716.39 / 32375.43 ±457.46 / 32814.30 ms │  32176.69 / 32822.00 ±354.19 / 33195.45 ms │     no change │
│ QQuery 19 │             27.93 / 29.35 ±1.80 / 32.86 ms │             29.05 / 30.86 ±1.69 / 33.72 ms │  1.05x slower │
│ QQuery 20 │         510.40 / 522.16 ±12.73 / 546.82 ms │          513.23 / 519.47 ±6.54 / 530.47 ms │     no change │
│ QQuery 21 │         506.71 / 523.01 ±10.04 / 532.79 ms │          512.52 / 522.52 ±5.09 / 526.44 ms │     no change │
│ QQuery 22 │          978.83 / 987.24 ±7.43 / 996.13 ms │          971.58 / 983.45 ±9.30 / 997.95 ms │     no change │
│ QQuery 23 │      3026.83 / 3054.41 ±16.09 / 3073.14 ms │      2999.82 / 3020.83 ±20.49 / 3045.84 ms │     no change │
│ QQuery 24 │             41.05 / 41.20 ±0.19 / 41.57 ms │             41.77 / 42.10 ±0.22 / 42.47 ms │     no change │
│ QQuery 25 │         110.49 / 120.39 ±13.50 / 146.29 ms │          110.49 / 114.47 ±4.24 / 120.72 ms │     no change │
│ QQuery 26 │             41.62 / 42.53 ±0.84 / 43.68 ms │             42.76 / 43.74 ±0.87 / 45.29 ms │     no change │
│ QQuery 27 │          673.33 / 682.88 ±9.75 / 700.33 ms │          660.70 / 671.60 ±6.16 / 679.32 ms │     no change │
│ QQuery 28 │     3744.53 / 3961.80 ±146.97 / 4088.09 ms │     3652.60 / 3791.29 ±168.07 / 4117.72 ms │     no change │
│ QQuery 29 │             41.27 / 47.53 ±6.84 / 57.42 ms │             40.67 / 41.27 ±0.49 / 42.10 ms │ +1.15x faster │
│ QQuery 30 │          706.57 / 716.27 ±8.06 / 730.86 ms │         703.71 / 719.35 ±18.33 / 745.39 ms │     no change │
│ QQuery 31 │      1028.11 / 1066.45 ±25.97 / 1107.54 ms │      1039.03 / 1059.61 ±16.72 / 1089.60 ms │     no change │
│ QQuery 32 │  41749.98 / 41923.06 ±137.19 / 42115.99 ms │  41933.86 / 42059.36 ±125.78 / 42284.54 ms │     no change │
│ QQuery 33 │  39816.33 / 41118.42 ±878.48 / 42505.38 ms │ 39357.48 / 42890.32 ±2127.56 / 45949.01 ms │     no change │
│ QQuery 34 │ 39436.50 / 41579.73 ±1727.79 / 44503.61 ms │ 39569.80 / 41278.39 ±1576.87 / 43739.73 ms │     no change │
│ QQuery 35 │      1231.58 / 1249.08 ±18.53 / 1284.93 ms │      1213.77 / 1242.10 ±20.94 / 1267.65 ms │     no change │
│ QQuery 36 │         163.03 / 176.19 ±14.58 / 203.20 ms │         150.77 / 181.22 ±23.77 / 208.57 ms │     no change │
│ QQuery 37 │             37.59 / 44.05 ±9.64 / 63.04 ms │             37.68 / 42.33 ±4.95 / 51.74 ms │     no change │
│ QQuery 38 │             42.01 / 43.00 ±0.94 / 44.14 ms │             44.02 / 47.80 ±4.18 / 53.92 ms │  1.11x slower │
│ QQuery 39 │          179.30 / 185.76 ±4.10 / 190.28 ms │          183.11 / 193.03 ±9.14 / 208.18 ms │     no change │
│ QQuery 40 │             14.11 / 17.81 ±5.76 / 29.27 ms │             15.19 / 15.73 ±0.46 / 16.54 ms │ +1.13x faster │
│ QQuery 41 │             13.36 / 16.74 ±6.37 / 29.47 ms │             14.32 / 15.79 ±2.06 / 19.82 ms │ +1.06x faster │
│ QQuery 42 │             13.17 / 13.44 ±0.22 / 13.83 ms │             13.89 / 14.94 ±1.76 / 18.44 ms │  1.11x slower │
└───────────┴────────────────────────────────────────────┴────────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary          ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)          │ 198667.33ms │
│ Total Time (issue_23072)   │ 200618.03ms │
│ Average Time (HEAD)        │   4620.17ms │
│ Average Time (issue_23072) │   4665.54ms │
│ Queries Faster             │           3 │
│ Queries Slower             │           8 │
│ Queries with No Change     │          32 │
│ Queries with Failure       │           0 │
└────────────────────────────┴─────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

Metric Value
Wall time 995.2s
Peak memory 12.2 GiB
Avg memory 6.5 GiB
CPU user 10131.9s
CPU sys 464.0s
Peak spill 0 B

clickbench_partitioned — branch

Metric Value
Wall time 1005.2s
Peak memory 12.5 GiB
Avg memory 6.6 GiB
CPU user 10156.9s
CPU sys 463.8s
Peak spill 0 B

File an issue against this benchmark runner

@mkleen

mkleen commented Jun 27, 2026

Contributor

It looks like there are regressions in the clickbench_partitioned benchmark. In particular, Query 6 is reproducibly slower:

│ QQuery 6  │                1.28 / 1.44 ±0.24 / 1.91 ms │                2.10 / 2.28 ±0.21 / 2.68 ms │  1.59x slower │

// fingerprint: reads of the same path under a different schema get
// their own entry rather than reusing incompatible column statistics.
// The fingerprint is precomputed once per table (see `try_new`).
schema: Arc::clone(&self.file_schema_fingerprint),

@mkleen mkleen Jun 27, 2026

Contributor

Precomputing the fingerprint's hash could fix the regression. Right now we compute the fingerprint's hash for each entry, which is expensive.

@Phoenix500526 Phoenix500526 Jun 27, 2026

Contributor Author

Thanks for the comment. I've precomputed the hash for the fingerprint. It seems that only whitelisted users can trigger benchmarks. Could you help trigger one?

Hashing a FileStatisticsCacheKey on every cache lookup previously digested
the entire file schema (O(schema width)). Store a fixed-seed hash of the
fingerprint columns, computed once in from_schema, and feed only that u64
into the map hasher. PartialEq still compares the columns exactly, so a hash
collision can never make two different schemas share a cache entry.

Signed-off-by: Jiawei Zhao <Phoenix500526@163.com>
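
The idea in the commit message above can be sketched in isolation. This is a minimal illustration using only the standard library, not the code from this PR: `Column`, `SchemaFingerprint::from_columns`, `CacheKey`, `column`, `key`, and `count_entries` are hypothetical names, data types are plain strings, and `DefaultHasher::new()` stands in for whatever fixed-seed hasher the real `from_schema` uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Arc;

/// Hypothetical stand-in for one fingerprint column: (name, data_type, nullable).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Column {
    name: String,
    data_type: String,
    nullable: bool,
}

/// Illustrative fingerprint: the columns plus a hash computed once at construction.
#[derive(Debug)]
struct SchemaFingerprint {
    columns: Vec<Column>,
    precomputed_hash: u64,
}

impl SchemaFingerprint {
    fn from_columns(columns: Vec<Column>) -> Self {
        // `DefaultHasher::new()` always starts from the same state, so equal
        // column lists produce equal hashes (assumed stand-in for a fixed seed).
        let mut hasher = DefaultHasher::new();
        columns.hash(&mut hasher);
        let precomputed_hash = hasher.finish();
        Self { columns, precomputed_hash }
    }
}

// Equality still compares the columns exactly, so a hash collision cannot
// make two different schemas compare equal.
impl PartialEq for SchemaFingerprint {
    fn eq(&self, other: &Self) -> bool {
        self.columns == other.columns
    }
}
impl Eq for SchemaFingerprint {}

// Hashing feeds only the stored u64 into the map hasher: O(1) per lookup
// instead of O(schema width).
impl Hash for SchemaFingerprint {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u64(self.precomputed_hash);
    }
}

/// Illustrative cache key: the path plus a shared fingerprint.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct CacheKey {
    path: String,
    schema: Arc<SchemaFingerprint>,
}

fn column(name: &str, data_type: &str, nullable: bool) -> Column {
    Column { name: name.to_string(), data_type: data_type.to_string(), nullable }
}

fn key(path: &str, schema: &Arc<SchemaFingerprint>) -> CacheKey {
    CacheKey { path: path.to_string(), schema: Arc::clone(schema) }
}

/// Inserts one value per key and returns how many distinct entries remain.
fn count_entries(keys: Vec<CacheKey>) -> usize {
    let mut cache: HashMap<CacheKey, usize> = HashMap::new();
    for (i, k) in keys.into_iter().enumerate() {
        cache.insert(k, i);
    }
    cache.len()
}
```

Calling `count_entries` with keys for the same path under two different fingerprints yields two entries, while a second fingerprint built from identical columns maps onto the existing entry. The `Hash` and `Eq` implementations stay consistent because equal columns always produce the same stored hash.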
@Phoenix500526

Contributor Author

run benchmark wide_schema

baseline:
  env:
    DATAFUSION_RUNTIME_FILE_STATISTICS_CACHE_LIMIT: 0

@adriangbot


@mkleen

mkleen commented Jun 27, 2026

Contributor

run benchmark clickbench_partitioned

@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4817989936-722-ldxrq 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing issue/23072 (a767586) to d58e0c6 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete



@adriangbot


🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and issue_23072
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                       HEAD ┃                                issue_23072 ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │               1.21 / 4.02 ±5.51 / 15.04 ms │               1.24 / 3.99 ±5.44 / 14.87 ms │     no change │
│ QQuery 1  │             13.26 / 13.44 ±0.13 / 13.60 ms │             13.16 / 13.41 ±0.15 / 13.59 ms │     no change │
│ QQuery 2  │             36.14 / 36.40 ±0.19 / 36.65 ms │             36.05 / 36.30 ±0.19 / 36.64 ms │     no change │
│ QQuery 3  │             30.79 / 31.45 ±0.70 / 32.77 ms │             30.69 / 31.25 ±0.65 / 32.51 ms │     no change │
│ QQuery 4  │      1713.06 / 1748.11 ±37.79 / 1799.01 ms │      1669.74 / 1749.10 ±55.76 / 1820.04 ms │     no change │
│ QQuery 5  │      1655.38 / 1752.29 ±81.29 / 1893.62 ms │     1635.63 / 1771.16 ±108.55 / 1954.22 ms │     no change │
│ QQuery 6  │                1.31 / 1.47 ±0.24 / 1.95 ms │                1.29 / 1.44 ±0.24 / 1.92 ms │     no change │
│ QQuery 7  │             14.83 / 14.96 ±0.12 / 15.17 ms │             14.58 / 14.69 ±0.09 / 14.79 ms │     no change │
│ QQuery 8  │      1952.39 / 2106.57 ±86.64 / 2204.21 ms │      1954.97 / 2090.42 ±92.39 / 2239.01 ms │     no change │
│ QQuery 9  │         474.26 / 490.31 ±15.09 / 513.09 ms │         480.49 / 508.52 ±16.79 / 531.28 ms │     no change │
│ QQuery 10 │             76.69 / 77.74 ±0.59 / 78.51 ms │             77.98 / 78.66 ±0.50 / 79.20 ms │     no change │
│ QQuery 11 │             87.87 / 90.90 ±3.85 / 98.50 ms │          89.37 / 111.46 ±37.72 / 186.77 ms │  1.23x slower │
│ QQuery 12 │      1651.47 / 1793.69 ±84.26 / 1876.07 ms │     1648.31 / 1835.06 ±130.28 / 1959.30 ms │     no change │
│ QQuery 13 │        475.60 / 630.23 ±142.13 / 875.61 ms │        461.54 / 664.70 ±167.50 / 849.37 ms │  1.05x slower │
│ QQuery 14 │         538.59 / 558.95 ±16.20 / 579.95 ms │         547.71 / 559.45 ±12.99 / 582.96 ms │     no change │
│ QQuery 15 │      1938.10 / 2015.97 ±46.42 / 2063.88 ms │      1917.02 / 1971.30 ±30.30 / 2004.56 ms │     no change │
│ QQuery 16 │     4265.04 / 4373.62 ±105.08 / 4502.63 ms │     3993.78 / 4259.63 ±178.39 / 4536.64 ms │     no change │
│ QQuery 17 │     4227.55 / 4465.40 ±192.18 / 4788.48 ms │     4194.17 / 4380.55 ±127.20 / 4541.17 ms │     no change │
│ QQuery 18 │  18037.75 / 18449.77 ±423.09 / 19035.43 ms │  17687.91 / 18461.00 ±482.94 / 19170.43 ms │     no change │
│ QQuery 19 │             28.78 / 35.39 ±9.26 / 53.72 ms │             28.03 / 28.88 ±0.82 / 30.28 ms │ +1.23x faster │
│ QQuery 20 │          510.09 / 519.72 ±8.85 / 535.95 ms │          514.94 / 520.13 ±4.33 / 527.17 ms │     no change │
│ QQuery 21 │          518.16 / 524.26 ±3.84 / 528.50 ms │         515.89 / 526.89 ±13.45 / 552.89 ms │     no change │
│ QQuery 22 │       985.52 / 1009.99 ±19.19 / 1039.36 ms │       975.21 / 1015.01 ±21.20 / 1037.17 ms │     no change │
│ QQuery 23 │      3051.84 / 3106.93 ±43.61 / 3163.82 ms │      3062.57 / 3107.56 ±46.75 / 3196.77 ms │     no change │
│ QQuery 24 │           41.30 / 60.38 ±22.73 / 103.63 ms │             41.30 / 44.34 ±5.10 / 54.47 ms │ +1.36x faster │
│ QQuery 25 │          110.88 / 111.83 ±0.82 / 113.24 ms │          112.05 / 114.59 ±2.42 / 119.05 ms │     no change │
│ QQuery 26 │             41.76 / 43.21 ±1.99 / 47.14 ms │             41.75 / 43.78 ±2.57 / 48.01 ms │     no change │
│ QQuery 27 │          663.17 / 675.06 ±6.50 / 682.76 ms │          671.21 / 675.06 ±2.72 / 678.54 ms │     no change │
│ QQuery 28 │     3465.46 / 3683.42 ±218.83 / 3986.37 ms │     3539.82 / 3768.05 ±146.80 / 3958.10 ms │     no change │
│ QQuery 29 │            40.52 / 53.27 ±19.80 / 91.65 ms │           40.62 / 61.09 ±33.97 / 128.15 ms │  1.15x slower │
│ QQuery 30 │         560.36 / 580.74 ±23.01 / 622.42 ms │         576.63 / 613.92 ±25.44 / 648.50 ms │  1.06x slower │
│ QQuery 31 │          309.46 / 316.17 ±4.89 / 323.02 ms │          300.34 / 310.42 ±6.12 / 319.60 ms │     no change │
│ QQuery 32 │        931.78 / 981.57 ±35.52 / 1022.67 ms │       969.52 / 1025.37 ±42.11 / 1064.17 ms │     no change │
│ QQuery 33 │ 27057.73 / 29042.70 ±1373.79 / 30626.86 ms │ 26189.33 / 27788.74 ±1211.16 / 29856.48 ms │     no change │
│ QQuery 34 │ 27351.84 / 29752.94 ±1814.65 / 31905.04 ms │  27295.07 / 27883.56 ±349.03 / 28308.45 ms │ +1.07x faster │
│ QQuery 35 │      985.27 / 1104.48 ±137.47 / 1364.30 ms │     1103.69 / 1188.79 ±127.72 / 1439.75 ms │  1.08x slower │
│ QQuery 36 │          160.15 / 168.59 ±4.79 / 174.21 ms │          170.86 / 173.74 ±2.60 / 176.90 ms │     no change │
│ QQuery 37 │            36.71 / 49.70 ±24.94 / 99.58 ms │           37.51 / 57.33 ±32.57 / 122.14 ms │  1.15x slower │
│ QQuery 38 │             42.53 / 45.67 ±1.87 / 47.62 ms │             41.42 / 43.78 ±1.58 / 45.44 ms │     no change │
│ QQuery 39 │         183.22 / 200.33 ±20.75 / 241.11 ms │          173.70 / 181.10 ±5.49 / 188.61 ms │ +1.11x faster │
│ QQuery 40 │             14.60 / 15.42 ±0.72 / 16.48 ms │             14.74 / 15.78 ±0.92 / 17.13 ms │     no change │
│ QQuery 41 │             13.89 / 14.20 ±0.37 / 14.90 ms │             13.74 / 13.98 ±0.22 / 14.40 ms │     no change │
│ QQuery 42 │             13.19 / 13.44 ±0.14 / 13.60 ms │             13.17 / 13.43 ±0.19 / 13.64 ms │     no change │
└───────────┴────────────────────────────────────────────┴────────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary          ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)          │ 110764.74ms │
│ Total Time (issue_23072)   │ 107757.45ms │
│ Average Time (HEAD)        │   2575.92ms │
│ Average Time (issue_23072) │   2505.99ms │
│ Queries Faster             │           4 │
│ Queries Slower             │           6 │
│ Queries with No Change     │          33 │
│ Queries with Failure       │           0 │
└────────────────────────────┴─────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

Metric Value
Wall time 555.1s
Peak memory 11.8 GiB
Avg memory 6.5 GiB
CPU user 4868.3s
CPU sys 326.8s
Peak spill 0 B

clickbench_partitioned — branch

Metric Value
Wall time 540.1s
Peak memory 11.9 GiB
Avg memory 6.5 GiB
CPU user 4904.6s
CPU sys 325.4s
Peak spill 0 B


@mkleen

mkleen commented Jun 27, 2026

Contributor

run benchmark clickbench_partitioned

@mkleen

mkleen commented Jun 27, 2026

Contributor

run benchmark wide_schema

baseline:
  env:
    DATAFUSION_RUNTIME_FILE_STATISTICS_CACHE_LIMIT: 0

@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4818323559-723-tm7xw 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

Comparing issue/23072 (a767586) to d58e0c6 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete



@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4818326429-724-hr9kq 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

Comparing issue/23072 (a767586) to d58e0c6 (merge-base) diff using: wide_schema
Results will be posted here when complete



@adriangbot


🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and issue_23072
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                       HEAD ┃                               issue_23072 ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │               1.19 / 3.96 ±5.43 / 14.81 ms │              1.26 / 4.13 ±5.61 / 15.35 ms │     no change │
│ QQuery 1  │             12.70 / 13.07 ±0.19 / 13.23 ms │            12.60 / 12.84 ±0.15 / 13.03 ms │     no change │
│ QQuery 2  │             35.70 / 36.00 ±0.23 / 36.32 ms │            35.95 / 36.22 ±0.24 / 36.62 ms │     no change │
│ QQuery 3  │             30.64 / 31.21 ±0.66 / 32.49 ms │            30.62 / 30.97 ±0.40 / 31.74 ms │     no change │
│ QQuery 4  │      1723.05 / 1769.76 ±54.07 / 1874.25 ms │     1658.60 / 1718.31 ±48.48 / 1790.59 ms │     no change │
│ QQuery 5  │     1610.36 / 1784.19 ±130.80 / 2000.40 ms │     1725.56 / 1884.15 ±99.56 / 2025.24 ms │  1.06x slower │
│ QQuery 6  │                1.23 / 1.40 ±0.24 / 1.86 ms │               1.27 / 1.42 ±0.24 / 1.90 ms │     no change │
│ QQuery 7  │             13.73 / 14.07 ±0.26 / 14.50 ms │            13.93 / 13.96 ±0.02 / 13.99 ms │     no change │
│ QQuery 8  │      2094.74 / 2156.23 ±31.47 / 2182.63 ms │     1915.68 / 2035.57 ±65.72 / 2103.57 ms │ +1.06x faster │
│ QQuery 9  │         476.16 / 499.44 ±20.23 / 531.57 ms │        482.64 / 517.78 ±23.15 / 548.11 ms │     no change │
│ QQuery 10 │             75.45 / 79.13 ±3.09 / 83.89 ms │          79.04 / 99.08 ±34.89 / 168.79 ms │  1.25x slower │
│ QQuery 11 │             86.55 / 89.01 ±1.60 / 90.78 ms │            90.76 / 93.20 ±1.93 / 95.32 ms │     no change │
│ QQuery 12 │     1644.75 / 1863.60 ±160.91 / 2100.36 ms │    1643.40 / 1804.01 ±138.32 / 2052.44 ms │     no change │
│ QQuery 13 │        464.22 / 637.70 ±124.33 / 852.83 ms │        627.58 / 659.55 ±27.41 / 690.79 ms │     no change │
│ QQuery 14 │          533.73 / 550.37 ±9.57 / 559.76 ms │        531.28 / 558.96 ±18.00 / 587.15 ms │     no change │
│ QQuery 15 │      1883.19 / 1972.22 ±51.93 / 2044.40 ms │     1932.60 / 2002.46 ±74.92 / 2137.80 ms │     no change │
│ QQuery 16 │     4181.86 / 4375.49 ±110.96 / 4515.96 ms │    4173.89 / 4409.59 ±208.11 / 4750.03 ms │     no change │
│ QQuery 17 │     4196.87 / 4413.93 ±137.96 / 4596.62 ms │    4273.16 / 4385.26 ±119.00 / 4604.14 ms │     no change │
│ QQuery 18 │  17773.79 / 18331.92 ±364.43 / 18905.62 ms │ 17753.89 / 18348.25 ±444.69 / 19093.57 ms │     no change │
│ QQuery 19 │             28.14 / 30.48 ±2.17 / 34.11 ms │            28.77 / 29.16 ±0.61 / 30.38 ms │     no change │
│ QQuery 20 │         518.86 / 527.21 ±10.27 / 546.78 ms │        517.59 / 543.22 ±43.47 / 629.88 ms │     no change │
│ QQuery 21 │          514.19 / 521.49 ±4.24 / 525.06 ms │         515.01 / 522.55 ±5.10 / 529.03 ms │     no change │
│ QQuery 22 │      1000.68 / 1022.29 ±14.98 / 1046.99 ms │        982.28 / 993.22 ±7.23 / 1000.28 ms │     no change │
│ QQuery 23 │      3041.81 / 3087.82 ±44.25 / 3170.73 ms │     3058.36 / 3116.24 ±37.16 / 3151.32 ms │     no change │
│ QQuery 24 │            42.95 / 55.32 ±14.87 / 82.96 ms │           43.06 / 55.77 ±12.94 / 72.47 ms │     no change │
│ QQuery 25 │          114.17 / 115.36 ±1.03 / 116.68 ms │         113.96 / 115.20 ±0.86 / 116.20 ms │     no change │
│ QQuery 26 │             42.70 / 44.14 ±1.93 / 47.92 ms │            43.68 / 45.74 ±2.32 / 49.01 ms │     no change │
│ QQuery 27 │          669.62 / 677.41 ±5.40 / 685.56 ms │        685.30 / 698.02 ±12.50 / 720.70 ms │     no change │
│ QQuery 28 │     3382.32 / 3681.90 ±258.18 / 3998.63 ms │     3532.26 / 3662.50 ±91.86 / 3806.84 ms │     no change │
│ QQuery 29 │             40.31 / 41.35 ±1.58 / 44.49 ms │            40.26 / 42.20 ±2.84 / 47.78 ms │     no change │
│ QQuery 30 │         555.91 / 572.68 ±11.58 / 591.23 ms │        558.19 / 577.16 ±14.57 / 599.23 ms │     no change │
│ QQuery 31 │         303.28 / 317.56 ±10.24 / 326.83 ms │         293.28 / 305.77 ±8.84 / 317.74 ms │     no change │
│ QQuery 32 │       951.69 / 1026.17 ±46.81 / 1090.00 ms │     1012.26 / 1041.62 ±30.09 / 1094.34 ms │     no change │
│ QQuery 33 │ 26933.56 / 29239.94 ±2140.91 / 32883.42 ms │ 27092.37 / 28407.79 ±776.99 / 29338.29 ms │     no change │
│ QQuery 34 │  26890.68 / 28453.87 ±948.94 / 29728.19 ms │ 28303.80 / 29368.10 ±732.73 / 30341.98 ms │     no change │
│ QQuery 35 │      994.33 / 1188.84 ±178.54 / 1467.40 ms │      990.29 / 1060.95 ±36.97 / 1094.63 ms │ +1.12x faster │
│ QQuery 36 │          172.41 / 174.45 ±2.44 / 178.37 ms │        154.89 / 167.03 ±11.57 / 188.05 ms │     no change │
│ QQuery 37 │           38.74 / 60.94 ±38.56 / 137.91 ms │           37.55 / 48.54 ±17.72 / 83.75 ms │ +1.26x faster │
│ QQuery 38 │             44.30 / 46.62 ±1.34 / 47.98 ms │            40.48 / 43.64 ±1.85 / 45.80 ms │ +1.07x faster │
│ QQuery 39 │         198.52 / 213.46 ±10.88 / 225.19 ms │        182.72 / 197.23 ±18.06 / 232.81 ms │ +1.08x faster │
│ QQuery 40 │             15.10 / 15.78 ±0.44 / 16.44 ms │            14.59 / 16.19 ±2.84 / 21.86 ms │     no change │
│ QQuery 41 │             14.69 / 14.99 ±0.23 / 15.35 ms │           14.01 / 20.30 ±12.32 / 44.93 ms │  1.35x slower │
│ QQuery 42 │             14.23 / 14.38 ±0.11 / 14.55 ms │            13.39 / 13.69 ±0.23 / 14.05 ms │     no change │
└───────────┴────────────────────────────────────────────┴───────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary          ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)          │ 109767.14ms │
│ Total Time (issue_23072)   │ 109707.57ms │
│ Average Time (HEAD)        │   2552.72ms │
│ Average Time (issue_23072) │   2551.34ms │
│ Queries Faster             │           5 │
│ Queries Slower             │           3 │
│ Queries with No Change     │          35 │
│ Queries with Failure       │           0 │
└────────────────────────────┴─────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

Metric Value
Wall time 550.1s
Peak memory 12.0 GiB
Avg memory 6.5 GiB
CPU user 4880.8s
CPU sys 337.1s
Peak spill 0 B

clickbench_partitioned — branch

Metric Value
Wall time 550.1s
Peak memory 12.1 GiB
Avg memory 6.5 GiB
CPU user 4888.1s
CPU sys 331.1s
Peak spill 0 B


@adriangbot


🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                     HEAD                                   issue_23072
-----                     ----                                   -----------
wide_schema/Q01_narrow    1.00     79.8±0.51ms        ? ?/sec    1.00     79.9±0.26ms        ? ?/sec
wide_schema/Q01_wide      1.01   1016.5±7.03ms        ? ?/sec    1.00   1009.8±7.68ms        ? ?/sec
wide_schema/Q02_narrow    1.00      5.9±0.10ms        ? ?/sec    1.00      5.9±0.10ms        ? ?/sec
wide_schema/Q02_wide      1.03    905.4±2.97ms        ? ?/sec    1.00    881.9±2.95ms        ? ?/sec
wide_schema/Q03_narrow    1.00     14.9±0.18ms        ? ?/sec    1.01     15.0±0.29ms        ? ?/sec
wide_schema/Q03_wide      1.01    907.8±2.88ms        ? ?/sec    1.00    896.6±4.89ms        ? ?/sec
wide_schema/Q04_narrow    1.00     36.9±0.16ms        ? ?/sec    1.01     37.1±0.18ms        ? ?/sec
wide_schema/Q04_wide      1.02    998.9±7.60ms        ? ?/sec    1.00    979.0±5.84ms        ? ?/sec

Resource Usage

wide_schema — base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 925.2s |
| Peak memory | 1.2 GiB |
| Avg memory | 104.4 MiB |
| CPU user | 389.8s |
| CPU sys | 53.2s |
| Peak spill | 0 B |

wide_schema — branch

| Metric | Value |
| --- | --- |
| Wall time | 915.2s |
| Peak memory | 1.2 GiB |
| Avg memory | 109.0 MiB |
| CPU user | 382.9s |
| CPU sys | 51.5s |
| Peak spill | 0 B |

…uide

The file-statistics cache key changed from TableScopedPath to a schema-aware
FileStatisticsCacheKey. Add an upgrade-guide entry covering the type-alias
change and how to migrate custom cache implementations and direct get/put
callers.

Signed-off-by: Jiawei Zhao <Phoenix500526@163.com>
@github-actions (bot) added the documentation label on Jun 27, 2026
@Phoenix500526

Contributor Author

> Since it's an API change, I think it makes sense to add an entry to the upgrade guide in https://github.com/apache/datafusion/blob/main/docs/source/library-user-guide/upgrading/55.0.0.md.

Added

@mkleen

mkleen commented Jun 27, 2026

Contributor

Looks like all regressions are fixed now. TBH, this is quite a complicated solution for a problem that would not exist if we simply avoided caching in this case.

@mkleen

mkleen commented Jun 27, 2026

Contributor

@kosiew Do you maybe have time for a second opinion?

@kosiew kosiew left a comment

Contributor

@Phoenix500526
Thanks for the fix. I think the schema-aware cache key is the right direction, but I think the implementation can be simplified a bit before this lands.

/// nullability, in order. It deliberately excludes field/schema metadata, which
/// cannot affect statistics — including it would needlessly fragment the cache.
#[derive(Clone, Debug)]
pub struct SchemaFingerprint {

@kosiew kosiew Jun 29, 2026

Contributor

The schema-aware key looks correct, and I think this fixes the bug. That said, the implementation feels a bit more involved than this cache path needs.

Could we simplify SchemaFingerprint to a small derived newtype over something like Vec<(String, DataType, bool)>, or an equivalent representation, and rely on derived Hash and Eq? The precomputed hash plus custom PartialEq collision handling adds some cleverness that feels hard to justify here unless profiling shows schema-key hashing is material.
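
To make the suggestion concrete, here is a minimal sketch of what a derived newtype could look like. It is not the code in this PR: the `DataType` enum is a hypothetical stand-in for Arrow's type so the example compiles on its own, and the `fingerprint` helper and the `(String, SchemaFingerprint)` map key are illustrative only.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for Arrow's `DataType`, only to keep the sketch self-contained.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum DataType {
    Int64,
    Utf8,
}

/// Per-column (name, data_type, nullable), in schema order.
/// Hash and Eq are derived, so equal column lists always hash alike.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct SchemaFingerprint(Vec<(String, DataType, bool)>);

/// Illustrative helper that builds a fingerprint from borrowed column names.
fn fingerprint(cols: &[(&str, DataType, bool)]) -> SchemaFingerprint {
    SchemaFingerprint(
        cols.iter()
            .map(|(name, ty, nullable)| (name.to_string(), ty.clone(), *nullable))
            .collect(),
    )
}

fn main() {
    let as_int = fingerprint(&[("id", DataType::Int64, false)]);
    let as_str = fingerprint(&[("id", DataType::Utf8, false)]);

    // Same path, different schemas: two distinct cache entries.
    let mut cache: HashMap<(String, SchemaFingerprint), &str> = HashMap::new();
    cache.insert(("data/f.parquet".to_string(), as_int.clone()), "stats for Int64 id");
    cache.insert(("data/f.parquet".to_string(), as_str), "stats for Utf8 id");

    assert_eq!(cache.len(), 2);
    assert_eq!(cache[&("data/f.parquet".to_string(), as_int)], "stats for Int64 id");
}
```

The tradeoff is that every lookup re-hashes the full column list, which is the cost the precomputed hash is meant to avoid.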

Contributor

If you look at the benchmark results for clickbench_partitioned and wide_schema, you will find that without this optimization there are real regressions.
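
For readers following the thread, the pattern under discussion (a precomputed hash plus a custom `PartialEq` that still compares the columns) can be sketched roughly as below. This is an assumption-laden illustration, not the PR's implementation: `PrehashedFingerprint`, its fields, and the stand-in `DataType` enum are all hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical stand-in for Arrow's `DataType`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum DataType {
    Int64,
    Utf8,
}

/// Column list plus a hash computed once at construction time.
#[derive(Clone, Debug)]
struct PrehashedFingerprint {
    columns: Vec<(String, DataType, bool)>,
    hash: u64,
}

impl PrehashedFingerprint {
    fn new(columns: Vec<(String, DataType, bool)>) -> Self {
        // Walk the columns once here instead of on every cache lookup.
        let mut hasher = DefaultHasher::new();
        columns.hash(&mut hasher);
        let hash = hasher.finish();
        Self { columns, hash }
    }
}

impl Hash for PrehashedFingerprint {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Feed only the cached value; cost no longer depends on schema width.
        state.write_u64(self.hash);
    }
}

impl PartialEq for PrehashedFingerprint {
    fn eq(&self, other: &Self) -> bool {
        // Cheap rejection first; the column comparison guards against collisions.
        self.hash == other.hash && self.columns == other.columns
    }
}

impl Eq for PrehashedFingerprint {}

fn main() {
    let a = PrehashedFingerprint::new(vec![("id".to_string(), DataType::Int64, false)]);
    let b = PrehashedFingerprint::new(vec![("id".to_string(), DataType::Utf8, false)]);

    let mut cache: HashMap<PrehashedFingerprint, &str> = HashMap::new();
    cache.insert(a.clone(), "stats for Int64 id");
    cache.insert(b, "stats for Utf8 id");

    assert_eq!(cache.len(), 2);
    assert_eq!(cache[&a], "stats for Int64 id");
}
```

Equal column lists always produce the same cached hash, so `Hash` and `Eq` stay consistent; the extra code buys a per-lookup cost that is independent of the number of columns.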

fn heap_size(&self, ctx: &mut DFHeapSizeCtx) -> usize {
    self.path.as_ref().heap_size(ctx)
        + self.table.heap_size(ctx)
        + self.schema.as_ref().heap_size(ctx)

Contributor

FileStatisticsCacheKey::heap_size appears to deep-count the shared SchemaFingerprint for every cached file key. Since ListingTable now shares one Arc<SchemaFingerprint> across all files for the same table and schema, this could overstate cache memory for wide schemas with many files and lead to earlier eviction than necessary.

Could we count only the incremental cache-owned cost here, or add a small test that documents the intended accounting tradeoff?
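
As a rough illustration of the accounting difference, the sketch below compares deep-counting a shared fingerprint per key with counting each shared allocation once. Everything here is hypothetical: `SeenCtx`, `Fingerprint`, `Key`, and `make_keys` are invented for the example and are not DataFusion's `DFHeapSizeCtx` or `FileStatisticsCacheKey`. Whether `DFHeapSizeCtx` already deduplicates shared allocations is not shown in this thread.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical accounting context that remembers shared allocations already counted.
#[derive(Default)]
struct SeenCtx {
    seen: HashSet<usize>,
}

/// Simplified fingerprint: column names only.
struct Fingerprint {
    columns: Vec<String>,
}

impl Fingerprint {
    /// Heap bytes owned by the column vector and its strings.
    fn heap_size(&self) -> usize {
        self.columns.capacity() * std::mem::size_of::<String>()
            + self.columns.iter().map(|c| c.capacity()).sum::<usize>()
    }
}

struct Key {
    path: String,
    schema: Arc<Fingerprint>,
}

impl Key {
    /// Deep-counts the shared fingerprint for every key.
    fn heap_size_deep(&self) -> usize {
        self.path.capacity() + self.schema.heap_size()
    }

    /// Counts the shared fingerprint only the first time its allocation is seen.
    fn heap_size_dedup(&self, ctx: &mut SeenCtx) -> usize {
        let addr = Arc::as_ptr(&self.schema) as usize;
        let shared = if ctx.seen.insert(addr) {
            self.schema.heap_size()
        } else {
            0
        };
        self.path.capacity() + shared
    }
}

/// Builds `files` keys that all share one fingerprint with `columns` columns.
fn make_keys(files: usize, columns: usize) -> Vec<Key> {
    let schema = Arc::new(Fingerprint {
        columns: (0..columns).map(|i| format!("col_{i}")).collect(),
    });
    (0..files)
        .map(|i| Key {
            path: format!("part-{i}.parquet"),
            schema: Arc::clone(&schema),
        })
        .collect()
}

fn main() {
    let keys = make_keys(1000, 200);
    let deep: usize = keys.iter().map(Key::heap_size_deep).sum();
    let mut ctx = SeenCtx::default();
    let dedup: usize = keys.iter().map(|k| k.heap_size_dedup(&mut ctx)).sum();
    println!("deep = {deep} bytes, dedup = {dedup} bytes");
}
```

With N keys sharing one fingerprint, the deep total exceeds the deduplicated total by N - 1 copies of the fingerprint's size, which is the overstatement described above.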

Labels

catalog Related to the catalog crate common Related to common crate core Core DataFusion crate documentation Improvements or additions to documentation execution Related to the execution crate

Development

Successfully merging this pull request may close these issues.

Make file-statistics cache keys schema-aware

5 participants