Skip to content

Conversation

@adriangb
Copy link
Contributor

@adriangb adriangb commented Oct 31, 2025

This PR is part of an EPIC to push down hash table references from HashJoinExec into scans. The EPIC is tracked in #17171.

A "target state" is tracked in #18393 (this PR).
There is a series of PRs to get us to this target state in smaller more reviewable changes that are still valuable on their own:

As those are merged I will rebase this PR to keep track of the "remaining work", and we can use this PR to explore big picture ideas or benchmarks of the final state.

@github-actions github-actions bot added core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) common Related to common crate physical-plan Changes to the physical-plan crate labels Oct 31, 2025
@github-actions github-actions bot added the physical-expr Changes to the physical-expr crates label Oct 31, 2025
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Oct 31, 2025
Copy link
Contributor Author

@adriangb adriangb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving my review comments, will post benchmarks afterwards.

Although this PR is large I think there's a clear path to split it up into independent smaller PRs:

  1. Refactor create_hashes to accept references (changes only to hash_utils.rs)
  2. Refactor InListExpr to store arrays and support structs, re-using create_hashes_from_arrays from (1) (changes only to in_list.rs).
  3. Refactor the data structures used to track pushdown data in HashJoinExec (changes only to files in datafusion/physical-plan/src/joins/hash_join/).
  4. Introduce the CASE statement structure into the filter pushdown of HashJoinExec and the repartition hash PhysicalExpr (changes only to files in datafusion/physical-plan/src/joins/hash_join/, adds HashExpr in datafusion/physical-plan/src/joins/hash_join/partitioned_hash_eval.rs).
  5. Add hash table pushdown (adds HashTableLookupExpr in datafusion/physical-plan/src/joins/hash_join/partitioned_hash_eval.rs), adds HashTable to PushdownStrategy, adds create_membership_predicate, etc.).
  6. Add InListExpr pushdown (adds PushdownStrategy::InList, etc.)

There is also potential to remove the barrier expression by filtering out only from known partitions CASE ... ELSE true i.e. if we don't have information for all partitions only filter out data that we know won't match in our partition and then update the filter to ELSE false in the case where we have information from all partitions. This might be useful for the distributed case, not sure though.

I think we could somehow unify the hashing / inner data structures of the join hash table and the InList expression - they are very similar - to at least eliminate one round of hashing. I wonder if there's a version of an InList expression that avoids building a Vec<ScalarValue> altogether and instead just wraps an ArrayRef + metadata (data types etc) + an optional hash lookup. That would be quite versatile, we could essentially replace the join hash tables with that structure.

Comment on lines 377 to 379
#[cfg(not(feature = "force_hash_collisions"))]
pub fn create_hashes<'a>(
arrays: &[ArrayRef],
pub fn create_hashes_from_arrays<'a>(
arrays: &[&dyn Array],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend this for its own PR.

I think this is a nice refactor for this function, however I decided not to deprecate / replace it to avoid churn. If this were it's own PR I think it would be worth it to just make a new version and deprecate the old one, replacing all references in DataFusion (which I imagine is most users; even if this is pub I think it's mostly pub to be used in other crates within this repo). I did not investigate if this can help avoid clones in any other call sites.

- RepartitionExec: partitioning=Hash([a@0, b@1], 12), input_partitions=1
- DataSourceExec: file_groups={1 group: [[test.parquet]]}, projection=[a, b, e], file_type=test, pushdown_supported=true, predicate=DynamicFilter [ a@0 >= ab AND a@0 <= ab AND b@1 >= bb AND b@1 <= bb OR a@0 >= aa AND a@0 <= aa AND b@1 >= ba AND b@1 <= ba ]
- RepartitionExec: partitioning=Hash([a@0, b@1], 4), input_partitions=1
- DataSourceExec: file_groups={1 group: [[test.parquet]]}, projection=[a, b, e], file_type=test, pushdown_supported=true, predicate=DynamicFilter [ CASE hash_repartition % 4 WHEN 0 THEN a@0 >= aa AND a@0 <= aa AND b@1 >= ba AND b@1 <= ba AND struct(a@0, b@1) IN (SET) ([{c0:aa,c1:ba}]) WHEN 2 THEN a@0 >= ab AND a@0 <= ab AND b@1 >= bb AND b@1 <= bb AND struct(a@0, b@1) IN (SET) ([{c0:ab,c1:bb}]) ELSE false END ]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this automatically excludes empty partitions and defaults to false, which works with inner joins. I'm not sure how we'd structure this for other join types.

- RepartitionExec: partitioning=Hash([a@0, b@1], 12), input_partitions=1
- DataSourceExec: file_groups={1 group: [[test.parquet]]}, projection=[a, b, e], file_type=test, pushdown_supported=true, predicate=DynamicFilter [ a@0 >= aa AND a@0 <= ab AND b@1 >= ba AND b@1 <= bb ]
- RepartitionExec: partitioning=Hash([a@0, b@1], 4), input_partitions=1
- DataSourceExec: file_groups={1 group: [[test.parquet]]}, projection=[a, b, e], file_type=test, pushdown_supported=true, predicate=DynamicFilter [ a@0 >= aa AND a@0 <= ab AND b@1 >= ba AND b@1 <= bb AND struct(a@0, b@1) IN (SET) ([{c0:aa,c1:ba}, {c0:ab,c1:bb}]) ]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case there's only 1 partition with data (because of hash collisions) -> we optimize away the CASE expression. This is relevant because the same thing would happen with a point lookup primary key join.

Comment on lines 147 to 151
/// Specialized Set implementation for StructArray
struct StructArraySet {
array: Arc<StructArray>,
hash_set: ArrayHashSet,
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a nice improvement / feature that can easily be it's own PR.

let has_nulls = self.array.null_count() != 0;

// Compute hashes for all rows in the input array
let mut input_hashes = vec![0u64; v.len()];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use a thread local to avoid repeated re-allocations here? I imagine that would work quite well.

Comment on lines -110 to -111
/// Create a new `JoinLeftData` from its parts
pub(super) fn new(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clippy was complaining about too many arguments, a constructor was not necessary anyway

Comment on lines +1496 to +1509
let membership = if num_rows == 0 {
PushdownStrategy::Empty
} else {
// If the build side is small enough we can use IN list pushdown.
// If it's too big we fall back to pushing down a reference to the hash table.
// See `PushdownStrategy` for more details.
if let Some(in_list_values) =
build_struct_inlist_values(&left_values, max_inlist_size)?
{
PushdownStrategy::InList(in_list_values)
} else {
PushdownStrategy::HashTable(Arc::clone(&hash_map))
}
};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some remodeling of the data structures we use to track state. I think the new structure is much better - even if we didn't move forward with the rest of the PR.

Comment on lines 63 to 73
// Size check using built-in method
// This is not 1:1 with the actual size of ScalarValues, but it is a good approximation
// and at this point is basically "free" to compute since we have the arrays already.
let estimated_size = join_key_arrays
.iter()
.map(|arr| arr.get_array_memory_size())
.sum::<usize>();

if estimated_size > max_size_bytes {
return Ok(None);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this is where we check the size to set the size limit.

@@ -0,0 +1,292 @@
// Licensed to the Apache Software Foundation (ASF) under one
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the part used to handle the cases where we push down the entire hash table.

Comment on lines 208 to 236
/// Build-side data reported by a single partition
pub(crate) enum PartitionBuildData {
Partitioned {
partition_id: usize,
pushdown: PushdownStrategy,
bounds: PartitionBounds,
},
CollectLeft {
pushdown: PushdownStrategy,
bounds: PartitionBounds,
},
}

/// Per-partition accumulated data (Partitioned mode)
#[derive(Clone)]
struct PartitionData {
bounds: PartitionBounds,
pushdown: PushdownStrategy,
}

/// Build-side data organized by partition mode
enum AccumulatedBuildData {
Partitioned {
partitions: Vec<Option<PartitionData>>,
},
CollectLeft {
data: Option<PartitionData>,
},
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the refactoring of the data structures we store to track state. It's now much cleaner, e.g. avoids comparing partitions by id.

@adriangb
Copy link
Contributor Author

adriangb commented Oct 31, 2025

Here's one interesting benchmark:

COPY (SELECT uuid() as k, uuid() as v FROM generate_series(1, 5) t(i))
TO 'small_table_uuids.parquet'
OPTIONS (
  'MAX_ROW_GROUP_SIZE' '50000',
  'BLOOM_FILTER_ENABLED::k' 'true'
);
COPY (SELECT random()::text as v1, random()::text as v2, uuid() as k FROM generate_series(1, 100000000) t(i))
TO 'large_table_uuids.parquet'
OPTIONS (
  'MAX_ROW_GROUP_SIZE' '50000',
  'BLOOM_FILTER_ENABLED::k' 'true'
);

CREATE EXTERNAL TABLE small_table STORED AS PARQUET LOCATION 'small_table_uuids.parquet';
CREATE EXTERNAL TABLE large_table STORED AS PARQUET LOCATION 'large_table_uuids.parquet';

SET datafusion.execution.parquet.pushdown_filters = true;
SET datafusion.execution.parquet.reorder_filters = true;
SET datafusion.runtime.metadata_cache_limit = 0;

-- Join the two tables, with a filter on small_table
SELECT *
FROM small_table s JOIN large_table l ON s.k = l.k;

This is similar to the query we benchmarked in our recent blog post but using UUIDs instead of ints so that the min/max stats pushdown doesn't really help (hence why main is ~ same as DuckDB instead of faster as in the blog post).

Full benchmark script
#!/usr/bin/env python3
"""
Benchmark script comparing DataFusion with/without inlist pushdown vs DuckDB.

Groups:
1. branch (no inlist): hash_join_inlist_pushdown_max_size = 0
2. branch (w/ inlist): hash_join_inlist_pushdown_max_size = default (999999)
3. main: using datafusion-cli-main
4. duckdb: using duckdb CLI
"""

import subprocess
import tempfile
import time
import os
import sys
from pathlib import Path

# Configuration
DATAFUSION_CLI = "./target/release/datafusion-cli"
DATAFUSION_CLI_MAIN = "./datafusion-cli-main"
DUCKDB_CLI = "duckdb"
NUM_RUNS = 5  # Number of times to run each benchmark

# Data generation settings
SMALL_TABLE_SIZE = 5
LARGE_TABLE_SIZE = 100_000_000
SMALL_TABLE_FILE = "small_table_uuids.parquet"
LARGE_TABLE_FILE = "large_table_uuids.parquet"


def run_command(cmd, input_sql=None, description=""):
    """Run a command and measure execution time."""
    print(f"  Running: {description}...", end=" ", flush=True)

    start = time.time()
    try:
        if input_sql:
            result = subprocess.run(
                cmd,
                input=input_sql,
                capture_output=True,
                text=True,
                timeout=600  # 10 minute timeout
            )
        else:
            result = subprocess.run(
                cmd,
                capture_output=True,
                text=True,
                timeout=600
            )

        elapsed = time.time() - start

        if result.returncode != 0:
            print(f"FAILED (exit code {result.returncode})")
            print(f"    stderr: {result.stderr}")
            return None

        print(f"{elapsed:.3f}s")
        return elapsed
    except subprocess.TimeoutExpired:
        print("TIMEOUT")
        return None
    except Exception as e:
        print(f"ERROR: {e}")
        return None


def create_data():
    """Create test data files if they don't exist."""
    if os.path.exists(SMALL_TABLE_FILE) and os.path.exists(LARGE_TABLE_FILE):
        print(f"Data files already exist, skipping creation.")
        return True

    print(f"Creating test data...")

    data_gen_sql = f"""
COPY (SELECT uuid() as k, uuid() as v FROM generate_series(1, {SMALL_TABLE_SIZE}) t(i))
TO '{SMALL_TABLE_FILE}'
OPTIONS (
  'MAX_ROW_GROUP_SIZE' '50000',
  'BLOOM_FILTER_ENABLED::k' 'true'
);

COPY (SELECT random()::text as v1, random()::text as v2, uuid() as k FROM generate_series(1, {LARGE_TABLE_SIZE}) t(i))
TO '{LARGE_TABLE_FILE}'
OPTIONS (
  'MAX_ROW_GROUP_SIZE' '50000',
  'BLOOM_FILTER_ENABLED::k' 'true'
);
"""

    result = subprocess.run(
        [DATAFUSION_CLI],
        input=data_gen_sql,
        capture_output=True,
        text=True
    )

    if result.returncode != 0:
        print(f"Failed to create data: {result.stderr}")
        return False

    print(f"Data created successfully.")
    return True


def create_datafusion_sql(inlist_size):
    """Create SQL for DataFusion with specified inlist pushdown size."""
    return f"""
CREATE EXTERNAL TABLE small_table STORED AS PARQUET LOCATION '{SMALL_TABLE_FILE}';
CREATE EXTERNAL TABLE large_table STORED AS PARQUET LOCATION '{LARGE_TABLE_FILE}';

SET datafusion.execution.parquet.pushdown_filters = true;
SET datafusion.execution.parquet.reorder_filters = true;
SET datafusion.optimizer.hash_join_inlist_pushdown_max_size = {inlist_size};
SET datafusion.runtime.metadata_cache_limit = '0M';

SELECT *
FROM small_table s JOIN large_table l ON s.k = l.k;
"""


def create_duckdb_sql():
    """Create SQL for DuckDB."""
    return f"""
SELECT *
FROM '{SMALL_TABLE_FILE}' s JOIN '{LARGE_TABLE_FILE}' l ON s.k = l.k;
"""


def run_benchmark_group(name, cmd, sql_content, num_runs=NUM_RUNS):
    """Run a benchmark group multiple times and collect results."""
    print(f"\n{name}:")
    times = []

    for i in range(num_runs):
        elapsed = run_command(cmd, input_sql=sql_content, description=f"Run {i+1}/{num_runs}")
        if elapsed is not None:
            times.append(elapsed)

    if times:
        avg = sum(times) / len(times)
        min_time = min(times)
        max_time = max(times)
        print(f"  Results: min={min_time:.3f}s, avg={avg:.3f}s, max={max_time:.3f}s")
        return times
    else:
        print(f"  No successful runs")
        return []


def main():
    print("=" * 60)
    print("DataFusion Inlist Pushdown Benchmark")
    print("=" * 60)

    # Verify executables exist
    if not os.path.exists(DATAFUSION_CLI):
        print(f"Error: {DATAFUSION_CLI} not found")
        sys.exit(1)

    if not os.path.exists(DATAFUSION_CLI_MAIN):
        print(f"Error: {DATAFUSION_CLI_MAIN} not found")
        sys.exit(1)

    try:
        subprocess.run([DUCKDB_CLI, "--version"], capture_output=True, check=True)
    except (subprocess.CalledProcessError, FileNotFoundError):
        print(f"Error: duckdb CLI not found or not working")
        sys.exit(1)

    # Create data
    if not create_data():
        sys.exit(1)

    # Run benchmarks
    results = {}

    # 1. Branch without inlist pushdown
    results["branch_no_inlist"] = run_benchmark_group(
        "Branch (no inlist, size=0)",
        [DATAFUSION_CLI],
        create_datafusion_sql(0)
    )

    # 2. Branch with inlist pushdown
    results["branch_with_inlist"] = run_benchmark_group(
        "Branch (w/ inlist, size=999999)",
        [DATAFUSION_CLI],
        create_datafusion_sql(999999)
    )

    # 3. Main branch
    results["main"] = run_benchmark_group(
        "Main branch",
        [DATAFUSION_CLI_MAIN],
        create_datafusion_sql(999999)
    )

    # 4. DuckDB
    results["duckdb"] = run_benchmark_group(
        "DuckDB",
        [DUCKDB_CLI],
        create_duckdb_sql()
    )

    # Summary
    print("\n" + "=" * 60)
    print("SUMMARY")
    print("=" * 60)

    for name, times in results.items():
        if times:
            avg = sum(times) / len(times)
            print(f"{name:25s}: {avg:.3f}s avg over {len(times)} runs")
        else:
            print(f"{name:25s}: No successful runs")

    print("\nAll times (seconds):")
    for name, times in results.items():
        if times:
            times_str = ", ".join(f"{t:.3f}" for t in times)
            print(f"  {name}: [{times_str}]")


if __name__ == "__main__":
    main()
============================================================
SUMMARY
============================================================
branch_no_inlist         : 1.010s avg over 5 runs
branch_with_inlist       : 0.290s avg over 5 runs
main                     : 1.696s avg over 5 runs
duckdb                   : 1.634s avg over 5 runs

@adriangb
Copy link
Contributor Author

❯ ./bench.sh compare main use-in-list         
Comparing main and use-in-list
--------------------
Benchmark tpch_sf10.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       main ┃ use-in-list ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  516.69 ms │   503.63 ms │     no change │
│ QQuery 2     │  101.85 ms │   126.92 ms │  1.25x slower │
│ QQuery 3     │  266.65 ms │   262.82 ms │     no change │
│ QQuery 4     │  220.78 ms │   208.05 ms │ +1.06x faster │
│ QQuery 5     │  393.63 ms │   379.17 ms │     no change │
│ QQuery 6     │  145.39 ms │   142.40 ms │     no change │
│ QQuery 7     │  542.04 ms │   557.89 ms │     no change │
│ QQuery 8     │  437.90 ms │   415.90 ms │ +1.05x faster │
│ QQuery 9     │  647.38 ms │   611.35 ms │ +1.06x faster │
│ QQuery 10    │  355.89 ms │   342.20 ms │     no change │
│ QQuery 11    │   78.78 ms │    79.20 ms │     no change │
│ QQuery 12    │  217.50 ms │   195.61 ms │ +1.11x faster │
│ QQuery 13    │  379.35 ms │   348.38 ms │ +1.09x faster │
│ QQuery 14    │  194.42 ms │   178.55 ms │ +1.09x faster │
│ QQuery 15    │  274.45 ms │   259.19 ms │ +1.06x faster │
│ QQuery 16    │   67.99 ms │    59.72 ms │ +1.14x faster │
│ QQuery 17    │  708.83 ms │   630.79 ms │ +1.12x faster │
│ QQuery 18    │ 1002.48 ms │  1055.66 ms │  1.05x slower │
│ QQuery 19    │  319.69 ms │   290.27 ms │ +1.10x faster │
│ QQuery 20    │  253.56 ms │   246.25 ms │     no change │
│ QQuery 21    │  760.56 ms │   697.09 ms │ +1.09x faster │
│ QQuery 22    │   94.52 ms │    78.84 ms │ +1.20x faster │
└──────────────┴────────────┴─────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary          ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (main)          │ 7980.32ms │
│ Total Time (use-in-list)   │ 7669.89ms │
│ Average Time (main)        │  362.74ms │
│ Average Time (use-in-list) │  348.63ms │
│ Queries Faster             │        12 │
│ Queries Slower             │         2 │
│ Queries with No Change     │         8 │
│ Queries with Failure       │         0 │
└────────────────────────────┴───────────┘

@adriangb adriangb marked this pull request as ready for review October 31, 2025 21:59
@xudong963
Copy link
Member

duckdb : 1.634s avg over 5 runs

Surprised that duckdb takes so long, I was thinking it also can push down a inlist for hash join

@adriangb
Copy link
Contributor Author

adriangb commented Nov 1, 2025

duckdb : 1.634s avg over 5 runs

Surprised that duckdb takes so long, I was thinking it also can push down a inlist for hash join

As far as I know they only do min/max stats: https://duckdb.org/2024/09/09/announcing-duckdb-110#dynamic-filter-pushdown-from-joins

@adriangb
Copy link
Contributor Author

adriangb commented Nov 2, 2025

Here are some new numbers after a couple more optimizations to InListExpr:

============================================================
SUMMARY
============================================================
branch_no_inlist         : 0.670s avg over 5 runs
branch_with_inlist       : 0.163s avg over 5 runs
main                     : 1.499s avg over 5 runs
duckdb                   : 1.386s avg over 5 runs

I'd like to clarify why the InListExpr makes such a difference:

  1. It allows some statistics pruning. The pruning expressions know how to handle InList and explode it into a chain of col = 1 OR col = 2 ... when there are less than 20 items in the list. This may have some impact but similar to the current min/max pushdown depends on properties of the data.
  2. It allows bloom filter pruning. If the above case is hit (less than 20 items) then each one of the OR cases will be checked against a bloom filter on the column if one exists.

I think it's mainly that latter optimization that provides the win.

And TPCH:

┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       main ┃ use-in-list ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  516.69 ms │   478.97 ms │ +1.08x faster │
│ QQuery 2     │  101.85 ms │    88.72 ms │ +1.15x faster │
│ QQuery 3     │  266.65 ms │   247.67 ms │ +1.08x faster │
│ QQuery 4     │  220.78 ms │   201.75 ms │ +1.09x faster │
│ QQuery 5     │  393.63 ms │   348.56 ms │ +1.13x faster │
│ QQuery 6     │  145.39 ms │   131.40 ms │ +1.11x faster │
│ QQuery 7     │  542.04 ms │   499.63 ms │ +1.08x faster │
│ QQuery 8     │  437.90 ms │   389.98 ms │ +1.12x faster │
│ QQuery 9     │  647.38 ms │   593.22 ms │ +1.09x faster │
│ QQuery 10    │  355.89 ms │   341.61 ms │     no change │
│ QQuery 11    │   78.78 ms │    70.63 ms │ +1.12x faster │
│ QQuery 12    │  217.50 ms │   195.95 ms │ +1.11x faster │
│ QQuery 13    │  379.35 ms │   338.09 ms │ +1.12x faster │
│ QQuery 14    │  194.42 ms │   176.03 ms │ +1.10x faster │
│ QQuery 15    │  274.45 ms │   246.27 ms │ +1.11x faster │
│ QQuery 16    │   67.99 ms │    62.21 ms │ +1.09x faster │
│ QQuery 17    │  708.83 ms │   648.45 ms │ +1.09x faster │
│ QQuery 18    │ 1002.48 ms │  1118.42 ms │  1.12x slower │
│ QQuery 19    │  319.69 ms │   268.59 ms │ +1.19x faster │
│ QQuery 20    │  253.56 ms │   241.12 ms │     no change │
│ QQuery 21    │  760.56 ms │   719.05 ms │ +1.06x faster │
│ QQuery 22    │   94.52 ms │    74.86 ms │ +1.26x faster │
└──────────────┴────────────┴─────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary          ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (main)          │ 7980.32ms │
│ Total Time (use-in-list)   │ 7481.19ms │
│ Average Time (main)        │  362.74ms │
│ Average Time (use-in-list) │  340.05ms │
│ Queries Faster             │        19 │
│ Queries Slower             │         1 │
│ Queries with No Change     │         2 │
│ Queries with Failure       │         0 │
└────────────────────────────┴───────────┘

@github-actions github-actions bot added the proto Related to proto crate label Nov 2, 2025
- Fix error handling in shared_bounds.rs: properly propagate errors instead of silently swallowing with .ok()
- Improve hash collision handling in InListExpr by sampling array values
- Run cargo fmt to fix formatting issues

Addresses CI feedback on PR apache#18438.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@adriangb adriangb marked this pull request as draft November 2, 2025 07:13
@adriangb
Copy link
Contributor Author

adriangb commented Nov 2, 2025

Marking as draft again. Looking at the changes while they're mostly in the right spirit there's a lot of details that need more manual care. I'll do that before marking as ready for review.

@adriangb
Copy link
Contributor Author

adriangb commented Nov 2, 2025

I've done some cleanup, the first three PRs are ready for review:

That will be most of the changes, the only two follwup PRs will be to push down the hash table references and InListExpr

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

common Related to common crate core Core DataFusion crate documentation Improvements or additions to documentation physical-expr Changes to the physical-expr crates physical-plan Changes to the physical-plan crate proto Related to proto crate sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants