feat: Introduce row index metadata column #1272

lbhm · 2025-09-09T09:53:58Z

What changes are proposed in this pull request?

This PR follows up on #1266 and adds support for reading the row index metadata column to the default engine. The implementation directly follows the approach proposed in #920 and slightly modifies it to match the new metadata column API.

Quoting from #920

Deletion vectors (and row tracking, eventually) rely on accurate file-level row indexes. But they're not implemented in the kernel's default parquet reader. That means we must rely on the position of rows in data batches returned by each read, and we cannot apply optimizations such as stats-based row group skipping (see #860).

Add row index support to the default Parquet reader, in the form of a new RowIndex variant of ReorderIndexTransform. [...] The default parquet reader recognizes (the RowIndex metadata) column and injects a transform to generate row indexes (with appropriate adjustments for any row group skipping that might occur).

Fixes #919

NOTE: If/when arrow-rs parquet reader gains native support for row indexes, e.g. apache/arrow-rs#7307, we should switch to using that. Our solution here is not robust to advanced parquet reader features like page-level skipping. row-level predicate pushdown, etc.

This PR affects the following public APIs

None - the breaking changes were introduced in #1266.

How was this change tested?

New UT.

codecov · 2025-09-09T09:57:09Z

Codecov Report

❌ Patch coverage is 80.55556% with 35 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.28%. Comparing base (459e832) to head (35cadc6).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
kernel/src/engine/arrow_utils.rs	86.48%	15 Missing and 5 partials ⚠️
kernel/src/engine/default/parquet.rs	42.85%	7 Missing and 1 partial ⚠️
kernel/src/engine/parquet_row_group_skipping.rs	69.23%	1 Missing and 3 partials ⚠️
kernel/src/engine/sync/parquet.rs	40.00%	0 Missing and 3 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1272      +/-   ##
==========================================
+ Coverage   84.23%   84.28%   +0.04%     
==========================================
  Files         111      111              
  Lines       27497    27641     +144     
  Branches    27497    27641     +144     
==========================================
+ Hits        23163    23296     +133     
- Misses       3203     3215      +12     
+ Partials     1131     1130       -1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

scovich

Approach LGTM. All the blockers are lower in the PR stack, I believe.

kernel/src/engine/arrow_utils.rs

kernel/tests/read.rs

scovich · 2025-09-12T15:02:49Z

kernel/src/engine/arrow_utils.rs

+                    Some(MetadataColumnSpec::RowIndex) => {
+                        debug!("Inserting a row index column: {}", field.name());
+                        reorder_indices.push(ReorderIndex::row_index(
+                            requested_position,
+                            Arc::new(field.try_into_arrow()?),
+                        ));
+                    }


From the other PR discussions, it seems like we have two possible approaches for resolving name collisions with metadata:

Forbid name collisions (at least vs. table schema). Note that this can still lead to name collisions at file level, because there's no rule forbidding the parquet file to have extra columns in it.

Treat metadata columns as special, and don't even look at the parquet schema when resolving them.

If we go with approach 2/, then this loop even need to check for metadata columns because the requested schema never contains them. If we go with approach 1/... not sure. Do we blow up with an "incompatible parquet file" error of some kind?

If we go with approach 2/, my understanding is that this loop can stay as is because it loops over the fields from the requested schema that couldn't be found in the parquet columns.
Since we modify match_parquet_fields to always consider metadata columns as not matched/found in #1266, this loop should be good as is?

nicklan

looks good! would just like to make the row computation optional

kernel/src/engine/default/parquet.rs

kernel/src/engine/arrow_utils.rs

kernel/src/engine/parquet_row_group_skipping.rs

lbhm · 2025-09-14T08:38:48Z

@nicklan @scovich, I made row index computation optional based on @nicklan's suggestion. Could you take another look and LMK if you agree with the approach?

Address review comments Address 2nd review comments Make row indexes optional

nicklan

lgtm! thanks, excited to have this support

zachschuermann

I think i can stamp to unblock and let's take follow-ups but main concern is lacking a test for whenever we have a predicate? (lmk if i've overlooked it)

zachschuermann · 2025-09-19T21:00:54Z

kernel/src/engine/arrow_utils.rs

+        let starting_offsets = match self.row_group_ordinals {
+            Some(ordinals) => ordinals
+                .iter()
+                .map(|i| self.row_group_row_index_ranges[*i].clone())


do we need clone?

also I think raw indexing can cause panics - can we see if we can do better here either docs for invariant or add bounds check etc.?

Good point about panicking in raw indexing! I changed this to get() to ignore invalid indices in optimized builds and added a debug_assert! that should catch improper use of RowIndexBuilder::select_row_groups (i.e., out-of-bounds indices in self.row_group_ordinals) during tests.

As for the clone: I think clone is a reasonable trade-off here, since (1) we are only cloning Ranges (i.e., two i64s) and (2) preventing the clone would require us to modify the underlying vector (costly) or designing something like a FilteredRowIndexIterator that does not collect intermediate results.

zachschuermann · 2025-09-19T21:06:54Z

kernel/src/engine/arrow_utils.rs

 pub(crate) fn fixup_parquet_read<T>(
    batch: RecordBatch,
    requested_ordering: &[ReorderIndex],
+    row_indexes: Option<&mut <RowIndexBuilder as IntoIterator>::IntoIter>,


the <Struct as Trait>::AssociatedType feels a little hard to parse (I understand writing the full type above would be worse) - wonder if we can improve just with type alias?

I introduced type FlattenedRangeIterator<T>.

zachschuermann · 2025-09-19T21:21:25Z

kernel/tests/read.rs

+    ])?);
+
+    let snapshot = Snapshot::builder_for(location).build(engine.as_ref())?;
+    let scan = snapshot.scan_builder().with_schema(schema).build()?;


do we have a test with a predicate to test the skipping logic?

I added unit tests for row group skipping in arrow_utils.rs. As for integration tests, we first need to address #860 to support predicate push-down in scans, right?

zachschuermann · 2025-09-19T21:38:52Z

I quickly vibe-coded a test to derisk merge + release. seems like everything is fine, will follow up with it shortly after release cc @nicklan

…Builder (#1334) ## What changes are proposed in this pull request? This PR follows up on #1272 and addresses open review comments. - It removes raw pointer indexing that can theoretically lead to panics. - We refactor `impl IntoIter` to a fallible `RowInderBuilder::build` method that verifies that row group ordinals are unique and within bounds. - It replaces complex type signatures with more intuitive type aliases. - It adds more unit tests for `RowIndexBuilder`. Note: We cannot add integration tests for row index reads with filter predicates before addressing #860 and implementing predicate pushdown. ## How was this change tested? New UT. --------- Co-authored-by: Zach Schuermann <[email protected]>

lbhm marked this pull request as draft September 9, 2025 09:54

github-actions bot assigned lbhm Sep 9, 2025

github-actions bot added the breaking-change Change that require a major version bump label Sep 9, 2025

lbhm marked this pull request as ready for review September 10, 2025 18:51

lbhm force-pushed the row-index-metadata-column branch from 58651a0 to b66e562 Compare September 11, 2025 07:57

lbhm requested a review from scovich September 11, 2025 07:59

scovich mentioned this pull request Sep 11, 2025

feat!: Introduce metadata column API #1266

Merged

scovich approved these changes Sep 11, 2025

View reviewed changes

lbhm force-pushed the row-index-metadata-column branch from b66e562 to 38d4963 Compare September 12, 2025 14:19

scovich reviewed Sep 12, 2025

View reviewed changes

nicklan reviewed Sep 12, 2025

View reviewed changes

kernel/src/engine/default/parquet.rs Outdated Show resolved Hide resolved

kernel/src/engine/arrow_utils.rs Outdated Show resolved Hide resolved

kernel/src/engine/parquet_row_group_skipping.rs Show resolved Hide resolved

lbhm requested review from nicklan and scovich September 14, 2025 08:38

zachschuermann self-requested a review September 16, 2025 17:34

lbhm force-pushed the row-index-metadata-column branch 3 times, most recently from 333bd45 to 8eceb51 Compare September 18, 2025 08:58

feat!: Introduce RowIndex metadata column

465784a

Address review comments Address 2nd review comments Make row indexes optional

lbhm force-pushed the row-index-metadata-column branch from 8eceb51 to 465784a Compare September 18, 2025 20:01

Merge branch 'main' into row-index-metadata-column

35cadc6

nicklan approved these changes Sep 19, 2025

View reviewed changes

zachschuermann approved these changes Sep 19, 2025

View reviewed changes

zachschuermann merged commit 9880c2c into delta-io:main Sep 19, 2025
20 of 21 checks passed

zachschuermann removed the breaking-change Change that require a major version bump label Sep 19, 2025

zachschuermann changed the title ~~feat!: Introduce row index metadata column~~ feat: Introduce row index metadata column Sep 19, 2025

This was referenced Sep 22, 2025

refactor: Remove raw pointer indexing and add unit tests for RowIndexBuilder #1334

Merged

feat: Add row index support to the default parquet reader #920

Closed

feat: Introduce row index metadata column #1272

feat: Introduce row index metadata column #1272

Uh oh!

Conversation

lbhm commented Sep 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes are proposed in this pull request?

This PR affects the following public APIs

How was this change tested?

Uh oh!

codecov bot commented Sep 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

scovich left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nicklan left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

lbhm commented Sep 14, 2025

Uh oh!

nicklan left a comment

Choose a reason for hiding this comment

Uh oh!

zachschuermann left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zachschuermann commented Sep 19, 2025

Uh oh!

Uh oh!

Uh oh!

lbhm commented Sep 9, 2025 •

edited

Loading

codecov bot commented Sep 9, 2025 •

edited

Loading