Skip to content

Conversation

Shubham8287
Copy link
Contributor

@Shubham8287 Shubham8287 commented Aug 29, 2025

Description of Changes

Current implementation of drop_table have few problems :

  1. Table rows are not deleted, which means blob_store bytes are never reclaimed.
  2. As a result, the generated TxData does not include deleted rows, so the commit log replay logic never drops the in-memory commit table.

Changes:
Make MutTx::drop_table to not drop commit table but only delete rows (using delete_tables) inside it. Later, let CommitState::merge to detect dropped tables by anaylyzing st_table rows deletions and remove commit tables accordingly.

This change fixes above two problems. However, we still need an additional change to drop empty tables during replay (WIP).

Expected complexity level and risk

3, small change but it is in hotpath.

Testing

  • [x ] Modified current unit test to cover table drop case.
  • Planning to write smoketest for testing replay.

@Shubham8287 Shubham8287 marked this pull request as draft August 29, 2025 15:01
@Shubham8287 Shubham8287 self-assigned this Aug 29, 2025
@Shubham8287 Shubham8287 force-pushed the shub/fix-drop-table branch 3 times, most recently from 4e25272 to 19a8706 Compare August 29, 2025 15:22
@Shubham8287 Shubham8287 changed the title fix drop_tables fix: Drop table Aug 29, 2025
@Shubham8287 Shubham8287 force-pushed the shub/fix-drop-table branch 2 times, most recently from e17509a to 4d1166a Compare August 29, 2025 15:34
Copy link
Contributor

@gefjon gefjon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am somewhat unhappy with this solution. It seems simpler to keep the PendingSchemaChange machinery for handling drop_table, and to do one of:

  • Require that the table be empty at the time of MutTx::drop_table.
  • Delete all the rows during MutTx::drop_table, before making the change to the in-memory structures.
  • Visit the pending schema change in CommittedState::merge and delete all the rows then.

// After applying `merge_apply_deletes` and `merge_apply_inserts`,
// any tables no longer referenced by `st_table` should be
// removed from the committed state.
let mut tables_to_drop = thin_vec::ThinVec::<TableId>::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use ThinVec here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing the idea is to keep the footprint small as it will be empty most of the time. However, as we never put this into the heap, it doesn't matter that we use 16 bytes more on the stack, so a regular Vec seems better.

// First, apply deletes. This will free up space in the committed tables.
self.merge_apply_deletes(&mut tx_data, tx_state.delete_tables);
self.merge_apply_deletes(&mut tx_data, &mut tables_to_drop, tx_state.delete_tables);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.merge_apply_deletes(&mut tx_data, &mut tables_to_drop, tx_state.delete_tables);
// Any rows deleted from `st_table` will be noted in `tables_to_drop`.
self.merge_apply_deletes(&mut tx_data, &mut tables_to_drop, tx_state.delete_tables);

I don't actually know if this is true (this is the first part of my review), it's just my assumption about what's going on here.


// Then, apply inserts. This will re-fill the holes freed by deletions
// before allocating new pages.
self.merge_apply_inserts(&mut tx_data, tx_state.insert_tables, tx_state.blob_store);
self.merge_apply_inserts(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.merge_apply_inserts(
// Any rows inserted into `st_table` will be removed from `tables_to_drop`.
// This allows us to avoid incorrectly dropping tables whose `st_table` rows have been "updated,"
// i.e. replaced with a new version.
self.merge_apply_inserts(

I don't actually know if this is true (this is the first part of my review), it's just my assumption about what's going on here.

Comment on lines -357 to -363
self.tx_state.delete_tables.remove(&table_id);
let commit_table = self
.committed_state_write_lock
.tables
.remove(&table_id)
.expect("there should be a schema in the committed state if we reach here");
self.push_schema_change(PendingSchemaChange::TableRemoved(table_id, commit_table));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're removing this code, you need to update the comment about how the commit table isn't eagerly deleted, but what code does delete it, and how queries later in the transaction avoid reading from it.

@Shubham8287
Copy link
Contributor Author

  • Require that the table be empty at the time of MutTx::drop_table.
  • Delete all the rows during MutTx::drop_table, before making the change to the in-memory structures.

fundamental problem with removing in-memory Table in MutTx::drop_table is that, CommittedState::merge expects Commit Table to be present to actually delete rows.

It seems simpler to keep the PendingSchemaChange machinery for handling drop_table.

I feel PendingSchemaChange is great for reversing changes that doesn't go in commitlog, but transaction semantic already take care to rollback/commit table rows and responsible for writting commitlog.

@gefjon
Copy link
Contributor

gefjon commented Aug 29, 2025

fundamental problem with removing in-memory Table in MutTx::drop_table is that, CommittedState::merge expects Commit Table to be present to actually delete rows.

This doesn't sound like a fundamental problem. In fact it sounds very solvable. You could either:

  • Assert that a dropped table's delete table is empty, and fail the merge if it is not. This conflicts with my suggestion to require that all the rows be explicitly deleted prior to the MutTx::drop_table call; instead you would have MutTx::drop_table also clear the delete table.
  • Add the set of dropped tables as an additional argument to CommittedState::merge, and fall back to checking there if the committed state does not contain the table for a given delete.

@Shubham8287
Copy link
Contributor Author

  • Add the set of dropped tables as an additional argument to CommittedState::merge, and fall back to checking there if the committed state does not contain the table for a given delete.

Yes, I think we can do this to make it work, but I can't say how is this better than not deleting comit table during MutTx::drop_table and anaylyzing st_table changes during merge to decide on dropped tables.

@Shubham8287
Copy link
Contributor Author

Closing in favor of - #3214

@Shubham8287 Shubham8287 closed this Sep 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants