IndexedDB: split media content and media metadata into separate object stores in `MediaStore` #5795

mgoldenberg · 2025-10-21T02:58:18Z

Background

This pull request is part of a series of pull requests to add a full IndexedDB implementation of the EventCacheStore and MediaStore (see #4617, #4996, #5090, #5138, #5226, #5274, #5343, #5384, #5406, #5414, #5497, #5506, #5540, #5574, #5603, #5676, #5682, #5749). This particular pull request changes the schema for how media is stored by splitting content and metadata into separate object stores.

The reason for this change is that IndexedDB does not allow partial updates to existing objects. So, when content and metadata are stored in the same object store, updating a single metadata field requires deserializing and re-serializing the entire object, including content that may be very large (e.g., an image or a document).

Changes

Adding separate stores for content and metadata

The overarching change is to add two new object stores, one for content and one for metadata.

`MediaContent`

The content store is very straightforward and simply maps an identifier (u64) to the content, illustrated in the type below.

pub struct MediaContent {
    pub id: u64,
    pub data: Vec<u8>,
}

When storing a MediaContent object in IndexedDB, one must find an unused u64. Currently, this is accomplished by querying the object store for the largest identifier, incrementing it, and then using it as the identifier for the desired object.

Note that IndexedDB does offer auto-incrementing keys; however, it's not clear if it's possible to retrieve the generated key upon insertion into the database via indexed_db_futures. So, one must ultimately query the database to get access to the key in a similar fashion to that described above.
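The identifier allocation described above can be sketched with an in-memory ordered map standing in for the object store. This is only an illustration under stated assumptions: `ContentStore`, `next_id`, and `add` are hypothetical names, not the functions added in this pull request, and a `BTreeMap` replaces the real IndexedDB store.

```rust
use std::collections::BTreeMap;

/// Mirrors the `MediaContent` type shown above.
#[derive(Debug, Clone, PartialEq)]
pub struct MediaContent {
    pub id: u64,
    pub data: Vec<u8>,
}

/// Hypothetical in-memory stand-in for the content object store, ordered by id.
#[derive(Default)]
pub struct ContentStore {
    records: BTreeMap<u64, MediaContent>,
}

impl ContentStore {
    /// Looks up the largest identifier in the store and returns its successor.
    /// An empty store starts at 0; `None` means the u64 space is exhausted.
    pub fn next_id(&self) -> Option<u64> {
        match self.records.keys().next_back() {
            Some(max) => max.checked_add(1),
            None => Some(0),
        }
    }

    /// Stores `data` under a fresh identifier and returns that identifier.
    pub fn add(&mut self, data: Vec<u8>) -> Option<u64> {
        let id = self.next_id()?;
        self.records.insert(id, MediaContent { id, data });
        Some(id)
    }
}
```

In a real store, the lookup and the insert would presumably need to share one transaction so that two writers cannot pick the same identifier.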

`MediaMetadata`

The metadata store is almost identical to the original media store, but contains some additional information for tracking the identifier and the size of the MediaContent, as illustrated in the type below.

pub struct MediaMetadata {
    pub request_parameters: MediaRequestParameters,
    pub last_access: UnixTime,
    pub ignore_policy: IgnoreMediaRetentionPolicy,
    pub content_id: u64,
    pub content_size: usize,
}

When storing a MediaMetadata object in IndexedDB, one must first store the MediaContent. Once the MediaContent is stored, one can determine its identifier and its encoded size, which are then used to populate MediaMetadata::content_id and MediaMetadata::content_size.

Note that this means that retrieving MediaContent via MediaRequestParameters requires two steps.

1. Using MediaRequestParameters to retrieve MediaMetadata
2. Using MediaMetadata::content_id to retrieve MediaContent
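A minimal sketch of the store-then-link write and the two-step read, using in-memory maps. The names `SplitStore` and `RequestKey` are hypothetical, the request parameters are reduced to a string key, and `content_size` is taken as the raw byte length rather than the encoded size the pull request refers to.

```rust
use std::collections::HashMap;

/// Assumption: request parameters reduced to a plain string key.
type RequestKey = String;

pub struct MediaMetadata {
    pub last_access: u64,
    pub content_id: u64,
    pub content_size: usize,
}

/// Hypothetical stand-in for the two object stores.
#[derive(Default)]
pub struct SplitStore {
    metadata: HashMap<RequestKey, MediaMetadata>,
    content: HashMap<u64, Vec<u8>>,
    next_content_id: u64,
}

impl SplitStore {
    /// Writes the content first, then records its id and size in the metadata.
    pub fn add(&mut self, key: &str, data: Vec<u8>, now: u64) {
        let content_id = self.next_content_id;
        self.next_content_id += 1;
        let content_size = data.len();
        self.content.insert(content_id, data);
        let metadata = MediaMetadata { last_access: now, content_id, content_size };
        // If the key already existed, drop the content it used to point at.
        if let Some(old) = self.metadata.insert(key.to_owned(), metadata) {
            self.content.remove(&old.content_id);
        }
    }

    /// Step 1: request key -> metadata. Step 2: content_id -> content.
    pub fn get(&self, key: &str) -> Option<&[u8]> {
        let metadata = self.metadata.get(key)?;
        self.content.get(&metadata.content_id).map(Vec::as_slice)
    }
}
```

Dropping the old content when a key is overwritten is a choice made for this sketch; whether the actual implementation does the same is not stated here.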

Removing original media store

After the two object stores above were created and the implementations of various functions were updated to use those object stores, the original media store and its associated types were removed. There is one exception, which is that the top-level Media type was kept in place, as it proved to be a useful top-level abstraction.

Tradeoffs

Improvements

These changes offer significant improvements on the following operations.

- MediaStore::replace_media_key - changes request parameters - i.e., primary key - of media
  - Before: read and write media metadata and media content
  - After: read and write only media metadata
- MediaStore::get_media_content - retrieves media by request parameters and sets last access time
  - Before: read and write media metadata and media content
  - After: read media metadata and media content, write media metadata
- MediaStore::get_media_content_for_uri - retrieves media by URI and sets the last access time
  - Before: read and write media metadata and media content
  - After: read media metadata and media content, write media metadata
- MediaStore::set_ignore_media_retention_policy - sets whether to ignore media retention policy
  - Before: read and write media metadata and media content
  - After: read and write only media metadata
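To illustrate why the metadata-only operations get cheaper, here is a rough sketch in which re-keying a record and flipping its retention flag never touch the content bytes. `MetadataRecord`, `MetadataStore`, and both methods are hypothetical simplifications, not the crate's actual API.

```rust
use std::collections::HashMap;

/// Simplified metadata record; the content is referenced by id only.
#[derive(Debug, Clone, PartialEq)]
pub struct MetadataRecord {
    pub last_access: u64,
    pub ignore_policy: bool,
    pub content_id: u64,
}

/// Hypothetical stand-in for the metadata object store.
#[derive(Default)]
pub struct MetadataStore {
    records: HashMap<String, MetadataRecord>,
}

impl MetadataStore {
    /// Moves a record to a new key. The `content_id` is carried over
    /// unchanged, so the content store is neither read nor written.
    pub fn replace_key(&mut self, from: &str, to: &str) -> bool {
        match self.records.remove(from) {
            Some(record) => {
                self.records.insert(to.to_owned(), record);
                true
            }
            None => false,
        }
    }

    /// Updates one metadata field in place; only the small record is rewritten.
    pub fn set_ignore_policy(&mut self, key: &str, ignore: bool) -> bool {
        match self.records.get_mut(key) {
            Some(record) => {
                record.ignore_policy = ignore;
                true
            }
            None => false,
        }
    }
}
```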

Penalties

On the other hand, there are also some penalties due to the updated schema.

- MediaStore::add_media_content - adds media
  - Before: write media in one operation
  - After: write media in three operations (read metadata, write content, write metadata)
- MediaStore::get_media_content - retrieves media by request parameters and sets last access time
  - Before: read and write media in two operations (read media, write media)
  - After: read and write media in three operations (read metadata, read content, write metadata)
- MediaStore::remove_media_content - removes media by request parameters
  - Before: remove media in a single operation
  - After: remove media in three operations (read metadata, remove content, remove metadata)
- MediaStore::get_media_content_for_uri - retrieves media by URI and sets the last access time
  - Before: read and write media in two operations (read media, write media)
  - After: read and write media in three operations (read metadata, read content, write metadata)
- MediaStore::remove_media_content_for_uri - removes media by URI
  - Before: remove media in a single operation
  - After: remove media in three operations (read metadata, remove content, remove metadata)
- MediaStore::clean - clean store by removing oversized and old media
  - Before: remove media ranges in a single operation
  - After: remove media ranges in many operations (read metadata, remove content, remove metadata)
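The extra work in the cleaning path can be sketched as follows: with split stores, each candidate must first be found through its metadata before its content and metadata are removed separately. `CleanableStore`, `CleanMetadata`, and the two-parameter policy (a size limit and an age cutoff) are assumptions made for illustration; the real retention policy may be more involved.

```rust
use std::collections::HashMap;

pub struct CleanMetadata {
    pub last_access: u64,
    pub content_id: u64,
    pub content_size: usize,
}

/// Hypothetical stand-in for the two object stores.
#[derive(Default)]
pub struct CleanableStore {
    pub metadata: HashMap<String, CleanMetadata>,
    pub content: HashMap<u64, Vec<u8>>,
}

impl CleanableStore {
    /// Removes every entry larger than `max_size` or last accessed before
    /// `cutoff`, and returns how many entries were removed.
    pub fn clean(&mut self, max_size: usize, cutoff: u64) -> usize {
        // Read metadata to find the entries to evict and their content ids.
        let evicted: Vec<(String, u64)> = self
            .metadata
            .iter()
            .filter(|(_, m)| m.content_size > max_size || m.last_access < cutoff)
            .map(|(key, m)| (key.clone(), m.content_id))
            .collect();
        // Remove content and metadata one entry at a time.
        for (key, content_id) in &evicted {
            self.content.remove(content_id);
            self.metadata.remove(key);
        }
        evicted.len()
    }
}
```

The per-entry loop is where a single ranged delete in the old schema turns into many operations, which is the part benchmarking would most usefully measure.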

Conclusions

My feeling is that this implementation is an improvement overall. That being said, some benchmarking would offer a greater degree of confidence that MediaStore::clean has not deteriorated significantly. If this is desired, we can pursue this to get a better sense of the penalty.

In any case, I don't think it would be wise to return to a single object store, but perhaps there is some way to improve upon the split object stores.

Future Work

- Refactor feature flags
  - The current feature flags are a bit convoluted and could be simplified and made more modular
- Expose EventCacheStore and MediaStore outside of the matrix-sdk-indexeddb crate

Public API changes documented in changelogs (optional)

Signed-off-by: Michael Goldenberg [email protected]

codspeed-hq · 2025-10-21T03:18:34Z

CodSpeed Performance Report

Merging #5795 will not alter performance

Comparing mgoldenberg:indexeddb-media-store-separate-metadata-and-content (175bdc7) with main (430304f)

Summary

✅ 50 untouched

codecov · 2025-10-21T03:21:09Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.45%. Comparing base (430304f) to head (175bdc7).
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #5795      +/-   ##
==========================================
- Coverage   88.45%   88.45%   -0.01%     
==========================================
  Files         360      360              
  Lines      100328   100328              
  Branches   100328   100328              
==========================================
- Hits        88749    88745       -4     
- Misses       7413     7418       +5     
+ Partials     4166     4165       -1

poljar

This looks pretty good, thanks for splitting this up into many small commits.

I wonder what you think about utilizing a different type for the IDs.

Besides that I left a couple of nits about consistent naming of the content ID.

crates/matrix-sdk-indexeddb/src/media_store/types.rs

crates/matrix-sdk-indexeddb/src/media_store/serializer/indexed_types.rs

crates/matrix-sdk-indexeddb/src/transaction/mod.rs

poljar · 2025-10-24T13:11:41Z

Oh, another thing. Please add a changelog entry.

mgoldenberg · 2025-10-24T14:19:58Z

> Oh, another thing. Please add a changelog entry.

So, I haven't left changelog entries on any of the pull requests in this series because none of this code is accessible outside of the crate at the moment. Have I been thinking about this correctly? I am happy to add a changelog, just want to make sure we're on the same page.

poljar · 2025-10-24T14:28:18Z

> > Oh, another thing. Please add a changelog entry.
>
> So, I haven't left changelog entries on any of the pull requests in this series because none of this code is accessible outside of the crate at the moment. Have I been thinking about this correctly? I am happy to add a changelog, just want to make sure we're on the same page.

This should improve the performance of the media access. As such it sounds like it would be a nice thing to announce to our users.

mgoldenberg · 2025-10-24T17:26:17Z

@poljar: I added a changelog entry, but this is now in conflict with main. Mind if I rebase and force push? I can also squash the fixup commit above.

poljar · 2025-10-24T18:24:08Z

> @poljar: I added a changelog entry, but this is now in conflict with main. Mind if I rebase and force push? I can also squash the fixup commit above.

Please wait with rebasing till we're done with the review phase.

mgoldenberg · 2025-10-24T19:10:43Z

Okay, tests don't seem to run when there is a conflict, so figured it might be worth doing it before.

mgoldenberg added 30 commits October 20, 2025 18:51

refactor(indexeddb): add migrations for media content store

a4deb69

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add type for tracking media content and associated id

58f70a2

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): remove indexed media content type synonym

3817419

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add indexed types for media content

06f21c3

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add transaction fns for basic media content operations

3e91cd4

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add transaction fn for getting max key in range

fb921fb

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add constant for representing safe bounds of u64

8c32545

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add key bounds for media content id key

035895f

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add transaction fns for getting the next available media content id

5c779f4

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): flatten nested media metadata into media type

585888b

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add content id and content size to media metadata

d6cc671

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add migrations for media metadata store

a0ebb5e

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add indexed types and keys for media metadata

8d4ccf3

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add transaction fns for getting media metadata

1d84c61

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add transaction fns for add/putting media metadata

5744f3a

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add transaction fns for deleting media metadata

52e54d1

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): return indexed type and js value from indexed type serializer

60b821a

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): return indexed type from Transaction::add_item and its derivatives

f0e5cd6

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): return indexed type from Transaction::{put_item,put_item_if} and its derivatives

883a1da

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add fn for prefixed key ranges from existing key ranges

038ab57

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add type synonym for content id in indexed media content key

4d6414e

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): remove unused type synonym

db4bec8

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add constants for media content id bounds

b877c05

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add content id to media metadata keys

212fdc8

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): rename transaction fn for getting all media metadata keys

ccb53d9

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): add transaction fns for getting media metadata keys by index

6430c45

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): implement specialized fn for getting media metadata keys via generalized fn

9909a70

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): re-implement media-related fns in terms of media metadata and media content stores

96fc2f4

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): simplify error type for media metadata impl of indexed

05aaf6b

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): remove media object store and associated types

85b8142

Signed-off-by: Michael Goldenberg <[email protected]>

mgoldenberg added 2 commits October 20, 2025 22:12

refactor(indexeddb): remove (de)serialization functionality from top-level media type

c4f676a

Signed-off-by: Michael Goldenberg <[email protected]>

doc(indexeddb): fix typos in documentation

175bdc7

Signed-off-by: Michael Goldenberg <[email protected]>

mgoldenberg marked this pull request as ready for review October 21, 2025 03:25

mgoldenberg requested a review from a team as a code owner October 21, 2025 03:25

mgoldenberg requested review from poljar and removed request for a team October 21, 2025 03:25

poljar requested changes Oct 24, 2025

mgoldenberg added 4 commits October 24, 2025 12:51

refactor(indexeddb): rename MediaContent::id -> MediaContent::content_id

1283f04

Signed-off-by: Michael Goldenberg <[email protected]>

refactor(indexeddb): use UUID instead of u64 as media content id

c1b5acb

Signed-off-by: Michael Goldenberg <[email protected]>

fixup! refactor(indexeddb): add indexed types for media content

913332f

Signed-off-by: Michael Goldenberg <[email protected]>

doc(indexeddb): add changelog entry for separating media content and metadata in IndexedDB

1797018

Signed-off-by: Michael Goldenberg <[email protected]>

mgoldenberg requested a review from poljar October 24, 2025 17:26
