Skip to content

Conversation

@benwtrent
Copy link
Member

Here we have a new ESAcceptDocs implementation that doesn't fully construct the underlying bit set unless absolutely necessary, allowing for approximate cost to be calculated.

Also, this adds access to the underlying BitSet, as for some usages, being able to calculate nextBitSet can be supremely helpful.

@benwtrent benwtrent requested review from iverase and pmpailis November 7, 2025 16:22
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Nov 7, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

Copy link
Contributor

@pmpailis pmpailis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; only a couple of non-blocking questions/comments


@Override
public Optional<BitSet> getBitSet() throws IOException {
return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can return an empty Optional here

Copy link
Contributor

@iverase iverase Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null indicates that we accept all documents. Empty Optional indicates there are some filtered documents but cannot be represented as a BitSet efficiently. So null is ok here.

if (bits instanceof BitSet bitSet) {
return new BitSetIterator(bitSet, maxDoc);
}
return new FilteredDocIdSetIterator(DocIdSetIterator.all(maxDoc)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shortcut here is because we assume that the caller has ensured thatliveDocs will not be null ?

@Override
public Optional<BitSet> getBitSet() {
if (bits == null) {
return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as with the other getBitSet maybe we could return an empty optional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its a trinary expression on purpose.

  • null == all
  • Empty == not efficiently a bitset
  • Present == efficiently a bitset

return new BitSetIterator(acceptBitSet, cardinality);
}
return liveDocs == null
? scorerSupplier.get(NO_MORE_DOCS).iterator()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we use NO_MORE_DOCS here instead of maxDoc ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its the typical pattern. It simply indicates "Hey iterator, we have no ideas on the best skipping pattern, just assume we will iterate everything"

@benwtrent benwtrent added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Nov 11, 2025
@elasticsearchmachine elasticsearchmachine merged commit 6b63b13 into elastic:main Nov 12, 2025
34 checks passed
@benwtrent benwtrent deleted the add-es-accept-docs branch November 12, 2025 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >non-issue :Search Relevance/Vectors Vector search Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants