-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Adds new ESAcceptDocs class and usage, allowing for future use in knn searching #137750
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds new ESAcceptDocs class and usage, allowing for future use in knn searching #137750
Conversation
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
pmpailis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM; only a couple of non-blocking questions/comments
|
|
||
| @Override | ||
| public Optional<BitSet> getBitSet() throws IOException { | ||
| return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can return an empty Optional here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
null indicates that we accept all documents. Empty Optional indicates there are some filtered documents but cannot be represented as a BitSet efficiently. So null is ok here.
| if (bits instanceof BitSet bitSet) { | ||
| return new BitSetIterator(bitSet, maxDoc); | ||
| } | ||
| return new FilteredDocIdSetIterator(DocIdSetIterator.all(maxDoc)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shortcut here is because we assume that the caller has ensured thatliveDocs will not be null ?
| @Override | ||
| public Optional<BitSet> getBitSet() { | ||
| if (bits == null) { | ||
| return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as with the other getBitSet maybe we could return an empty optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its a trinary expression on purpose.
- null == all
- Empty == not efficiently a bitset
- Present == efficiently a bitset
| return new BitSetIterator(acceptBitSet, cardinality); | ||
| } | ||
| return liveDocs == null | ||
| ? scorerSupplier.get(NO_MORE_DOCS).iterator() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we use NO_MORE_DOCS here instead of maxDoc ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its the typical pattern. It simply indicates "Hey iterator, we have no ideas on the best skipping pattern, just assume we will iterate everything"
Here we have a new ESAcceptDocs implementation that doesn't fully construct the underlying bit set unless absolutely necessary, allowing for approximate cost to be calculated.
Also, this adds access to the underlying BitSet, as for some usages, being able to calculate
nextBitSetcan be supremely helpful.