-
Notifications
You must be signed in to change notification settings - Fork 8
Use of generators to reduce memory usage via a LazyMultiSet #82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Use of generators to reduce memory usage via a LazyMultiSet #82
Conversation
…sing Co-authored-by: sam.willis <[email protected]>
…ility Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
…ance Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
Co-authored-by: sam.willis <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM implementation-wise. However, i do think we should carefully benchmark performance and memory usage before merging as it does complicate the code quite a bit so i want to be 100% sure this is the way to go.
return new MultiSet(result) | ||
} | ||
|
||
*lazyJoin<V2>(other: Index<K, V2>): Generator<[[K, [V, V2]], number], void, unknown> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove the duplication between both branches like i did here: ac24ff0
const consolidated = new MultiSet(values).consolidate() | ||
const sortedValues = consolidated | ||
.getInner() | ||
const consolidated = LazyMultiSet.fromArray(values).consolidate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why creating a lazy multiset to then consolidate it (which materializes it) to then immediately copy it into an array. Seems like a lot of overhead for something we could do by reducing once over the original array (i.e. loop over values
and consolidate it into an array of consolidated values).
}) | ||
}) | ||
|
||
it('should consolidate correctly', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also check the corner case where the multiplicities for a certain value cancel out and that the result does not contain the value.
Absolutely agree, stack traces and and performance traces of individual operators will likely get more complex. the key things I want is to check:
The latter will be hard to measure as it not guarantied to happen during execution. We need to do manual testing with performance traces and inspect what happens. Likely need to do this in the browser. Peak mem is a good indication of what GC needs to be done relative to the two versions. |
Also converts almost all operators to use them wherever possible.
We should benchmark this against main to see how what it does with the memory usage and allocations.