Skip to content

Conversation

no-defun-allowed
Copy link
Collaborator

This PR adds regions to the Compressor. The benefits are that the collector compacts each region in parallel, and the collector supports the use of discontiguous heaps (as required for compressed oops in OpenJDK).

@no-defun-allowed
Copy link
Collaborator Author

Two issues come to mind:

In the original PR for the Compressor, I modified the CI script to skip some tests on garbage collectors which don't support discontiguous heaps (i.e. only the Compressor). There are no such collectors now, so should I remove that special casing?

I added a RegionPageResource to allow the BumpAllocator to allocate from a region-structured heap. All allocations take a lock, which I'm somewhat unhappy with, as other page resources appear to be lock-free until they need to grow the heap. But the Compressor still has the best mutator time on DaCapo, so I'm not sure if it is worthwhile to remove the lock.

@qinsoon
Copy link
Member

qinsoon commented Aug 20, 2025

Two issues come to mind:

In the original PR for the Compressor, I modified the CI script to skip some tests on garbage collectors which don't support discontiguous heaps (i.e. only the Compressor). There are no such collectors now, so should I remove that special casing?

Yeah. If no other code needs 'discontiguous', preferably it can be cleaned up.

I added a RegionPageResource to allow the BumpAllocator to allocate from a region-structured heap. All allocations take a lock, which I'm somewhat unhappy with, as other page resources appear to be lock-free until they need to grow the heap. But the Compressor still has the best mutator time on DaCapo, so I'm not sure if it is worthwhile to remove the lock.

If there is no measurable performance issue, I would think it is fine for now.

Just to clarify, how is a RegionPageResource different from other existing page resources? To my understanding, it returns the required arbitrary number of pages to the space, but it 'internally' organizes all the used memory as regions of the same sizes. BlockPageResource only returns blocks (which are regions of the same sizes), and other page resource does not organize used memory as regions of the same sizes. Is this understanding correct?

@no-defun-allowed
Copy link
Collaborator Author

no-defun-allowed commented Aug 21, 2025

Just to clarify, how is a RegionPageResource different from other existing page resources?

I think your understanding is right. To elaborate, the RegionPageResource structures the heap into regions, and has a bump allocator for each region. The RegionPageResource can handle requests for any number of pages smaller than a region; but requests probably should be smaller than the region size by a decent margin, to avoid fragmentation. The regions in the Compressor are 1MiB, though that size was picked arbitrarily, and I don't know if that size is a good size for any idea of "good".

RegionPageResource allows the GC design to assume that objects never span multiple regions when they are allocated, which is necessary to compact each region separately; and that the objects in a region will be contiguously allocated, which we utilise in only scanning for objects between the start and allocation cursor of a region (though I haven't measured how much this matters). The Compressor and RegionPageResource together also maintain allocation order in each region.

@no-defun-allowed no-defun-allowed marked this pull request as ready for review August 22, 2025 06:18
@qinsoon qinsoon self-requested a review August 25, 2025 05:29
@qinsoon qinsoon added the PR-extended-testing Run extended tests for the pull request label Aug 25, 2025
@qinsoon
Copy link
Member

qinsoon commented Aug 25, 2025

Just out of curiosity, what is performance like for this PR? It makes the offset calculation and the actual copying parallelized. I would expect it to improve STW time.

/// A region in a RegionPageResource and its allocation cursor.
pub struct RegionAllocator<R: Region> {
pub region: R,
cursor: AtomicUsize,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure Atomic<Address> also works. It'll simplify the back-and-forth you do with usize. Though hopefully Atomic<Address> doesn't do anything silly on other platforms.

@no-defun-allowed
Copy link
Collaborator Author

Just out of curiosity, what is performance like for this PR? It makes the offset calculation and the actual copying parallelized. I would expect it to improve STW time.

I don't have apples-to-apples results handy - I have benchmark results with compressed oops for the regional Compressor, but none without compressed oops.

@k-sareen
Copy link
Collaborator

Just out of curiosity, what is performance like for this PR? It makes the offset calculation and the actual copying parallelized. I would expect it to improve STW time.

I don't have apples-to-apples results handy - I have benchmark results with compressed oops for the regional Compressor, but none without compressed oops.

The comparisons against SemiSpace, Immix, etc. would still be interesting to see. I don't think Yi would have seen those results.

@no-defun-allowed
Copy link
Collaborator Author

no-defun-allowed commented Aug 26, 2025

Just out of curiosity, what is performance like for this PR? It makes the offset calculation and the actual copying parallelized. I would expect it to improve STW time.

I don't have apples-to-apples results handy - I have benchmark results with compressed oops for the regional Compressor, but none without compressed oops.

The comparisons against SemiSpace, Immix, etc. would still be interesting to see. I don't think Yi would have seen those results.

Right. I have such comparisons on plotty: http://squirrel.anu.edu.au/plotty/hayleyp/plots/p/CDBtDw

(Edited because I put the wrong logs in - I re-benchmarked with the minheaps for Compressor, as those are the smallest.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR-extended-testing Run extended tests for the pull request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants