-
Notifications
You must be signed in to change notification settings - Fork 123
8368875: [lworld] UseParallelGC fails null narrow klass assertion failure #1655
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Welcome back coleenp! A progress list of the required criteria for merging this PR into |
@coleenp This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 184 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
bool must_be_preserved() const { | ||
return (!is_unlocked() || !has_no_hash() || (EnableValhalla && is_larval_state())); | ||
return (!is_unlocked() || !has_no_hash() || | ||
(EnableValhalla && (is_larval_state() || is_inline_type() || is_flat_array() || is_null_free_array()))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't is_inline_type
imply is_larval_state
? (edit: other way around)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other way around, is_larval_state
implies is_inline_type
because only value classes instances can be in a larval state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. I was decoding bits in my notebook yesterday. Would rather have them spelled out explicitly in case larval_state goes away or the bits change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @fparain, I mixed it up when writing it down. I knew that, I promise 😅.
I think it would be good to introduce an is_valhalla_concept
(naming may vary) which ORs the larval, inline type, flat array and null free array bit masks into a single mask. That way we can check it in just one call, which is imo easier to read. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can consider this as a further enhancement. If we add valhalla bits (although I think we're out of bits), this would make sense to collect them all. I think all of this is probably optimized to one load and compare since it's inline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_larval_state() is a status that is specific to a particular instance, and cannot be retrieved from another source, and this is why it must be preserved. is_inline_type(), is_flat_array() and is_null_free_array() are all properties than can be retrieved from the K-klass. So I'm not sure why they have been added here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bug was that in parallelGC it is unsafe to retrieve the K-Klass to find these properties to re-initialize the moved object. The moved object has written over the markWord because it's not preserved. Adding the code that we must preserve the mark word because these bits matter fixes the bug because then the moved object will have these bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replying to myself: The answer seems to be in the comment in oop.inline.hpp:98, but it would be nice to know the exact reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, it would be good to have a GC expert explain this @albertnetymk ? Paul can follow up with him next week. I think it's inherent in the parallelCompaction algorithm that the klass is lost (and then found again?) at this point.
|
||
// This is for parallel gc, which doesn't always have the klass. | ||
// markWord::must_be_preserved preserves the original prototype header bits for EnableValhalla, | ||
// I don't know why serial gc doesn't work the same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably find out why. I can ask around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! That would be good. The prototype_mark() function above or even init_mark shouldn't have to be conditional on EnableValhalla if the markWord is always preserved. But serial gc is very unhappy with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear I think this should be a further investigation because this should fix a lot of ParallelGC tests that are currently failing.
Thanks for reviewing, Fred. I think this will help with test results but still requires further research. |
Going to push as commit db38c83.
Your commit was automatically rebased without conflicts. |
This seems to pass multiple stressHierarchy test invocations with -XX:+UseParallelGC, and runtime/valhalla/inlinetypes tests. I think there's probably further investigation that needs to be done but hopefully this helps with the parallel GC test failures.
Testing with tier1-4.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/valhalla.git pull/1655/head:pull/1655
$ git checkout pull/1655
Update a local copy of the PR:
$ git checkout pull/1655
$ git pull https://git.openjdk.org/valhalla.git pull/1655/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 1655
View PR using the GUI difftool:
$ git pr show -t 1655
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/valhalla/pull/1655.diff
Using Webrev
Link to Webrev Comment