Skip to content

Conversation

coleenp
Copy link
Contributor

@coleenp coleenp commented Oct 3, 2025

This seems to pass multiple stressHierarchy test invocations with -XX:+UseParallelGC, and runtime/valhalla/inlinetypes tests. I think there's probably further investigation that needs to be done but hopefully this helps with the parallel GC test failures.
Testing with tier1-4.


Progress

  • Change must not contain extraneous whitespace

Issue

  • JDK-8368875: [lworld] UseParallelGC fails null narrow klass assertion failure (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/valhalla.git pull/1655/head:pull/1655
$ git checkout pull/1655

Update a local copy of the PR:
$ git checkout pull/1655
$ git pull https://git.openjdk.org/valhalla.git pull/1655/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 1655

View PR using the GUI difftool:
$ git pr show -t 1655

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/valhalla/pull/1655.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Oct 3, 2025

👋 Welcome back coleenp! A progress list of the required criteria for merging this PR into lworld will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Oct 3, 2025

@coleenp This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8368875: [lworld] UseParallelGC fails null narrow klass assertion failure

Reviewed-by: fparain

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 184 new commits pushed to the lworld branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the lworld branch, type /integrate in a new comment.

bool must_be_preserved() const {
return (!is_unlocked() || !has_no_hash() || (EnableValhalla && is_larval_state()));
return (!is_unlocked() || !has_no_hash() ||
(EnableValhalla && (is_larval_state() || is_inline_type() || is_flat_array() || is_null_free_array())));
Copy link
Member

@Arraying Arraying Oct 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't is_inline_type imply is_larval_state? (edit: other way around)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other way around, is_larval_state implies is_inline_type because only value classes instances can be in a larval state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. I was decoding bits in my notebook yesterday. Would rather have them spelled out explicitly in case larval_state goes away or the bits change.

Copy link
Member

@Arraying Arraying Oct 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @fparain, I mixed it up when writing it down. I knew that, I promise 😅.

I think it would be good to introduce an is_valhalla_concept (naming may vary) which ORs the larval, inline type, flat array and null free array bit masks into a single mask. That way we can check it in just one call, which is imo easier to read. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can consider this as a further enhancement. If we add valhalla bits (although I think we're out of bits), this would make sense to collect them all. I think all of this is probably optimized to one load and compare since it's inline.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_larval_state() is a status that is specific to a particular instance, and cannot be retrieved from another source, and this is why it must be preserved. is_inline_type(), is_flat_array() and is_null_free_array() are all properties than can be retrieved from the K-klass. So I'm not sure why they have been added here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug was that in parallelGC it is unsafe to retrieve the K-Klass to find these properties to re-initialize the moved object. The moved object has written over the markWord because it's not preserved. Adding the code that we must preserve the mark word because these bits matter fixes the bug because then the moved object will have these bits.

Copy link
Collaborator

@fparain fparain Oct 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replying to myself: The answer seems to be in the comment in oop.inline.hpp:98, but it would be nice to know the exact reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it would be good to have a GC expert explain this @albertnetymk ? Paul can follow up with him next week. I think it's inherent in the parallelCompaction algorithm that the klass is lost (and then found again?) at this point.


// This is for parallel gc, which doesn't always have the klass.
// markWord::must_be_preserved preserves the original prototype header bits for EnableValhalla,
// I don't know why serial gc doesn't work the same.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably find out why. I can ask around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! That would be good. The prototype_mark() function above or even init_mark shouldn't have to be conditional on EnableValhalla if the markWord is always preserved. But serial gc is very unhappy with that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear I think this should be a further investigation because this should fix a lot of ParallelGC tests that are currently failing.

@coleenp coleenp marked this pull request as ready for review October 3, 2025 17:34
@openjdk openjdk bot added ready Pull request is ready to be integrated rfr Pull request is ready for review labels Oct 3, 2025
@mlbridge
Copy link

mlbridge bot commented Oct 3, 2025

Webrevs

@coleenp
Copy link
Contributor Author

coleenp commented Oct 3, 2025

Thanks for reviewing, Fred. I think this will help with test results but still requires further research.
/integrate

@openjdk
Copy link

openjdk bot commented Oct 3, 2025

Going to push as commit db38c83.
Since your change was applied there have been 184 commits pushed to the lworld branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Oct 3, 2025
@openjdk openjdk bot closed this Oct 3, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Oct 3, 2025
@openjdk
Copy link

openjdk bot commented Oct 3, 2025

@coleenp Pushed as commit db38c83.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@coleenp coleenp deleted the stressHierarchy branch October 3, 2025 19:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

3 participants