
Conversation

Member

@qqmyers qqmyers commented Sep 16, 2025

What this PR does / why we need it: When a user is given a role on the root collection (or has that role removed), we reindex all the datasets and files within it. Prior to this PR, that was all done in one transaction. This PR breaks the operation into multiple batches in separate transactions, which reduces the time and memory needed to reindex the permissions.

At QDR, on a ~low-memory test machine, reindexing after a role assignment on root was taking ~5-6 minutes. After the PR, reindexing takes ~30-35 seconds. (We have also seen the server slow to a crawl doing this permission reindex (at least 10 times slower) when, presumably, other issues have caused very low memory conditions. We're hopeful that batching the work here will also lower the overall memory requirements for reindexing perms on root, so that it contributes less to, and is less affected by, very low memory conditions.)

Which issue(s) this PR closes:

I thought there was an open issue, but I've only found #10697, #10698.

  • Closes #

Special notes for your reviewer: By default, marking a method as requiring a new transaction does nothing if the method is called from within the same bean. However, by "self-injecting" the bean into itself and then calling the method on the injected proxy, the transaction attribute is honored. I've used this technique here (suggested by AI). The alternative is to break the bean into parts and make calls between them.
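The proxy mechanics behind this can be illustrated outside a container. The sketch below is illustrative only (not Dataverse code; all class and method names are hypothetical): a `java.lang.reflect.Proxy` plays the role of the EJB container's interceptor. A call made through `this` bypasses the interceptor entirely, while a call made through the self-injected proxy reference triggers it, which is why `REQUIRES_NEW` only takes effect on proxied calls.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical interface standing in for the indexing bean's business interface.
interface IndexService {
    void reindexAll();
    void reindexBatch(int batch);
}

class IndexServiceImpl implements IndexService {
    // In a real bean this field would be @EJB self-injected by the container;
    // here we wire the proxy in manually after constructing it.
    IndexService self;

    public void reindexAll() {
        for (int b = 0; b < 3; b++) {
            // this.reindexBatch(b);  // plain call: bypasses the proxy, no new "transaction"
            self.reindexBatch(b);     // proxied call: the interceptor below runs
        }
    }

    public void reindexBatch(int batch) {
        System.out.println("reindexing batch " + batch);
    }
}

public class SelfInjectionDemo {
    public static void main(String[] args) {
        IndexServiceImpl impl = new IndexServiceImpl();
        // The dynamic proxy mimics the container wrapping each reindexBatch
        // call in its own transaction.
        IndexService proxy = (IndexService) Proxy.newProxyInstance(
                IndexService.class.getClassLoader(),
                new Class<?>[]{IndexService.class},
                (Object p, Method m, Object[] a) -> {
                    boolean isBatch = m.getName().equals("reindexBatch");
                    if (isBatch) {
                        System.out.println("[container] begin new transaction");
                    }
                    Object result = m.invoke(impl, a);
                    if (isBatch) {
                        System.out.println("[container] commit transaction");
                    }
                    return result;
                });
        impl.self = proxy;
        // Each of the three batches is bracketed by begin/commit; switching
        // the loop body to this.reindexBatch(b) would drop the brackets.
        proxy.reindexAll();
    }
}
```

This is the same reason the spec-compliant alternative is splitting the bean: only calls that cross a container-managed proxy boundary get interceptor (and transaction) semantics applied.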

Suggestions on how to test this:
Run the /api/admin/index/status command to confirm there are no indexing issues to start. (Run the /api/admin/index/clear-orphans command if needed to remove permission docs in Solr that aren't in the database.)

Go to the root dataverse and assign/revoke a role with FINE logging enabled on this class. Check the logs for the run time. Then do the same thing after the PR and confirm it's significantly faster.

Verify that there are no issues (or no new issues) reported by /api/admin/index/status.

[update: I have already tested the PR on our prod. database clone, see my comment below - L.A.]

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

@qqmyers qqmyers moved this to Ready for Triage in IQSS Dataverse Project Sep 16, 2025
@qqmyers qqmyers added the Size: 3 A percentage of a sprint. 2.1 hours. label Sep 16, 2025
@qqmyers qqmyers added this to the 6.9 milestone Sep 16, 2025
@landreev landreev self-requested a review September 16, 2025 20:50
@landreev landreev moved this from Ready for Triage to In Review 🔎 in IQSS Dataverse Project Sep 17, 2025
@scolapasta scolapasta moved this from In Review 🔎 to Reviewed but Frozen ❄️ in IQSS Dataverse Project Sep 18, 2025
Contributor

@landreev landreev left a comment

Scary that we've been reindexing potentially huge numbers of datasets in one transaction here. I will try to tweak a role assignment on root on the perf system, and measure before and after.

@github-project-automation github-project-automation bot moved this from Reviewed but Frozen ❄️ to Ready for QA ⏩ in IQSS Dataverse Project Sep 18, 2025
@scolapasta scolapasta moved this from Ready for QA ⏩ to Reviewed but Frozen ❄️ in IQSS Dataverse Project Sep 22, 2025
@cmbz cmbz added FY26 Sprint 6 FY26 Sprint 6 (2025-09-10 - 2025-09-24) FY26 Sprint 7 FY26 Sprint 7 (2025-09-24 - 2025-10-08) labels Sep 24, 2025
@coveralls

Coverage Status

coverage: 23.541% (+0.002%) from 23.539%
when pulling ef5065c on QualitativeDataRepository:QDR-perm-reindex-in-mult-transactions
into f79a02b on IQSS:develop.

@landreev
Contributor

Testing the "before" part on the perf system - i.e., watching the root datasets/sub-collections being reindexed under 6.8-RC.
The rate it's going at is quite abysmal.

@landreev
Contributor

I tested the PR on the clone of the IQSS prod. db last week. This IS a huge potential problem for us.
I tried assigning a role on the root collection (30+K datasets to reindex).
With the develop build (6.8 release candidate), I had to give up on completing the reindexing. It went through ~6.1 K datasets in 24 hrs (!), and it was only getting slower (transaction context getting larger?). It started with ~1 sec./dataset:

  Progress : dataset 10650261 permissions reindexed in 1073 ms]]
  Progress : dataset 11704968 permissions reindexed in 1056 ms]]
  Progress : dataset 10650466 permissions reindexed in 1044 ms]]

and it was 10 sec. on average by the end of it:

  Progress : dataset 7437697 permissions reindexed in 9936 ms]]
  Progress : dataset 7438415 permissions reindexed in 10170 ms]]
  Progress : dataset 7438455 permissions reindexed in 9939 ms]]

That time also fluctuated significantly throughout the run (on account of the number of files in each dataset, I'm assuming), with a max. time of 162 sec. (!) recorded:

  Progress : dataset 6646655 permissions reindexed in 161889 ms]]

There are 40+ K files in the dataset above. For anyone who wants to nerd out over the exact numbers, the reindexing times for every dataset have been uploaded.

In short: this is clearly unacceptable. In addition to taking forever, it is entirely possible that it would eventually paralyze the db. Changing assignments on root may not be a common real-life scenario*. But we do have a real (non-harvested) collection with about 6K child datasets, and it is very common for a new user or group to be given an admin, curator, etc. role on an active collection. (* We did have to make an assignment on root in recent memory, giving members of a Shibboleth group of trusted schools some extra rights.)

Testing with this PR, the reindex ran at a constant speed and was completed in 100 minutes. (!!)
Thank you, Jim.
reindex_after_roleassignment_6.8-RC.txt

@pdurbin pdurbin moved this from Reviewed but behind develop ⬅️ to Ready for QA ⏩ in IQSS Dataverse Project Sep 29, 2025
@cmbz cmbz added the FY26 Sprint 8 FY26 Sprint 8 (2025-10-08 - 2025-10-22) label Oct 8, 2025
@ofahimIQSS ofahimIQSS self-assigned this Oct 14, 2025
@ofahimIQSS ofahimIQSS moved this from Ready for QA ⏩ to QA ✅ in IQSS Dataverse Project Oct 14, 2025
@ofahimIQSS
Contributor

Test on dataverse-internal (v6.8, Payara 6.2025.3) — PR #11822

Scope: exercised “permission reindex after role change on root”.

Root DV: id=1. Roles (root-scoped) include admin (1), member (8), fullContributor (3), etc.

Current assignments on root already include:

:authenticated-users → fullContributor (roleId=3) [id=1834]

@kc032221 → member (8) [id=1835] and admin (1) [id=16073]

@dataverseAdmin → admin (1) [id=1]

Attempt to add another assignment via API returned:

400 Bad Request: "Can't find role named 'null'" (alias/roleId parsing quirk when posting to /api/dataverses/1/assignments).

Despite the above, indexing is responsive:

/api/admin/index/status → "Index Status Batch Job initiated, check log for job status."
No errors surfaced; instance remained responsive.

Result: No regressions observed. Root has existing assignments sufficient to trigger the PR’s path; index status shows batch job initiation as expected. Recommend a small follow-up to harden the assignments API against alias/roleId ambiguity (avoid "role named 'null'" when roleId is supplied).

Merging

@ofahimIQSS ofahimIQSS merged commit 7c476e2 into IQSS:develop Oct 16, 2025
14 checks passed
@github-project-automation github-project-automation bot moved this from QA ✅ to Merged 🚀 in IQSS Dataverse Project Oct 16, 2025
@ofahimIQSS ofahimIQSS removed their assignment Oct 16, 2025
@scolapasta scolapasta moved this from Merged 🚀 to Done 🧹 in IQSS Dataverse Project Oct 16, 2025


Projects

Status: Done 🧹

Development


5 participants