Skip to content

Conversation

@daniel-zamora
Copy link
Contributor

@daniel-zamora daniel-zamora commented Nov 20, 2025

Overview

What is the objective?

Fix KMS keyword validation to properly handle "Not provided" as a valid long-name value for Platforms, Instruments, and Projects. Previously, only nil long-names were handled correctly - any non-nil value (including "Not provided") would fail validation if the KMS entry lacked a long-name field.

What are the changes?

  • Added skip-long-name-validation? helper function that treats nil, empty strings, and "Not provided" (case-insensitive, with trimming) as equivalent
  • Refactored long-name validation logic into a new lookup-by-umm-c-keyword-with-long-name function that:
    • Removes long-name from comparison when it should be skipped
    • Removes long-name from KMS index keys during lookup to enable matching on remaining fields
    • Supports optional transform functions for keyword-specific transformations
  • Added dedicated lookup functions for Instruments and Projects (previously used default path)
  • Updated Platforms lookup to use the new shared helper function

What areas of the application does this impact?

common-app-lib: KMS keyword lookup service and validation logic)

Required Checklist

  • New and existing unit and int tests pass locally and remotely
  • clj-kondo has been run locally and all errors in changed files are corrected
  • I have commented my code, particularly in hard-to-understand areas
  • I have made changes to the documentation (if necessary)
  • My changes generate no new warnings

Additional Checklist

  • I have removed unnecessary/dead code and imports in files I have changed
  • I have cleaned up integration tests by doing one or more of the following:
    • migrated any are2 tests to are3 in files I have changed
    • de-duped, consolidated, removed dead int tests
    • transformed applicable int tests into unit tests
    • reduced number of system state resets by updating fixtures. Ex) (use-fixtures :each (ingest/reset-fixture {})) to be :once instead of :each

Summary by CodeRabbit

  • New Features

    • Added keyword lookup support for instruments and projects in collection metadata.
  • Bug Fixes

    • Improved handling of missing or "Not provided" long-name values in platform, instrument, and project keyword validation, enabling correct matching even when long-name data is incomplete.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Nov 20, 2025

Walkthrough

This change extends KMS lookup functionality to support long-name validation for platforms, instruments, and projects. It treats nil, blank, or "Not provided" long-names as optional by introducing generic lookup logic that conditionally strips long-name fields from comparison maps to enable correct matching.

Changes

Cohort / File(s) Summary
KMS Lookup Core Logic
common-app-lib/src/cmr/common_app/services/kms_lookup.clj
Adds private helper skip-long-name-validation? to identify ignorable long-names. Introduces generic lookup-by-umm-c-keyword-with-long-name function supporting optional transform callbacks and conditional long-name stripping. Extends platform lookup to use the new generic function with :type → :category transform. Creates three new public wrappers: lookup-by-umm-c-keyword-platforms, lookup-by-umm-c-keyword-instruments, lookup-by-umm-c-keyword-projects. Updates main dispatch to route :instruments and :projects to respective wrappers.
KMS Lookup Tests
common-app-lib/test/cmr/common_app/test/services/kms_lookup.clj
Expands test data with additional platform, instrument, and project entries. Adds comprehensive test groups validating long-name handling with blank, nil, empty, and "Not provided" values. Includes tests verifying matching behavior with and without long-name fields.
Integration Tests
system-int-test/test/cmr/system_int_test/ingest/collection/collection_keyword_validation_test.clj
Adds new test block "Blank and 'Not provided' long names should not cause validation errors" validating that collections with blank or "Not provided" long-names for platforms, instruments, and projects pass validation while invalid combinations fail with appropriate error messages.

Sequence Diagram

sequenceDiagram
    participant Lookup as lookup-by-umm-c-keyword
    participant Router as Dispatch Router
    participant Generic as lookup-by-umm-c-keyword-with-long-name
    participant Validator as skip-long-name-validation?
    participant Cache as KMS Cache
    
    Lookup->>Router: Receive :instruments/:projects
    Router->>Generic: Delegate with transform-fn
    Generic->>Validator: Check if long-name should skip
    alt Skip Long-name
        Generic->>Generic: Remove :long-name from comparison map
    else Keep Long-name
        Generic->>Generic: Use comparison map as-is
    end
    Generic->>Cache: Fetch with (normalized) comparison map
    Cache-->>Generic: Return keyed cache entry or nil
    Generic-->>Lookup: Return result or handle Carmine error
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Areas requiring extra attention:

  • Logic flow in lookup-by-umm-c-keyword-with-long-name for long-name stripping and conditional application
  • Verify that the transform function pattern (e.g., :type → :category for platforms) is correctly applied to all new wrappers
  • Confirm error handling for Carmine connection issues is consistent across all new code paths
  • Test edge cases for nil/blank/empty/"Not provided" long-name values across all three new lookups

Poem

🐰 Hops of joy through long-name fields,
Where "Not provided" no longer yields error shields,
Platforms, instruments, projects now align,
With flexible matching—nil, blank, or fine!
KMS lookups hop forward in time! 🎯

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed The PR description comprehensively covers the objective, detailed changes, and impacted areas. All required sections are present with substantial content, though some checklist items remain unchecked despite claims of completion.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Title check ✅ Passed The title accurately and concisely describes the main change: refactoring long-name validation to handle 'Not provided' values for KMS keywords without causing validation errors.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch CMR-10801

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
common-app-lib/test/cmr/common_app/test/services/kms_lookup.clj (1)

243-310: Long-name tests thoroughly exercise nil/blank/'Not provided' semantics

The new blank-and-not-provided-long-name-validation-test:

  • Verifies that nil/empty/"Not provided" all match KMS entries missing long-names for platforms, instruments, and projects.
  • Confirms that non-blank, non-"Not provided" long-names still fail, even when short-name is valid.
  • Reasserts CMRS-4400 behavior for platforms with long-names in KMS (PLAT1) for both nil and exact long-name cases.

This gives solid unit-level coverage of the new lookup behavior. If we ever need to widen guarantees, optional follow-up would be to add analogous “KMS has long-name, UMM says 'Not provided'” tests for instruments/projects, but that’s not required for this PR.

common-app-lib/src/cmr/common_app/services/kms_lookup.clj (1)

279-333: Generic long-name-aware lookup and wrappers are sound; consider minor doc/perf clarifications

The new lookup-by-umm-c-keyword-with-long-name plus the platform/instrument/project wrappers achieves:

  • Consistent normalization (kebab-case + optional transform) across the three keyword schemes.
  • Conditional stripping of :long-name on both the UMM keyword and KMS index when skip-long-name-validation? is true, so nil/blank/"Not provided" no longer cause false negatives.
  • Preservation of strict matching whenever :long-name is any other non-blank value, as verified by the new tests.

A couple of small, non-blocking follow-ups you might consider:

  • The remove-long-name-from-kms-index docstring still talks only about “nil” long-names; updating it to mention blank/"Not provided" and all three schemes would better reflect current use.
  • remove-long-name-from-kms-index rebuilds the entire scheme index map on each lookup when skipping long-names. This was already true for the older platform path, but it now applies to instruments/projects as well. If lookups on these schemes become hot paths, precomputing or caching a “no-long-name” variant per scheme could avoid repeated work.
  • If KMS ever contained multiple entries that differ only by :long-name for the same short-name (and other fields), stripping :long-name could collapse them; worth confirming that KMS schema/constraints prevent such duplicates.

These are refinements rather than correctness issues; the current behavior matches the tests and PR intent.

Also applies to: 335-357, 367-368

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 21d5106 and 18888db.

📒 Files selected for processing (3)
  • common-app-lib/src/cmr/common_app/services/kms_lookup.clj (3 hunks)
  • common-app-lib/test/cmr/common_app/test/services/kms_lookup.clj (2 hunks)
  • system-int-test/test/cmr/system_int_test/ingest/collection/collection_keyword_validation_test.clj (1 hunks)
🔇 Additional comments (3)
system-int-test/test/cmr/system_int_test/ingest/collection/collection_keyword_validation_test.clj (1)

477-529: New 'Not provided' integration tests are well-targeted and aligned with existing patterns

The added tests cleanly cover:

  • Valid cases for Platforms/Instruments/Projects when long-name is "Not provided".
  • Negative cases ensuring arbitrary non-blank long names still fail even when short names are valid.

Assertions reuse existing helpers and message formats, so they’ll stay in sync with prior behavior. No changes needed here.

common-app-lib/test/cmr/common_app/test/services/kms_lookup.clj (1)

14-19: Sample KMS additions for PLAT2/INST2/PROJ2 look consistent

The extra platform/instrument/project entries cleanly model the “no long-name in KMS” scenarios without disturbing existing tests. Keys follow the same shape and casing as the rest of sample-map.

common-app-lib/src/cmr/common_app/services/kms_lookup.clj (1)

270-277: skip-long-name-validation? correctly captures the intended 'Not provided' semantics

The helper cleanly treats nil, blank, and case/whitespace-insensitive "Not provided" as “no long-name,” which matches the PR objective and keeps the behavior centralized. No changes needed here.

@daniel-zamora daniel-zamora changed the title CMR-10801: refactors longname validation to handle not provided CMR-10801: As a metadata steward, I should not get a CMR validation error for a keyword long name when it’s not provided and not required Nov 20, 2025
@codecov-commenter
Copy link

codecov-commenter commented Nov 20, 2025

Codecov Report

❌ Patch coverage is 96.87500% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 58.13%. Comparing base (fe7ccb9) to head (18888db).
⚠️ Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
...app-lib/src/cmr/common_app/services/kms_lookup.clj 96.87% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2350      +/-   ##
==========================================
- Coverage   58.15%   58.13%   -0.02%     
==========================================
  Files        1063     1063              
  Lines       72169    72186      +17     
  Branches     2086     2081       -5     
==========================================
- Hits        41967    41963       -4     
- Misses      28264    28283      +19     
- Partials     1938     1940       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants