Skip to content

Conversation

tsalo
Copy link
Member

@tsalo tsalo commented Sep 20, 2024

Closes #1165. Adds + to the valid characters for the "label" format, which is currently only used for a subset of entities.

@tsalo tsalo added the schema Issues related to the YAML schema representation of the specification. Patch version release. label Sep 20, 2024
@tsalo tsalo requested a review from erdalkaraca as a code owner September 20, 2024 13:13
@tsalo
Copy link
Member Author

tsalo commented Sep 20, 2024

In this PR, plus signs are allowed for any label-format value, but I believe some folks (@effigies?) were thinking of limiting them to special multi-label cases. If we want to go down that route, then we'll need to create a new "format" (e.g., "multilabel") that applies to specific entities. I'm not 100% which entities we would want the multilabel format for... perhaps acq, desc, and space?

@effigies
Copy link
Collaborator

Reading #1165 (comment) and the following discussion, I think the overall consensus was toward permissiveness; i.e., allow in any label instead of splitting out a new multi-label concept.

I feel like the discussion about "semantics" was not entirely clear. I personally feel we should refrain from implying a relation between ent-X+Y and ent-X or ent-Y that tooling would need to respect. Perhaps we want to say something like:

Free-form labels with alphanumeric and plus (+) characters.
Plus signs MAY be used to concatenate multiple applicable labels,
but no relationship is established by a partial match.
In particular, the inheritance principle does not connect files
containing ent-X+Y and ent-X or ent-Y.

@tsalo
Copy link
Member Author

tsalo commented Sep 20, 2024

I like that!

Copy link
Collaborator

@oesteban oesteban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a couple of nit picks

tsalo and others added 4 commits September 23, 2024 11:05
…ts to gain +

=== Do not change lines below ===
{
 "chain": [],
 "cmd": "git-sedi '\\[0-9a-zA-Z\\]' '[0-9a-zA-Z+]'",
 "exit": 0,
 "extra_inputs": [],
 "inputs": [],
 "outputs": [],
 "pwd": "."
}
^^^ Do not change lines above ^^^
r"(?:sub-(?P<subject>[0-9a-zA-Z]+)/)?"
r"(?:ses-(?P<session>[0-9a-zA-Z]+)/)?"
r"(?:sub-(?P<subject>[0-9a-zA-Z+]+)/)?"
r"(?:ses-(?P<session>[0-9a-zA-Z+]+)/)?"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid that is not all ... I will push a few commits on that end in a few minutes (I hope you don't mind).
Some other might need adjustment and I even start feeling that we might need to come up with some term (like "literal" but there might be better) to encompass "alphanumeric" and +.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yarikoptic Note this is a regression test that shows the specific output of a specific synthetic rule.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my comment is not really about this test -- I meant that changes in this PR (just this test) aren't sufficient. pushed now

@yarikoptic
Copy link
Collaborator

Just a sidenote FTR: There is also a question of either to allow it in suffixes (e.g. would converge DANDI layout closer (dandi/dandi-cli#1498) but since we do not have any "incident" of that in BIDS ATM, it is not appropriate to change ATM even though in general I see that suffixes also should be of the same nature as "labels", in particular that IIRC their values could "migrate" into _mod- entity.

@yarikoptic
Copy link
Collaborator

I pushed some changes which I think are due as well, although might need tune ups -- individual commits might have more information/reasoning in the commit messages. But there is also IMHO outstanding review and possibly tune up needed to src/metaschema.json

it does have plenty of "alphanumeric" and _ allowances and I am not yet 100% sure none of them somehow overlaps with "alphanumeric" possible in filename etc... most likely none, but still worth looking at IMHO
❯ git grep 'a-zA-Z0-9[^+]' -- src/metaschema.json
src/metaschema.json:            "^[a-zA-Z0-9_]+$": {
src/metaschema.json:            "^[a-zA-Z0-9_]+$": {
src/metaschema.json:            "^[a-zA-Z0-9_]+$": {
src/metaschema.json:            "^[a-zA-Z0-9_]+$": {
src/metaschema.json:            "^[a-zA-Z0-9_]+$": {
src/metaschema.json:            "^[a-zA-Z0-9][a-zA-Z0-9_-]*$": {
src/metaschema.json:            "^_[a-zA-Z0-9_-]+$": {
src/metaschema.json:            "^[a-zA-Z0-9_]+$": {
src/metaschema.json:            "^[a-zA-Z0-9_]+$": {
src/metaschema.json:            "^[a-zA-Z0-9_]+$": {
src/metaschema.json:            "^[a-zA-Z0-9_]+$": {
src/metaschema.json:            "^[a-zA-Z0-9_]+$": {
src/metaschema.json:                "^[a-zA-Z0-9_]+$": {
src/metaschema.json:                    "^[a-zA-Z0-9_]+$": {
src/metaschema.json:                    "^[a-zA-Z0-9_]+$": {
src/metaschema.json:                    "^[a-zA-Z0-9_]+$": { "$ref": "#/definitions/suffixRule" }
src/metaschema.json:                    "^[a-zA-Z0-9_]+$": { "$ref": "#/definitions/suffixRule" }
src/metaschema.json:            "^[a-zA-Z0-9_]+$": {
src/metaschema.json:        "^[a-zA-Z0-9_]+$": {
src/metaschema.json:                "^[a-zA-Z0-9_]+$": {
src/metaschema.json:        "^[a-zA-Z0-9_]+$": {
src/metaschema.json:                "^[a-zA-Z0-9_]+$": {
src/metaschema.json:          "items": { "type": "string", "pattern": "^[a-zA-Z0-9]+$" }

Copy link
Collaborator

@Remi-Gau Remi-Gau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For something like this I would much prefer that one bids example is modified to showcase this so that downstream validator and parsers have got some food to sink their digital teeth into.

@tsalo
Copy link
Member Author

tsalo commented Sep 28, 2024

@Remi-Gau that's a great point. Do you think it makes more sense to create a new example dataset or to modify an existing one?

I took a quick look at the BIDS examples and https://github.com/bids-standard/bids-examples/tree/master/ds004332 looks promising. There are complex acq entities that seem to convey multiple discrete pieces of information. For example, acq-mpragePMCoff, acq-mpragePMCon, acq-t2starPMCoff, and acq-t2starPMCon could be changed to acq-mprage+pmc+off etc.

EDIT: If you think creating a new dataset makes more sense, then I think it would be good to use something like CUBIDS since that will use the acq entity to flag different kinds of variants (e.g., different numbers of volumes, different flip angles) in a dataset.

@Remi-Gau
Copy link
Collaborator

I am afraid that if we start adding a new dataset for every new aspect of the spec we need to validate we will end up with too many examples.

How about adding this to the synthetic example?

@tsalo
Copy link
Member Author

tsalo commented Sep 28, 2024

That works! I'll look into modifying https://github.com/bids-standard/bids-examples/tree/master/synthetic.

Copy link

codecov bot commented Jan 23, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.45%. Comparing base (f0e14a2) to head (abce472).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1926   +/-   ##
=======================================
  Coverage   82.45%   82.45%           
=======================================
  Files          17       17           
  Lines        1499     1499           
=======================================
  Hits         1236     1236           
  Misses        263      263           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, looks good and complete to me. Should we proceed with the merge?

@tsalo
Copy link
Member Author

tsalo commented Feb 5, 2025

I never got the bids example updated unfortunately. Maybe I could make time late next week to do it, but if someone else is willing to do it that would be great.

@yarikoptic
Copy link
Collaborator

@tsalo
Copy link
Member Author

tsalo commented Feb 10, 2025

@yarikoptic thanks for opening the examples PR! I think it's ready to merge now.

@tsalo
Copy link
Member Author

tsalo commented Feb 24, 2025

@Remi-Gau could you check this over one last time? I think you're the only reviewer who hasn't approved.

@tsalo
Copy link
Member Author

tsalo commented Feb 24, 2025

Thanks @Remi-Gau! Alright, does anyone have any issue with me merging? Otherwise, I'll merge tomorrow morning.

@tsalo tsalo merged commit 93be74e into bids-standard:master Feb 25, 2025
24 of 25 checks passed
@tsalo tsalo deleted the plus-sign-labels branch February 25, 2025 15:34
yarikoptic added a commit that referenced this pull request Mar 19, 2025
* origin/master: (76 commits)
  [pre-commit.ci] pre-commit autoupdate
  add Julia Pfarr as current BIDS maintainer
  [pre-commit.ci] pre-commit autoupdate
  [pre-commit.ci] pre-commit autoupdate
  fix
  fix
  [MAINT] deal with links redirects
  Temporarily disable dicomlookup urls check
  Fix URL to renamed now "Contributors" wiki
  doc: correct typos
  doc: remove mentioning of `schemacode` install as the third command
  enh: use macro for the physio template
  enh: add metaentities to rules README
  Apply suggestions from code review
  [ENH] Allow plus signs in labels (#1926)
  [pre-commit.ci] pre-commit autoupdate
  [MAINT] update link to new website
  sty: pyupgrade --py39-plus
  chore: Bump schema package to 1.0.4-dev
  chore: Bump schema package to 1.0.3
  ...

Conflicts:
	src/schema/objects/columns.yaml -- added the 3 pipette columns
marcelzwiers added a commit to Donders-Institute/bidscoin that referenced this pull request Jun 17, 2025
effigies pushed a commit to yarikoptic/BIDS-examples that referenced this pull request Aug 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
schema Issues related to the YAML schema representation of the specification. Patch version release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ENH] Allow for - (dash) (and/or +) in <label>
5 participants