-
Notifications
You must be signed in to change notification settings - Fork 190
ENH: Add templates.tsv and cohorts.tsv for template/cohort entity metadata #2287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
I have to say it annoys me to see this after a lot of work to get there .. but damn it works, and it is consistent with other parts of BIDS .. nice +1 👍 I cannot see any reason not to adopt that change @oesteban what do you think? |
I understand Yarik's points. Also, his proposal is "idempotent" (in the sense that it doesn't create anything that cannot be completed/modified later down the line, as opposed to our _description.json file that once introduced is hard to get rid of). That said, Yarik's proposal opens a small hole for fields that require long descriptions, such as a custom license (that cannot be expressed with an identifier). Current BEP038 doesn't solve that problem (which is above its scope btw), but having JSON fields to stick that metadata is more reasonable than having them stuffed into the TSV column. |
isn't it a 80/20 rule concern? could likely be "custom: see LICENSE file" for those likely |
|
note that this one alone doesn't solve |
e15c610 to
5e1ac9b
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #2287 +/- ##
=======================================
Coverage 82.81% 82.81%
=======================================
Files 22 22
Lines 1693 1693
=======================================
Hits 1402 1402
Misses 291 291 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
It does apply, 80% of the licenses are custom. Universities used to weigh in and request nonstandard licenses be set. |
do we actually have some stats on that?
anyways, licence texts better live in files, not json or TSV, we have |
That sounds very reasonable to me 👍 |
…adata Introduce templates.tsv and cohorts.tsv files consistent with existing BIDS entity-specific TSV files (participants.tsv, sessions.tsv). - templates.tsv: placed at derivative dataset root, describes tpl-<label> entities - cohorts.tsv: placed within tpl-<label>/ directory, describes cohort-<label> entities This enables consistent documentation of template and cohort entities in derivative datasets. Part of BEP038 atlas metadata improvements. Schema changes: - Add template_id and cohort_id columns to columns.yaml - Add templates and cohorts suffixes to suffixes.yaml - Add Templates and Cohorts rules to common_derivatives.yaml - Add templates and cohorts file rules to tables.yaml - Update atlas.md with templates.tsv and cohorts.tsv documentation Related to: - #2285 - #2283 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
5e1ac9b to
9e967fb
Compare
src/schema/rules/tabular_data/derivatives/common_derivatives.yaml
Outdated
Show resolved
Hide resolved
effigies
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this seems reasonable. We should update at least one example to use these features.
on that aspect -- let's agree to use a standard form (I added above but overall point for here is that |
It is my subjective perception that licenses of atlases/templates follow the 80/20 rule (only 20 use standard licenses, and often with no-derivs restrictions, which is nuts for that kind of resource). |
oesteban
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thanks!
| `templates.tsv` example: | ||
|
|
||
| ```tsv | ||
| template_id description |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| template_id description | |
| template_id long_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would strongly suggest RRID here as a second column before "long_name"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for RRID, as always columns are optional but we know that when we put things in example they are more adopted than when we do not - might also consider DOI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now description is REQUIRED by the table definition, so either that needs to stay in addition to long_name or it needs to be demoted from REQUIRED. long_name would need to be defined in objects.columns and added to the table definition.
| `tpl-MNIPediatricAsym_cohorts.tsv` example: | ||
|
|
||
| ```tsv | ||
| cohort_id description |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great 👍
effigies
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ongoing discussion. Cleared my approval until these are wrapped up.
| - template_id | ||
| - description__entities |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| - template_id | |
| - description__entities | |
| - template_id |
description being free-text, I would expect many people to choose to put it last for better alignment. I would drop the initial column requirement.
| description__entities: | ||
| level: required | ||
| description_addendum: | | ||
| The corresponding label column is `template_id`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make rrid show up in the table and also validate that it matches the RRID schema:
| description__entities: | |
| level: required | |
| description_addendum: | | |
| The corresponding label column is `template_id`. | |
| description__entities: | |
| level: required | |
| description_addendum: | | |
| The corresponding label column is `template_id`. | |
| rrid: optional |
This also requires adding rrid to objects.columns.
| - extension == ".tsv" | ||
| initial_columns: | ||
| - cohort_id | ||
| - description__entities |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I think this is overly prescriptive with column ordering.
| - description__entities |
| `templates.tsv` example: | ||
|
|
||
| ```tsv | ||
| template_id description |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now description is REQUIRED by the table definition, so either that needs to stay in addition to long_name or it needs to be demoted from REQUIRED. long_name would need to be defined in objects.columns and added to the table definition.
| description: | | ||
| A TSV file describing labels found for the `cohort` entity within a template. | ||
| This file MUST be located within a `tpl-<label>/` directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would need an explicit check to validate. Any TSV or JSON file can show up at the root without, even if it doesn't make much sense to do it.
| value: templates | ||
| display_name: Template Entity Definitions | ||
| description: | | ||
| A TSV file describing labels found for the `tpl` entity in a Derivatives dataset. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a weird way of talking about it. I would say participants.tsv describes the participants, not the participant labels. Similarly with templates. Cohorts feels more reasonable to say labels.
Introduce templates.tsv and cohorts.tsv files consistent with existing BIDS entity-specific TSV files (participants.tsv, sessions.tsv).
This enables consistent documentation of template and cohort entities in derivative datasets. Part of BEP038 atlas metadata improvements.
Schema changes:
Related to:
[{leading entities}_]{entity_plural}.{tsv,json}file(s) #2283Part of the larger
atlas-<label>_description.jsonwithatlases.tsvpattern and also addtemplates.tsvandcohorts.tsv#2286for independent consideration. Attn @bids-standard/bep038