Conversation

@sneumann
Member

@sneumann sneumann commented Dec 9, 2025

No description provided.

@sneumann sneumann added this to the mzTab-M 2.1 milestone Dec 9, 2025
@philouail
Collaborator

philouail commented Dec 9, 2025

related to #40 and #78

@philouail
Collaborator

study-variable is a required field, so I feel the group should be required too, even if only one group of study variables is reported. That would make things much clearer, and I do not think it is an issue for writers, as there will always be a "column name". What do you think?

@sneumann
Member Author

sneumann commented Dec 9, 2025

Commit 39f9c43 also addresses #78

@jorainer
Collaborator

I have a more general comment: while writing the documentation for the study variables, I realized that I find the naming of this variable and its attributes confusing.

Example:

study_variable[1]    female
study_variable[1]-assay_refs    assay[1]|assay[3]|assay[5]
study_variable[1]-average_function    [MS, MS:1002962, mean, ]
study_variable[1]-variation_function    [MS, MS:1002963, variation coefficient, ]
study_variable[1]-description    Sex of the participant

When I read this, I would assume study_variable[1] is the value of the study variable (in the assays listed in assay_refs), but the name of the variable is missing. Intuitively, I would rather have defined this as:

study_variable[1]-value    female
study_variable[1]-name    sex
study_variable[1]-assay_refs    assay[1]|assay[3]|assay[5]
study_variable[1]-average_function    [MS, MS:1002962, mean, ]
study_variable[1]-variation_function    [MS, MS:1002963, variation coefficient, ]
study_variable[1]-description    Sex of the participant

That is, have a dedicated value attribute and a name attribute, which would feel a bit more natural to me. I'm also OK with having group instead, but would find that maybe a bit less clear. The solution in this PR (as far as I understand) defines this:

study_variable[1]    female
study_variable[1]-group    sex
study_variable[1]-assay_refs    assay[1]|assay[3]|assay[5]
study_variable[1]-average_function    [MS, MS:1002962, mean, ]
study_variable[1]-variation_function    [MS, MS:1002963, variation coefficient, ]
study_variable[1]-description    Sex of the participant

@nilshoffmann
Member

group should have the Parameter type. This will allow both CV parameters and user parameters:
[CV, CV:1212312, name, value] -> CV Parameter
[,,"sex",] -> User Parameter
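
To illustrate the distinction between the two forms, here is a minimal Python sketch that splits a bracketed parameter into its four fields and treats it as a CV parameter only when both the CV label and the accession are present. The `Parameter` class and `parse_parameter` function are hypothetical names chosen for this example, not part of any mzTab-M library, and the naive comma split assumes that names and values contain no commas.

```python
from typing import NamedTuple, Optional


class Parameter(NamedTuple):
    """Hypothetical container for '[cv_label, accession, name, value]'."""
    cv_label: Optional[str]
    accession: Optional[str]
    name: str
    value: Optional[str]

    @property
    def is_cv_param(self) -> bool:
        # Assumption: a CV parameter carries both a CV label and an accession;
        # anything else is treated as a user parameter.
        return bool(self.cv_label and self.accession)


def parse_parameter(text: str) -> Parameter:
    """Parse a bracketed parameter string into its four fields."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError(f"not a parameter: {text!r}")
    # Naive split: assumes no commas inside quoted names or values.
    fields = [f.strip().strip('"') for f in text[1:-1].split(",")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {text!r}")
    # Empty fields become None.
    label, accession, name, value = (f or None for f in fields)
    if name is None:
        raise ValueError(f"parameter name is missing: {text!r}")
    return Parameter(label, accession, name, value)


cv = parse_parameter("[CV, CV:1212312, name, value]")
user = parse_parameter('[,,"sex",]')
print(cv.is_cv_param, user.is_cv_param, user.name)
```

A reader could use `is_cv_param` to decide whether the group name can be resolved against a controlled vocabulary or has to be taken as free text.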

@nilshoffmann
Member

Can we check the proposed naming against the one used by MetaboLights, so that we avoid adding more confusion?

@nilshoffmann
Member

Suggested redesign, introducing a new top level group for study_variables:

study_variable_group[1]    [,,sex,]
study_variable_group[1]-description    ASjklhajksd
study_variable_group[1]-type   nominal
(study_variable_group[1]-unit   [,,,])? // e.g. day, time, concentration, ...
...
study_variable[1]    [,,female,]
study_variable[1]-group_ref    study_variable_group[1]
study_variable[1]-assay_refs    assay[1]|assay[3]|assay[5]
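
As a rough sketch of how a reader might consume this layout, the Python below resolves each `group_ref` and builds a per-assay table with one column per group. Several things here are assumptions for illustration only: the `study_variable[2]` entry (male) is invented to make the example non-trivial, key and value are assumed to be separated by whitespace, the helper names are hypothetical, and the bare `study_variable[n]` line is stored under a made-up `"param"` key.

```python
import re
from collections import defaultdict

# Example input; study_variable[2] is a hypothetical addition.
EXAMPLE = """\
study_variable_group[1]\t[,,sex,]
study_variable_group[1]-type\tnominal
study_variable[1]\t[,,female,]
study_variable[1]-group_ref\tstudy_variable_group[1]
study_variable[1]-assay_refs\tassay[1]|assay[3]|assay[5]
study_variable[2]\t[,,male,]
study_variable[2]-group_ref\tstudy_variable_group[1]
study_variable[2]-assay_refs\tassay[2]|assay[4]
"""

# Matches 'study_variable[1]' or 'study_variable_group[1]', optionally
# followed by '-attribute'.
KEY = re.compile(r"^(study_variable(?:_group)?\[\d+\])(?:-(\w+))?$")


def param_name(text):
    """Return the name (third field) of '[label, accession, name, value]'."""
    return text.strip()[1:-1].split(",")[2].strip().strip('"')


def parse_entries(lines):
    """Collect attributes per element; the bare line is stored as 'param'."""
    entries = defaultdict(dict)
    for line in lines:
        parts = line.split(None, 1)  # key, then the rest of the line
        if len(parts) != 2:
            continue
        match = KEY.match(parts[0])
        if match:
            element, attribute = match.groups()
            entries[element][attribute or "param"] = parts[1].strip()
    return dict(entries)


def assay_table(entries):
    """Map each assay to {group name: study variable value}."""
    table = defaultdict(dict)
    for element, attrs in entries.items():
        if not element.startswith("study_variable[") or "group_ref" not in attrs:
            continue
        group = param_name(entries[attrs["group_ref"]]["param"])
        value = param_name(attrs["param"])
        for assay in attrs.get("assay_refs", "").split("|"):
            if assay:
                table[assay][group] = value
    return {assay: dict(columns) for assay, columns in table.items()}


table = assay_table(parse_entries(EXAMPLE.splitlines()))
for assay in sorted(table):
    print(assay, table[assay])
```

With an explicit group, the "column name" for each assay comes from the referenced `study_variable_group` rather than being inferred from the study variable itself.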

@jorainer
Collaborator

I very much like the new format with an explicit study_variable_group. I think it would be great to get that implemented ASAP.

For study_variable_group[1]-type it would be good to have a consistent set of supported/expected values. I don't know if there is a CV for that, but these are the values that come to mind, together with the R data types I would encode/decode them to:

"nominal" -> use as.is in R (e.g. number to numeric, text to character)
"ordinal" -> convert to R factor (?)
"categorical" -> convert to R factor. The question is just how to define the base level; easiest would be to order the levels alphabetically.
"logical" -> logical.
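
The mapping above is phrased in terms of R; purely as an illustration, a rough Python analogue could look like the sketch below. The `decode` function is a hypothetical helper, the four type names are the ones proposed in this comment rather than an agreed vocabulary, and the factor-like result is represented as a plain `(codes, levels)` tuple. Levels default to alphabetical order as suggested for "categorical"; for "ordinal" a meaningful order would presumably have to be supplied explicitly, which this thread leaves open.

```python
def decode(values, var_type, levels=None):
    """Convert raw string values according to a proposed group type."""
    if var_type == "logical":
        lookup = {"true": True, "false": False}
        return [lookup[v.strip().lower()] for v in values]

    if var_type in ("ordinal", "categorical"):
        # Factor-like encoding: integer codes plus the list of levels.
        # Without explicit levels, fall back to alphabetical order.
        if levels is None:
            levels = sorted(set(values))
        codes = [levels.index(v) for v in values]
        return codes, list(levels)

    if var_type == "nominal":
        # "As is": numeric if every value parses as a number, else text.
        try:
            return [float(v) for v in values]
        except ValueError:
            return list(values)

    raise ValueError(f"unsupported type: {var_type!r}")


print(decode(["female", "male", "female"], "categorical"))
print(decode(["low", "high", "low"], "ordinal", levels=["low", "high"]))
```

Passing `levels` explicitly is one way to control the base level instead of relying on alphabetical order.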

5 participants