
@bittremieux
Collaborator

Documentation related to the formalized metrics classification proposed during the ELIXIR BioHackathon and discussed in the PSI-MS CV here.

  • Add new documentation page with details on the metrics classification.
  • Update CV usage and term request guide to reflect this.

@bittremieux bittremieux mentioned this pull request Nov 5, 2025
@mwalzer
Collaborator

mwalzer commented Nov 28, 2025

I think it would also be good to document from which CV version onward this classification is consistently implemented.

@mwalzer
Collaborator

mwalzer commented Nov 28, 2025

Coming back to the telco discussion about adoption of these classifications by requestors/requestees of new metrics: if we make this document the central point of entry, or at least very hard not to stumble over, and give it a clear structure with a TOC etc., I think many requestors/requestees will care about (the success of) their metric and put in the extra effort.

Collaborator

@cbielow cbielow left a comment


This turned out trickier than anticipated.

More input is highly appreciated!

Comment on lines +181 to +182
* **Ion-mobility-coupled metric:** metrics derived from acquisition methods that include gas-phase ion mobility separation.
*Example:* TIMS mobility resolution (Δ1/K₀) per run.
Collaborator

Not sure if an explicit ion-mobility value is needed/helpful, because it is orthogonal to DIA vs. DDA,
and to be complete we'd also need an LC-coupled metric (since LC adds another dimension, just as IM does) vs. direct injection.
We are mixing concepts here IMHO.
Acquisition mode (referring to the mass spectrometer itself) may just be

  1. any
  2. DDA
  3. DIA
  4. Targeted

Other things like LC, IM, and imaging are a different concept, aren't they?
Even the specialized MSn case is tricky, since TMT can be acquired at the MS3 level yet still be DDA.

Now, one could just add multiple values for this dimension, but maybe we should push concepts like LC, IM, etc. into the workflow stage (where they would be duplicated anyway)?

Collaborator Author

I agree that this mixes several orthogonal concepts: precursor selection logic (DDA/DIA/targeted), physical separation (LC, IM), and acquisition strategies (imaging, MSn)

Maybe a more coherent approach is to redefine this dimension strictly as "how ions are selected for fragmentation or measurement." In other words, capturing only the precursor selection logic of the mass spectrometer, independent of chromatography, IM, imaging, etc.

In that case, we can simplify the subclasses to:

  • Acquisition-mode independent
  • Data-dependent acquisition (DDA)
  • Data-independent acquisition (DIA)
  • Targeted acquisition
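To make the simplification concrete, here is a minimal sketch of how a metric could carry exactly one of these four subclasses. All names and values are illustrative assumptions, not actual PSI-MS CV terms:

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical encoding of the four proposed acquisition-mode subclasses;
# identifiers and labels are illustrative, not PSI-MS CV accessions.
class AcquisitionMode(Enum):
    ANY = "acquisition-mode independent"
    DDA = "data-dependent acquisition (DDA)"
    DIA = "data-independent acquisition (DIA)"
    TARGETED = "targeted acquisition"

@dataclass
class MetricClassification:
    name: str
    acquisition_mode: AcquisitionMode

# An identification-rate metric only makes sense for DDA, while a TIC-based
# metric is acquisition-mode independent:
id_rate = MetricClassification("MS2 identification rate", AcquisitionMode.DDA)
tic_stability = MetricClassification("TIC stability", AcquisitionMode.ANY)
```

Because the dimension captures only precursor selection logic, LC, IM, and imaging never appear here; they would be annotated elsewhere (e.g., the workflow stage).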

Comment on lines +41 to +54
- **Acquisition coverage metric:** how comprehensively data were collected (e.g., scan counts, sampling density).
- **Mass accuracy metric:** deviation between observed and theoretical _m_/_z_.
- **Intensity stability metric:** variation of signal intensity over time.
- **Chromatographic performance metric:** separation performance (e.g., peak width, symmetry, RT reproducibility).
- **Ionization quality metric:** properties of the precursor ion population (e.g., charge-state distribution, adduct prevalence).
- **Ion mobility metric:** IMS resolution, drift-time/CCS accuracy and reproducibility.
- **Spectral quality metric:** quality of individual spectra (e.g., peak density, S/N, completeness).
- **Fragmentation efficiency metric:** effectiveness of precursor ion fragmentation to produce interpretable spectra.
- **Isolation purity metric:** precursor isolation selectivity or co-isolation of interfering species.
- **Identification confidence metric:** reliability of identifications (e.g., FDR, ID rate).
- **Quantification precision metric:** reproducibility or variability of quantitative results.
- **Contamination metric:** unwanted signal from contaminants, carryover, or background.
- **Instrument operational performance metric:** general indicators of instrument health (e.g., vacuum, detector voltage, temperature).
- **Missingness/completeness metric:** data absence or completeness across features, runs, or studies.
Collaborator

This has quite considerable overlap with the workflow stage, and it's potentially a dimension that is hard to cover completely, since to me it's quite fuzzy. This may lead to people cramming their metric into something that "fits well enough" instead of proposing a new category value.

What could be a set of values that

  1. is easy to categorize,
  2. does not need extension in the future, and
  3. adds value compared to the other dimensions?

Or, in other words: what could this dimension represent that makes it distinct from the others?
(Dimensions should ideally be orthogonal.)

Collaborator Author

@bittremieux bittremieux Dec 11, 2025

It's a good point that there is significant overlap with the workflow stage. At the same time, I think that a metric subtype is still very useful and, for many users, the most intuitive entry point. It's probably not possible to be perfectly orthogonal to workflow stage.

However, we might want to reinterpret the analytical dimension so that it captures the fundamental quality phenomenon the metric measures, which could be high-level quality constructs from statistics such as: accuracy, precision, completeness, stability, etc. This would describe what kind of quality issue is being measured, not where it appears.

Comment on lines +82 to +83
* **Chromatography stage:** metrics about LC separation performance.
*Example:* retention-time reproducibility, peak width.
Collaborator

One could add a direct injection stage here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it needed? Does the absence of a certain workflow stage mandate explicit specification? I.e., is there a metric that exclusively depends on the experiment being direct injection, or would metrics for such an experiment rather fall into some of the subsequent stages?

The chromatography stage needs to be made more general, though: it should also cover GC, not only LC.

Comment on lines +88 to +89
* **Mass spectrometry acquisition stage:** metrics referring to scanning, detection, or data acquisition processes.
*Examples:* number of MS1 scans, duty-cycle stability.
Collaborator

We could split the mass spec stage into:

  • ionization stage (already exists)
  • fragmentation stage (e.g., for collision-energy-related or fragment-related things)
  • mass measurement stage (for mass-drift-related things etc.)
  • intensity measurement stage (for detector-related things)

Collaborator Author

For fragmentation: we already have a fragmentation quality metric in the analytical dimension, so introducing a separate fragmentation stage would overlap with that. Fragmentation is also not a standalone workflow stage like chromatography or ionization, but is an operation within MS2 acquisition, which already covers this to some extent.

The same applies to the proposed mass measurement stage: it overlaps with the mass accuracy metric in the analytical dimension and with the instrument calibration stage, blurring the line between what the metric measures and where in the workflow it happens.

Likewise, an intensity measurement stage would duplicate what is already captured through intensity stability and ionization quality metrics in the analytical dimension.

So overall, with the current analytical dimension structure, splitting the mass spectrometry acquisition stage further is probably not desirable. But this discussion indicates that it's indeed tricky to differentiate between the analytical dimension and the workflow stage.

Comment on lines +94 to +97
* **Instrument performance monitoring stage:** general metrics of instrument health and stability.
*Example:* mass-accuracy drift, spray stability.
* **Instrument calibration stage:** metrics derived from calibration routines or control samples.

Collaborator

I would leave those out, since they are too general and subsumed by the previous values.

Collaborator Author

Given the examples in the description, where would those be placed instead if these categories are removed?

For instrument performance metrics, such as "vacuum pressure stability" or "detector voltage drift" (see e.g. my iMonDB paper), I don't think these fit under any of the previous categories. Similarly, calibration-related metrics (e.g., "iRT calibration" or "mass calibration") are computed outside the normal acquisition workflow and, I think, don't map nicely to any of the previous workflow stages.

Comment on lines +76 to +77
**Experimental workflow stage**

Collaborator

Should we add a "not applicable" value here?
E.g., imagine a metric computing the coefficient of variation of the number of proteins across technical replicates.
What workflow stage is that?

Collaborator Author

@bittremieux bittremieux Dec 11, 2025

I don't think we need a "not applicable" category, at least for this example. Ultimately, this metric originates from the data analysis workflow stage, because the underlying protein identifications only exist after spectrum annotation and protein inference.

Introducing "not applicable" would weaken the structure and risk being used as a catch-all category. Every metric ultimately traces back to a specific stage, and here that stage is identification.
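For illustration, the metric from the example above can be sketched in a few lines; the replicate counts are made-up numbers, not real data:

```python
import statistics

def coefficient_of_variation(values):
    """CV = sample standard deviation / mean; often reported as a percentage."""
    return statistics.stdev(values) / statistics.mean(values)

# Made-up protein counts from four hypothetical technical replicates.
protein_counts = [5012, 4987, 5050, 4961]
cv = coefficient_of_variation(protein_counts)  # well below 1% variation here
```

Even though the CV aggregates across runs, its inputs (protein counts) only exist after protein inference, which is why the metric maps to the data analysis stage rather than needing a "not applicable" value.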


#### Subclasses

* **Spectrum level:** per-spectrum metrics (e.g., number of peaks, S/N ratio).
Collaborator

If we're doing spectrum level, we must also do IM level (and technically chromatography as well), which would reintroduce redundancy with the workflow stage.

Maybe more abstract:

  1. MS peak (subsumes a pixel in imaging)
  2. spectrum
  3. IM Frame
  4. Chromatogram
  5. run level
  6. batch level
  7. study level

Collaborator Author

For me, measurement scope should strictly capture aggregation granularity, not acquisition-specific constructs. Otherwise, we indeed start introducing overlap with the workflow stage or acquisition strategy.

Maybe we need to add a most basic signal element level that covers the smallest data unit (centroid/profile peak, imaging pixel/voxel, IM bin/drift-time bin). Then the spectrum level can be broadened to include all acquisition events: MS1/MS2 spectra, IM frames or mobility scans, and MALDI shots. And then the remaining feature, run, batch, and study levels stay unchanged.
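A sketch of that proposal as an ordered granularity scale; the identifiers are hypothetical, and the ordering is the point, not the names:

```python
from enum import IntEnum

# Hypothetical ordered granularity levels for the measurement scope
# dimension, from the smallest data unit up to the whole study.
class MeasurementScope(IntEnum):
    SIGNAL_ELEMENT = 1  # centroid/profile peak, imaging pixel/voxel, IM bin
    SPECTRUM = 2        # any acquisition event: MS1/MS2 spectrum, IM frame, MALDI shot
    FEATURE = 3
    RUN = 4
    BATCH = 5
    STUDY = 6

# The ordering makes aggregation direction checkable: an aggregate metric
# should report at a coarser scope than its inputs.
assert MeasurementScope.RUN > MeasurementScope.SPECTRUM
```

Keeping this dimension purely about aggregation granularity avoids smuggling acquisition-specific constructs (IM frames, chromatograms) back in as separate levels.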

Comment on lines +216 to +221
* **Context dependent:** interpretation varies depending on method or range.
*Example:* precursor charge-state fractions, peak density.
* **Target range:** optimal quality corresponds to values within a defined interval.
*Example:* temperature, pressure, retention-time drift.
* **Categorical:** quality expressed as discrete categories (e.g., pass/fail, OK/warning/error).
* **Trend:** metrics intended for temporal monitoring rather than direct ranking (e.g., instrument drift over time).
Collaborator

I think trend would fit into one of the other categories; e.g., instrument drift over time fits into "lower is better".
Are there any counterexamples where this is not the case?

Collaborator Author

How about something like instrument duty-cycle variation? The magnitude itself is not necessarily crucial, but rather the pattern (growing oscillation indicates degradation). So here the time-series aspect is relevant, rather than higher/lower/target range.

Although I agree that this is a pretty complex metric, and most trend metrics could fit into one of the other categories as well.
