Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
297 changes: 297 additions & 0 deletions docs/pages/cv/classification_reference.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,297 @@
---
layout: page
title: "Metrics – Classification Reference"
permalink: /metrics/classification/
---

*Standardized semantic categories for PSI-MS quality control metrics.*

## Overview

Each QC metric in the [PSI-MS Controlled Vocabulary (CV)](https://github.com/HUPO-PSI/psi-ms-CV) is annotated using **seven independent classification dimensions**.
First, the **analytical dimension** defines *what kind of metric it is* and is encoded as **inheritance** using `is_a`.
The other six are **typed relationships** that describe *where the metric applies, what it depends on, how to interpret it,* and *how to serialize it in mzQC*.

**At a glance**

| Dimension | Encoded as | Purpose |
| ------------------------------- | -------------- | ------------------------------------------------------------------- |
| **Analytical dimension** | `is_a` | Defines the metric subtype (what kind of QC metric it *is*) |
| **Workflow stage** | relationship | Where in the experimental/computational pipeline the metric applies |
| **Information dependency type** | relationship | What type of input data the metric needs |
| **Measurement scope** | relationship | At what aggregation level the metric summarizes data |
| **Acquisition strategy** | relationship | Which acquisition/mode or platform it applies to |
| **Quality interpretation type** | relationship | How to interpret higher/lower/targeted values |
| **Metric value type** | relationship | How the values are structurally represented (single, tuple, etc.) |

**Rule of thumb:**
Every QC metric has exactly one `is_a` (analytical dimension) and one value from each of the other six relationship dimensions.

## Part 1 — Inheritance: Analytical dimension

**What it is:**
The analytical dimension defines the *type of QC metric*.
This is the only dimension expressed via **inheritance** (`is_a`) because it establishes the metric's place in the taxonomy.

**How to use it:**
Choose exactly one of the following as the metric's `is_a` parent:

#### Subclasses

- **Acquisition coverage metric:** how comprehensively data were collected (e.g., scan counts, sampling density).
- **Mass accuracy metric:** deviation between observed and theoretical _m_/_z_.
- **Intensity stability metric:** variation of signal intensity over time.
- **Chromatographic performance metric:** separation performance (e.g., eak width, symmetry, RT reproducibility).
- **Ionization quality metric:** properties of the precursor ion population (e.g., charge-state distribution, adduct prevalence).
- **Ion mobility metric:** IMS resolution, drift-time/CCS accuracy and reproducibility.
- **Spectral quality metric:** quality of individual spectra (e.g., peak density, S/N, completeness).
- **Fragmentation efficiency metric:** effectiveness of precursor ion fragmentation to produce interpretable spectra.
- **Isolation purity metric:** precursor isolation selectivity or co-isolation of interfering species.
- **Identification confidence metric:** reliability of identifications (e.g., FDR, ID rate).
- **Quantification precision metric:** reproducibility or variability of quantitative results.
- **Contamination metric:** unwanted signal from contaminants, carryover, or background.
- **Instrument operational performance metric:** general indicators of instrument health (e.g., vacuum, detector voltage, temperature).
- **Missingness/completeness metric:** data absence or completeness across features, runs, or studies.
Comment on lines +41 to +54
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has quite considerable overlap with the workflow stage and it's potentially a dimension which is hard to cover completely, since to me it's quite fuzzy. This may lead to people cramming their metric into something which "fit's good enough" instead of thinking of a new category value.

What could be a set of values which is

  1. easy to categorize
  2. does not need extension in the future
  3. adds value compared to the other dimensions

, or in other words: what could this dimension represent, making it distinct from others?
(so dimensions are ideally orthogonal)

Copy link
Collaborator Author

@bittremieux bittremieux Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good point that there is significant overlap with the workflow stage. At the same time, I think that a metric subtype is still very useful and, for many users, the most intuitive entry point. It's probably not possible to be perfectly orthogonal to workflow stage.

However, we might want to reinterpret the analytical dimension so that it captures the fundamental quality phenomenon the metric measures, which could be high-level quality constructs from statistics such as: accuracy, precision, completeness, stability, etc. This would describe what kind of quality issue is being measured, not where it appears.


#### CV example (inheritance only)

```obo
is_a: MS:4000XXX ! chromatographic performance metric
```

## Part 2 — Typed relationships

The remaining six dimensions are not types; they are **properties** of a metric.
They must be encoded using the specified **relationship** predicates (one value per dimension).

### 1. Workflow stage — `part_of_workflow_stage`

**Definition:**
The experimental or computational stage of the workflow to which a QC metric applies.

This tells *where* in the process the metric is relevant — from sample preparation through acquisition and analysis.

#### Subclasses

**Experimental workflow stage**

Comment on lines +76 to +77
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a not applicable value here?
e.g. imagine a metric computing the coefficient of variation on the number of proteins across technical replicates.
What workflow stage is that?

Copy link
Collaborator Author

@bittremieux bittremieux Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need a not applicable category, at least for this example. Ultimately, this metric originates from the data analysis workflow stage, because the underlying protein identifications only exist after spectrum annotation and protein inference.

Introducing not applicable would weaken the structure and risk being used as a catch-all category. Every metric ultimately traces back to a specific stage, and here that stage is identification.

Metrics describing quality at the laboratory or instrument level.

* **Sample preparation stage:** metrics describing sample handling, labeling, digestion, or storage quality.
*Example:* peptide recovery yield.
* **Chromatography stage:** metrics about LC separation performance.
*Example:* retention-time reproducibility, peak width.
Comment on lines +82 to +83
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one could add Direct injection stage

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it needed? Does the absence of a certain workflow stage mandate explicit specification? I.e. is there a metric that exclusively depends on it being a direct injection experiment, or would metrics for such a experiment rather fall in some of the subsequent stages?

The chromatography stage needs to be made more general though, it should also be GC, not only LC.

* **Ionization stage:** metrics about ion generation and charge distribution.
*Example:* precursor charge-state fractions.
* **Ion mobility separation stage:** metrics describing the performance of gas-phase separation devices.
*Example:* ion-mobility resolution, CCS reproducibility.
* **Mass spectrometry acquisition stage:** metrics referring to scanning, detection, or data acquisition processes.
*Examples:* number of MS1 scans, duty-cycle stability.
Comment on lines +88 to +89
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could split mass spec stage into
ionization (already exists)
fragmentation stage (e.g. for collision energy related things, or fragment related things)
mass measurement stage (for mass drift related things etc)
intensity measurement stage (for detector related things)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For fragmentation: we already have a fragmentation quality metric in the analytical dimension, so introducing a separate fragmentation stage would overlap with that. Fragmentation is also not a standalone workflow stage like chromatography or ionization, but is an operation within MS2 acquisition, which already covers this to some extent.

The same applies to the proposed mass measurement stage: it overlaps with the mass accuracy metric analytical dimensions and with the instrument calibration stage, blurring the line between what the metric measures versus where in the workflow it happens.

Likewise, an intensity measurement stage would duplicate what is already captured through intensity stability and ionization quality metrics in the analytical dimension.

So overall, with the current analytical dimension structure, splitting the mass spectrometry acquisition stage further is probably not desirable. But this discussion indicates that it's indeed tricky to differentiate between the analytical dimension and the workflow stage.


* **MS1 acquisition stage:** metrics that use or summarize MS1 data.
* **MS2 acquisition stage:** metrics that use or summarize MS2 data.
* **MSn acquisition stage:** metrics for higher-order fragmentation (MS³, etc.).
* **Instrument performance monitoring stage:** general metrics of instrument health and stability.
*Example:* mass-accuracy drift, spray stability.
* **Instrument calibration stage:** metrics derived from calibration routines or control samples.

Comment on lines +94 to +97
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave those out, since they are too general and subsumed by previous values

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the examples given in the description, where would those be placed instead when the categories are removed?

For instrument performance metrics, such as "vacuum pressure stability" or "detector voltage drift" (see e.g. my iMonDB paper), I don't think this fits under any of the previous categories. Similarly, calibration related metrics (e.g. "iRT calibration" or "mass calibration") are computed outside the normal acquisition workflow and don't nicely map to any of the previous workflow stages I think.

**Data analysis workflow stage**

Metrics evaluating computational processing and interpretation steps.

* **Data preprocessing stage:** metrics about baseline correction, noise removal, or peak picking.
* **Identification stage:** metrics assessing identification quality.
*Example:* PSM-level FDR, peptide identification rate.
* **Quantification stage:** metrics describing quantitative accuracy or precision.
*Example:* CV of peptide intensities, ratio reproducibility.
* **Integration stage:** metrics related to alignment, normalization, or data integration across runs.

**Environmental condition monitoring**

Metrics about environmental conditions that can indirectly affect results.
*Example:* laboratory temperature, humidity, power fluctuations.

#### CV example

```obo
relationship: part_of_workflow_stage MS:4000XXX ! chromatography stage
```

### 2. Information dependency type — `depends_on_data_type`

**Definition:**
Specifies which type of data input the metric requires to be computed.

#### Subclasses

* **Raw acquisition data:** metrics that can be calculated directly from the raw MS data, without identifications.
*Example:* total ion current stability, scan count.
* **Deconvoluted data:** metrics based on processed spectra or peak lists obtained after signal deconvolution, centroiding, or deisotoping, but prior to identification.
*Example:* peak density in deconvoluted spectra, precursor mass range coverage.
* **Identification results:** metrics that depend on identified peptides, compounds, or spectra.
*Example:* PSM-level FDR, peptide coverage.
* **Quantification results:** metrics derived from quantitative data matrices.
*Example:* CV of peptide intensities.
* **Hybrid:** metrics combining multiple data types (e.g., identification and quantification).
* **Reference data:** metrics requiring comparison to external standards or reference files.
*Example:* RT deviation vs. iRT peptides, calibration QC.*

#### CV example

```obo
relationship: depends_on_data_type MS:4000XXX ! raw acquisition data
```

### 3. Measurement scope — `has_measurement_scope`

**Definition:**
Indicates the level of data aggregation the metric summarizes.

#### Subclasses

* **Spectrum level:** per-spectrum metrics (e.g., number of peaks, S/N ratio).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're doing spectrum level, we must also do IM level (and technically chromatography as well) -- which would introduce redundancy with workflow stage again.

Maybe more abstract:

  1. MS peak (subsumes a pixel in imaging)
  2. spectrum
  3. IM Frame
  4. Chromatogram
  5. run level
  6. batch level
  7. study level

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me measurement scope should strictly capture aggregation granularity, not acquisition-specific constructs. Otherwise we indeed start introducing overlap with the workflow stage or acquisition strategy.

Maybe we need to add a most basic signal element level that covers the smallest data unit (centroid/profile peak, imaging pixel/voxel, IM bin/drift-time bin). Then the spectrum level can be broadened to include all acquisition events: MS1/MS2 spectra, IM frames or mobility scans, and MALDI shots. And then the remaining feature, run, batch, and study levels stay unchanged.

* **Pixel/voxel level:** per-pixel metrics in imaging or spatial omics.
* **Feature level:** per feature (e.g., peptide, compound, or chromatographic peak).
* **Run level:** aggregated per LC–MS run.
* **Batch level:** aggregated across multiple related runs.
* **Study level:** aggregated across an entire experiment or project.

#### CV example

```obo
relationship: has_measurement_scope MS:4000XXX ! run level
```

### 4. Acquisition strategy — `applies_to_acquisition_mode`

**Definition:**
Specifies which acquisition mode or instrument configuration the metric is relevant for.

#### Subclasses

**Acquisition mode**

* **Acquisition mode independent:** metrics valid for any acquisition method.
* **Data-dependent acquisition (DDA):** metrics specific to stochastic precursor selection workflows.
*Example:* number of MS2 spectra per precursor.
* **Data-independent acquisition (DIA):** metrics for window-based fragmentation strategies.
*Example:* precursor window purity.
* **Targeted acquisition:** metrics for SRM, PRM, or other targeted workflows.
*Example:* transition reproducibility.
* **Ion-mobility-coupled metric:** metrics derived from acquisition methods that include gas-phase ion mobility separation.
*Example:* TIMS mobility resolution (Δ1/K₀) per run.
Comment on lines +181 to +182
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if an explicit Ion-mobility value is needed/helpful, because it is orthogonal to DIA vs DDA,
and to be complete, we'd also need LC coupled metric (since this also adds another dimension, just as IM does) vs. direct injection.
We are mixing concepts here IMHO.
Aquisition mode (referring to the MassSpec itself) may just be

  1. any
  2. DDA
  3. DIA
  4. Targeted

other things like: LC, IM, Imaging are a different thing??
Even the specialized MSn is tricky, since TMT can be on MS3 level, yet still is DDA.

Now, one could just add multiple values for this dimension, but maybe we should push concepts like LC, IM etc into workflow stage? (where it would be duplicated anyways?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this mixes several orthogonal concepts: precursor selection logic (DDA/DIA/targeted), physical separation (LC, IM), and acquisition strategies (imaging, MSn)

Maybe a more coherent approach is to redefine this dimension strictly as "how ions are selected for fragmentation or measurement." In other words, capturing only the precursor selection logic of the mass spectrometer, independent of chromatography, IM, imaging, etc.

In that case, we can simplify the subclasses to:

  • Acquisition-mode independent
  • Data-dependent acquisition (DDA)
  • Data-independent acquisition (DIA)
  • Targeted acquisition

* **Imaging acquisition:** metrics for spatially resolved mass spectrometry experiments such as MALDI, DESI, or SIMS.
*Example:* pixel-to-pixel intensity variation across a tissue section.
* **Other specialized mode:** metrics for advanced or hybrid acquisition modes such as BoxCar, MSⁿ, or multiplexed scanning.
*Example:* BoxCar intensity uniformity across boxes.

**Instrument platform specificity**

* **Orbitrap-specific:** metrics only applicable to Orbitrap instruments.
*Example:* Orbitrap transient length stability.
* **TOF-specific:** metrics relevant to time-of-flight instruments.
*Example:* TOF detector voltage stability.
* **Ion-trap-specific:** metrics specific to trap-based systems.
*Example:* Ion trap fill time distribution.
* **Other platform-specific:** for quadrupoles, FT-ICR, or hybrid systems.

#### CV example

```obo
relationship: applies_to_acquisition_mode MS:4000XXX ! acquisition mode independent
```

### 5. Quality interpretation type — `has_quality_directionality`

**Definition:**
Describes how a metric's numeric value relates to overall quality.
This enables automatic reasoning about whether "higher," "lower," or "targeted" values represent better data.

#### Subclasses

* **Higher is better:** increasing values indicate improved quality.
*Example:* identification rate, mass accuracy score.
* **Lower is better:** decreasing values indicate improved quality.
*Example:* FDR, mass error.
* **Context dependent:** interpretation varies depending on method or range.
*Example:* precursor charge-state fractions, peak density.
* **Target range:** optimal quality corresponds to values within a defined interval.
*Example:* temperature, pressure, retention-time drift.
* **Categorical:** quality expressed as discrete categories (e.g., pass/fail, OK/warning/error).
* **Trend:** metrics intended for temporal monitoring rather than direct ranking (e.g., instrument drift over time).
Comment on lines +216 to +221
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find trend would fit into one of the other categories, e.g. instrument drift over time fits into lower is better.
Are there any counter examples where this is not the case?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something like instrument duty cycle variation? The magnitude itself is not necessarily crucial, but rather the pattern (growing oscillation indicates degradation). So here the time series aspect is relevant, rather than higher/lower/target range.

Although I agree that this is a pretty complex metric, and most can fit in any of the other categories as well.


#### CV example

```obo
relationship: has_quality_directionality MS:4000XXX ! lower is better
```

### 6. Metric value type — `has_value_type`

**Definition:**
Specifies the structural format of the metric's reported value(s).
This defines how the metric must be represented in mzQC.

#### Subclasses

| Type | Structure | Description | Example |
| ---------------- | ----------------- | ------------------------------------------------------------- | ----------------------------- |
| **Single value** | Scalar | A single numeric or categorical value. | Number of MS1 spectra |
| **Tuple** | Ordered list | Several ordered values of the same kind (e.g., quantiles). | XIC-FWHM quantiles |
| **Table** | Named columns | Parallel lists of equal length; each column has its own unit. | MS2 charge fractions |
| **Matrix** | Rectangular array | 2D array of homogeneous numeric values. | Ion-mobility intensity matrix |

See the [CV Term Usage Guide](../use/) for details on how each type is encoded in mzQC.

---

## Worked examples

### XIC-FWHM quantiles (tuple)

* **Analytical dimension (`is_a`)**: *chromatographic performance metric*
* **Workflow**: *chromatography stage*
* **Data type**: *raw acquisition data*
* **Scope**: *run level*
* **Acquisition**: *mode independent*
* **Directionality**: *lower is better*
* **Value type**: *tuple*

```obo
is_a: MS:4000XXX ! chromatographic performance metric
relationship: part_of_workflow_stage MS:4000XXX ! chromatography stage
relationship: depends_on_data_type MS:4000XXX ! raw acquisition data
relationship: has_measurement_scope MS:4000XXX ! run level
relationship: applies_to_acquisition_mode MS:4000XXX ! acquisition mode independent
relationship: has_quality_directionality MS:4000XXX ! lower is better
relationship: has_value_type MS:4000XXX ! tuple
```

### MS2 known precursor charge fractions (table)

* **Analytical dimension (`is_a`)**: *ionization quality metric*
* **Workflow**: *ionization stage*
* **Data type**: *raw acquisition data*
* **Scope**: *run level*
* **Acquisition**: *mode independent*
* **Directionality**: *context dependent*
* **Value type**: *table*

```obo
is_a: MS:4000XXX ! ionization quality metric
relationship: part_of_workflow_stage MS:4000XXX ! ionization stage
relationship: depends_on_data_type MS:4000XXX ! raw acquisition data
relationship: has_measurement_scope MS:4000XXX ! run level
relationship: applies_to_acquisition_mode MS:4000XXX ! acquisition mode independent
relationship: has_quality_directionality MS:4000XXX ! context dependent
relationship: has_value_type MS:4000XXX ! table
```

## Summary

* Use **`is_a`** for the **analytical dimension** (the metric's type).
* Use **typed relationships** (one each) for the six **orthogonal facets**: workflow stage, data dependency, scope, acquisition strategy, quality interpretation, and value type.

These relationships together provide a complete, machine-readable semantic description of any QC metric.

For how to serialize each **metric value type** in mzQC (single, tuple, table, matrix), see the **[CV Term Usage Guide](../use)**.
Loading
Loading