Unification and Benchmarking of Segmentation Methods for Spatial Transcriptomics

This repository includes the supplementary materials and code for the paper titled Unification and Benchmarking of Segmentation Methods for Spatial Transcriptomics. The research evaluates various methodologies in spatial transcriptomics, providing insights into their performance and offering recommendations for best practices in data analysis. Additionally, we present a Nextflow framework that serves as a baseline for future benchmarking efforts. This framework is highly adaptable and user-friendly, allowing for the seamless incorporation of new segmentation methods as they emerge in the field of spatial transcriptomics.

Table of contents:

Background
Methods
Datasets
Evaluation
Results
Installation
Usage
Contributing

Background

Spatial transcriptomics has emerged as a pivotal technique for understanding tissue architecture and cellular interactions. However, the rapid development of segmentation methods for spatial transcriptomics data calls for rigorous benchmarking to guide researchers in selecting appropriate tools for their studies. This work systematically evaluates several segmentation methods using a variety of performance metrics.

Methods

The following methodologies were benchmarked:

Watershed: The watershed segmentation method utilizes multi-class Otsu thresholding and peak detection to accurately delineate nuclei in spatial transcriptomics images, effectively distinguishing them from the background.
Cellpose: Cellpose employs a deep learning framework with a U-Net-like architecture to segment cells based on shape and internal structure, generating vector fields that refine segmentation results while improving image quality through noise reduction.
SCS: Subcellular Spatial Transcriptomics Cell Segmentation (SCS) integrates staining and transcriptomic data, utilizing a traditional watershed algorithm alongside a transformer neural network to accurately predict cellular relationships and enable detailed analyses of RNA localization.
Baysor: Baysor combines molecular position data with optional staining using a Bayesian mixture model and Markov Random Field approach, optimizing cell boundary delineation while maintaining spatial coherence and enhancing segmentation accuracy across various tissue conditions.
BIDCell: BIDCell features a self-supervised deep learning framework with a Bidirectional U-Net3+ architecture that leverages biological insights to accurately segment cells in subcellular spatial transcriptomics without the need for manual annotations.
SAM: The Segment Anything Model (SAM) utilizes a Vision Transformer to perform real-time, prompt-based segmentation across diverse tasks, expanding its training dataset through a cycle of model-assisted data annotation for enhanced robustness.
SAM2: SAM2 builds upon the original SAM framework, incorporating streaming memory and iterative prompting capabilities for effective video and image segmentation, allowing for real-time object tracking and improved accuracy across complex content.
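To make the thresholding step of the Watershed entry more concrete, here is a minimal sketch of Otsu thresholding in plain Python. It is a simplification: it implements the classic two-class variant on a flat list of 8-bit intensities, not the multi-class version described above, and it is not the code used in this repository. The function name `otsu_threshold` and its parameters are illustrative assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the intensity t that maximizes between-class variance.

    Simplified two-class Otsu (illustrative only). Pixels with a value
    greater than t are treated as foreground (e.g. nuclei), the rest as
    background.
    """
    # Build the intensity histogram.
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0  # running pixel count and intensity sum of the background
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        w_fg = total - w_bg
        if w_bg == 0:
            continue  # no background pixels yet
        if w_fg == 0:
            break  # all pixels are background from here on
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Toy example: a dark background and a brighter "nuclear" signal.
toy_pixels = [10] * 50 + [12] * 50 + [200] * 30 + [210] * 30
threshold = otsu_threshold(toy_pixels)
foreground_mask = [p > threshold for p in toy_pixels]
```

In a full watershed workflow, a mask like this would typically be followed by peak detection to seed one marker per nucleus before the watershed transform is applied.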

Datasets

The following datasets consist only of the formatted transcript and image files. For the original datasets, refer to the corresponding author references.

Dataset	Raw data link	Technology	scRNA-seq annotation reference data used	Download data
Brain	MOSTA	StereoSeq	Paper link	Transcripts and Image
Breast	10x Genomics	Xenium	Paper link	Transcripts and Image
Embryo	MOSTA	StereoSeq	Paper link	Transcripts and Image
Lung	Nanostring	CosMx	Paper link	Transcripts and Image
Pancreas	Nanostring	CosMx	Paper link	Transcripts and Image

Evaluation

The evaluation was performed on the five datasets listed above. The metrics were derived from those proposed by BIDCell. The code for the evaluation and for generating the visualizations is available in this repository.

Results

The findings highlight significant differences in performance across the evaluated methods, influencing the choice of method based on specific research questions and data characteristics. Detailed results, including comparisons and statistical analyses, are provided in the paper.

Installation

To run the code and reproduce the results, please ensure you have the following conda environments installed:

SCS: This conda environment is used for running SCS, Baysor and Watershed tools
Cellpose: This conda environment is used for running Cellpose
BIDCell: This conda environment is used for running BIDCell
Kernel: This conda environment is used for running the evaluation
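As a convenience, the presence of these environments can be checked programmatically. The sketch below assumes the conda environments are named exactly as in the list above (SCS, Cellpose, BIDCell, Kernel), which may not match your local setup. The helper names `missing_envs` and `list_conda_env_paths` are hypothetical. It relies on `conda env list --json`, which reports the installed environments as a list of paths under the `envs` key.

```python
import json
import subprocess
from pathlib import Path

# Assumption: environment names match the labels used in this README.
REQUIRED_ENVS = ["SCS", "Cellpose", "BIDCell", "Kernel"]


def missing_envs(env_paths, required=REQUIRED_ENVS):
    """Return the required environment names not found among env_paths."""
    # The last path component of each environment path is its name.
    installed = {Path(p).name for p in env_paths}
    return [name for name in required if name not in installed]


def list_conda_env_paths():
    """Ask conda for the paths of all installed environments."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)["envs"]


# Example (requires conda on PATH):
#     print(missing_envs(list_conda_env_paths()))
```

An empty list means all four environments were found. The comparison is case-sensitive.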

If you want to run the full Nextflow pipeline rather than a single segmentation method, make sure all of these environments are installed before launching it.

Usage

Each segmentation method can be run on its own with the tool-specific scripts provided in this repository.

The Nextflow pipeline provides a scalable and user-friendly framework for benchmarking segmentation methods. Expanding the pipeline is straightforward: create a new entry for the tool you wish to benchmark and integrate it into the Nextflow workflow.

Running the Nextflow pipeline is also simple. Just follow these steps:

Nextflow directory: Clone the Nextflow directory provided in this repository.
Change output directory: In the main.nf file, set the output directory where the results should be stored.
Add input data: Add the two required files, the transcripts and the image, to the input_data folder.
Run the Nextflow: Run the command $ nextflow run .../nextflow/main.nf

The pipeline will preprocess the data, generate the patches, and run the added segmentation tools.
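The steps above can also be wrapped in a small launcher script. The following is a minimal sketch, not part of the repository: the function names are hypothetical, the paths to main.nf and input_data must be supplied by you, and the input check only counts visible files because the expected file names are not stated here.

```python
import shutil
import subprocess
from pathlib import Path


def check_inputs(input_dir):
    """Return the visible files in input_dir; raise if fewer than two exist.

    The pipeline expects two inputs (transcripts and image). Hidden files
    such as .DS_Store are ignored.
    """
    files = sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and not p.name.startswith(".")
    )
    if len(files) < 2:
        raise FileNotFoundError(
            f"Expected a transcripts file and an image file in {input_dir}, "
            f"found {len(files)} file(s)."
        )
    return files


def build_nextflow_command(main_nf):
    """Build the `nextflow run` command for the given main.nf path."""
    return ["nextflow", "run", str(main_nf)]


def run_pipeline(main_nf, input_dir):
    """Validate the inputs, then launch the pipeline."""
    check_inputs(input_dir)
    if shutil.which("nextflow") is None:
        raise RuntimeError("nextflow was not found on PATH")
    return subprocess.run(build_nextflow_command(main_nf), check=True)
```

Calling `run_pipeline` with the location of your cloned main.nf and your input_data folder fails early with a readable message if an input file is missing or Nextflow is not installed.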

Contributing

Contributions to improve this repository are welcome! If you have suggestions or improvements, please open an issue or submit a pull request.
