Inferring the somatic evolution of stem cell mutations in plants
- Mutations originating in stem cells at the shoot apex often become fixed in large sectors of the plant body due to cell lineage drift during repeated branching.
- Inferring the somatic evolution of such mutations requires knowledge of the effective stem cell population size, the cellular bottleneck strength during branch initiation, and the mutation rate.
- This repository provides statistical tools to estimate these parameters directly from cell-layer-specific DNA sequencing data.
Note: Details regarding the biological framework and theoretical model can be found in Johannes, bioRxiv https://doi.org/10.1101/2025.01.13.632685.
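To give a feel for how the three quantities listed above interact, here is a minimal toy simulation in R. It is a sketch under stated assumptions, not the model from the paper or the code in this repository: the function names `simulate_lineage` and `sfs`, the parameters `N`, `k`, `mu`, and `branchings`, and the sampling scheme (random founders followed by regrowth with replacement) are all hypothetical choices made for illustration.

```r
# Toy model (an illustrative assumption, not the samSFS model):
# a meristem of N stem cells passes through a bottleneck of k founder
# cells at every branching event and then regrows to N cells.
simulate_lineage <- function(N = 10, k = 3, mu = 0.5, branchings = 20) {
  # Rows are cells, columns are mutations (TRUE = cell carries the mutation).
  geno <- matrix(logical(0), nrow = N, ncol = 0)
  for (b in seq_len(branchings)) {
    # New mutations: a Poisson number, each arising in one random cell.
    n_new <- rpois(1, mu * N)
    if (n_new > 0) {
      new_cols <- matrix(FALSE, nrow = N, ncol = n_new)
      hit <- cbind(sample.int(N, n_new, replace = TRUE), seq_len(n_new))
      new_cols[hit] <- TRUE
      geno <- cbind(geno, new_cols)
    }
    # Bottleneck: pick k founders, then regrow to N cells by resampling them.
    founders <- sample.int(N, k)
    geno <- geno[founders[sample.int(k, N, replace = TRUE)], , drop = FALSE]
  }
  geno
}

# Site frequency spectrum: number of mutations carried by 1, 2, ..., N cells.
# Mutations that were lost (carried by 0 cells) are not counted.
sfs <- function(geno) {
  counts <- colSums(geno)
  tabulate(counts[counts > 0], nbins = nrow(geno))
}

set.seed(1)
g <- simulate_lineage()
spectrum <- sfs(g)
print(spectrum)
```

In this toy setting, a smaller `k` drives mutations to loss or fixation faster, while a larger `mu` adds more low-frequency mutations per branching event.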
To get started, clone the repository to your project folder:
git clone https://github.com/jlab-code/samSFS.git
cd samSFS
This project is structured to work cleanly as an RStudio Project:
- Open RStudio.
- Go to File → Open Project...
- Select the `samSFS/` folder.
- This ensures your working directory is set correctly and scripts will run without modification.
You can also create a `.Rproj` file in the repo for convenience.
If you're not using RStudio, load the code manually in R:
# Load all core functions (only files ending in .R or .r)
files <- list.files("R", pattern = "\\.[Rr]$", full.names = TRUE)
invisible(lapply(files, source))
# Load C++ backend (requires the Rcpp package)
Rcpp::sourceCpp("src/simulate_processes.cpp")
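Manual loading only works when R is started in the repository root and Rcpp is installed. The following is a minimal sketch of a check you could run first. The helper `missing_repo_entries` is hypothetical and not part of samSFS; the expected folder names are taken from the repository layout described in this README.

```r
# Hypothetical helper (not part of samSFS): report which expected
# top-level folders are missing from a directory.
missing_repo_entries <- function(path = ".",
                                 expected = c("R", "src", "data", "demo")) {
  expected[!file.exists(file.path(path, expected))]
}

missing <- missing_repo_entries()
if (length(missing) > 0) {
  message("Not in the samSFS root? Missing: ", paste(missing, collapse = ", "))
}

# Rcpp is needed for Rcpp::sourceCpp(); this only reports, it does not install.
if (!requireNamespace("Rcpp", quietly = TRUE)) {
  message("Rcpp is not installed; install.packages(\"Rcpp\") should provide it.")
}
```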
The repository is organized as follows:
samSFS/
├── R/          # Core R functions (modular)
├── src/        # C++ simulation engine
├── data/       # Input data (includes example data from Goel et al. 2024)
├── results/    # Output folders created by scripts
├── demo/       # Demo scripts for real and simulated analyses
└── README.md   # Project documentation
To run the full pipeline on real observed data in RStudio:
- Clone the repo.
- Go to File → Open Project... and select the `samSFS/` folder.
- Open `demo/test_real_data.R` and run the code line-by-line to understand each step.
- Output will be saved to `results/real_data_results/`.
Alternatively, run the script from the command line:
cd samSFS
Rscript demo/test_real_data.R
To run the full simulation and inference pipeline in RStudio:
- Clone the repo.
- Go to File → Open Project... and select the `samSFS/` folder.
- Open `demo/test_simulations.R` and step through each line to understand the simulation and inference steps.
- Output will be saved to `results/test_simulation_output/`.
Alternatively, run the script from the command line:
cd samSFS
Rscript demo/test_simulations.R
- All demo scripts assume you run them from the root `samSFS/` folder.
- Each script sets thread usage, simulation parameters, and output paths explicitly.
- Results folders are automatically created during analysis.
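Because the demo scripts rely on paths relative to the repository root, one way to run them from an R session started elsewhere is to switch the working directory temporarily. This is a minimal sketch: `run_from_root` is a hypothetical helper, not a samSFS function, and the clone location in the example call is an assumption.

```r
# Hypothetical helper (not part of samSFS): source a script with the
# working directory temporarily set to the repository root, then list
# whatever files exist under results/.
run_from_root <- function(repo_root, script) {
  old_wd <- setwd(repo_root)          # setwd() returns the previous directory
  on.exit(setwd(old_wd), add = TRUE)  # restore it even if the script fails
  source(script)
  invisible(list.files("results", recursive = TRUE))
}

# Example call; the path to the clone is an assumption:
# outputs <- run_from_root("~/projects/samSFS", "demo/test_real_data.R")
```

The returned file list is only a convenience for checking that a run produced output; it makes no assumption about the file names or formats the demo scripts write.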
For questions or collaboration, reach out via GitHub Issues or open a pull request.