RAPToR R package

RAPToR (Real Age Prediction from Transcriptome staging on Reference) is a tool to accurately predict the developmental age of individual samples from their gene expression profiles.

We stage samples on high-resolution references built from existing developmental profiling time-series. The inferred age can then be used to precisely estimate the effects of perturbations on developmental timing, increase power in differential expression analyses, estimate differential expression due to uncontrolled development and, most importantly, recover perturbation-specific effects on gene expression even in the extreme scenario where the perturbation is completely confounded with development.

Please cite our paper if you use RAPToR in your research:

Installation

To install the latest version of RAPToR, run the following in your R console:

if (!requireNamespace("remotes", quietly = TRUE))
    install.packages("remotes")
remotes::install_github("LBMC/RAPToR", build_vignettes = TRUE)

When dependencies are met, installation should take under 20 seconds.

Dependencies

Users can choose to install the RAPToR package dependencies manually from an R console:

# CRAN packages (parallel ships with base R and does not need installing)
install.packages(c("ica", "mgcv", "data.table", "pryr", "beeswarm", "Rdpack", "R.rsp"))

# Bioconductor packages
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("limma")

We also recommend installing the following packages, which the RAPToR vignettes use to download demo data:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("GEOquery", "biomaRt"))
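To see which of the packages listed above are still missing before installing anything, a small check along these lines may help. This is a minimal sketch: `missing_pkgs()` is a hypothetical helper written for this example, not a RAPToR function, and it relies only on base R's `requireNamespace()`.

```r
# Hypothetical helper (not part of RAPToR): return the packages from `pkgs`
# that cannot be loaded in the current R library.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Dependency names copied from the install commands above.
deps <- c("ica", "mgcv", "parallel", "data.table", "pryr", "beeswarm",
          "Rdpack", "R.rsp", "limma")
missing_pkgs(deps)
```

An empty result means every listed dependency is already available.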

System requirements

We have verified RAPToR works with R v3.6.3, v4.1.1, and v4.1.2 on Unix (Ubuntu 18.04/20.04/22.04 LTS), Windows 10, and macOS (10.14) systems.

Standard datasets run comfortably with 4 GB of RAM and 2 CPU cores. For reference, the GSE80157 (dsaeschimann2017) dataset used as a demo in the main vignette (43 samples by ~19500 genes) can be both downloaded and staged with RAPToR in under 30 seconds, using less than 2 GB of RAM.

Getting started

You can access the package's main vignette from your R console with

library(RAPToR)

vignette("RAPToR")
# or
vignette("RAPToR-pdf")

How does it work?

RAPToR is a 2-step process:

  1. A reference gene expression time-series is interpolated to build a near-continuous, high-temporal-resolution reference (a number of which are included in associated data-packages, see below).
  2. A correlation profile of each of your samples against this reference is computed, and the timing of the correlation peak is the estimated age. Bootstrapping on genes then gives a confidence interval for the estimate.

[Figure: tool overview]
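The two steps can be illustrated with a toy example in base R. This is a sketch under stated assumptions, not RAPToR's implementation: the data are simulated, per-gene spline interpolation stands in for the package's reference-building models, and the percentile interval is one simple way to summarise bootstrap estimates. The `stage()` helper and all variable names are hypothetical.

```r
set.seed(1)

# Simulated reference: 200 genes profiled at 10 time points (made-up data).
n_genes  <- 200
ref_time <- seq(0, 9, by = 1)
phase    <- runif(n_genes, 0, 2 * pi)   # gene-specific dynamics
base     <- rnorm(n_genes, mean = 5)    # gene-specific baseline level
ref <- sapply(ref_time, function(t) base + sin(t / 3 + phase))  # genes x time

# Step 1: interpolate each gene on a fine time grid
# (assumption: splines replace RAPToR's own interpolation models).
fine_time <- seq(0, 9, length.out = 500)
interp <- t(apply(ref, 1, function(g) spline(ref_time, g, xout = fine_time)$y))

# Simulated sample of known age, with measurement noise.
true_age <- 4.3
samp <- base + sin(true_age / 3 + phase) + rnorm(n_genes, sd = 0.1)

# Step 2: correlate the sample with every interpolated time point;
# the time of the correlation peak is the age estimate.
stage <- function(samp, interp, fine_time) {
  cors <- cor(samp, interp, method = "spearman")
  fine_time[which.max(cors)]
}
est <- stage(samp, interp, fine_time)

# Bootstrap on genes: resample genes with replacement and re-estimate.
boot <- replicate(30, {
  idx <- sample(n_genes, replace = TRUE)
  stage(samp[idx], interp[idx, , drop = FALSE], fine_time)
})
ci <- quantile(boot, c(0.025, 0.975))
```

Here `est` holds the estimated age and `ci` a bootstrap interval around it. With real data, use the package's own functions as described in the main vignette.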

What data can be used?

RAPToR estimates the developmental age of individual samples from their gene expression profiles, so any method that measures gene expression on a large scale is appropriate: RNA-seq (preferably TPM), microarray, etc.

Data must not be gene-centered, as this destroys the relationship between gene levels within a sample.
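A small made-up example shows why gene-centering is a problem. In the sketch below (toy values, base R only), one gene is always more highly expressed than the other within each sample, but that ordering no longer holds once each gene's mean across samples is subtracted.

```r
# Toy expression matrix (made-up values): 2 genes x 3 samples.
X <- rbind(geneA = c(100, 110, 120),
           geneB = c( 10,   5,  15))
colnames(X) <- c("s1", "s2", "s3")

# Within-sample ranks of the raw data: geneA ranks above geneB everywhere.
rank_raw <- apply(X, 2, rank)

# Gene-centering: subtract each gene's mean across samples.
X_centered <- sweep(X, 1, rowMeans(X))

# Within-sample ranks after centering: the ordering flips in sample s1.
rank_centered <- apply(X_centered, 2, rank)

rank_raw
rank_centered
```

Since staging compares gene levels within a sample to the reference, centered data no longer carries the information the comparison needs.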

Currently available data-packages

We recommend you get our data-packages with pre-built references of common organisms for quick & easy usage.




Update info

v1.2

Please also update the reference data-packages to their latest version so that they work with this version of RAPToR.

  • Introduced a reference (ref) object and corresponding make_ref() and print functions. ref objects:
    • make reference-building more direct, from a geim object,
    • simplify the age estimation call (now simply ae(samp, ref)),
    • store reference metadata (such as the reference time unit) for use in subsequent plotting/printing.
  • Updated ae object printing to include reference metadata when available.
  • Optimized ae bootstrap correlation (>2x faster).
  • Rewrote ae plotting function to
    • display bootstrap estimates by default,
    • include reference time units when available,
    • fix ae plotting graphics bug (larger first plot with overlaid and missing elements) when displaying multiple plots side by side,
    • add sample label control (truncate, adjust margin space),
    • cover more parameter edge cases.
  • Updated geim printing to include reference metadata when available.
  • Added functions to compare log-fold-changes between sample groups to a reference and quantify the impact of development on differential expression analysis.
    • ref_compare() gets matching reference time points to the samples and compares logFCs between given sample groups, and between matching reference time points (giving an estimate of development logFCs between groups).
    • get_logFC() extracts sample and reference logFCs between specified groups from the output of ref_compare().
  • Renamed plot_cor.ae() to plot_cor().
  • Removed deprecated plotting function for df_CV.
  • Updated vignette documentation:
    • Added a vignette on correcting DE analysis for development: vignette("RAPToR-DEcorrection").
    • Updated vignette sections related to reference-building with the new objects.
    • Added a reference-building section and example for aging references.
    • Re-formatted vignettes with Bioconductor style.
    • Added PDF versions of all vignettes (index entries with -pdf, e.g. vignette("RAPToR-pdf")).

v1.1

v1.1.6 (used in Bulteau & Francesconi 2022 publication)

  • Revised the main vignette.
  • Updated README info.
  • Included biocViews in DESCRIPTION to automatically install Bioconductor dependencies (thanks @helenmiller16).
  • Removed deprecated pls-dependent functions.
  • Fixed edge cases for 1-component reference building.

v1.1.5b

  • Added software, hardware, and runtime info to README.
  • Added installation instructions for dependencies.

v1.1.5

  • Added preprint citation to README and documentation.
  • Updated showcase vignettes relevant to the paper analyses.

Prior updates can be found in the NEWS file.