Porting of Protein Activity and Pathway Inference to single cell and Python.

pyVIPER (VIPER Analysis in Python for single-cell RNASeq)


This package enables network-based protein activity estimation in Python. It also provides interfaces to scanpy (single-cell RNA-seq analysis in Python). Functions are partly ported from the R packages viper and NaRnEA.

The user-friendly documentation is available at: https://alevax.github.io/pyviper/index.html.


Dependencies

  • scanpy for the single-cell pipeline.
  • pandas and anndata for data computing and storage.
  • numpy and scipy for scientific computation.
  • joblib for parallel computing.
  • tqdm for progress bars.

If you are using a version of scanpy older than 1.9.3, it is also advisable to downgrade pandas to >=1.3.0,<2.0 because of a scanpy incompatibility (issue).
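The version constraint above can be checked programmatically. The following is a minimal sketch using only the standard library. The helper names `parse_version` and `pandas_downgrade_advised` are hypothetical and not part of pyviper, and the parser only handles plain numeric versions such as 1.9.1.

```python
def parse_version(text):
    """Turn a version string such as '1.9.3' into a tuple (1, 9, 3).

    Assumption: only the leading dot-separated numeric fields matter;
    suffixes such as 'rc1' or '.dev0' are ignored.
    """
    fields = []
    for field in text.split("."):
        digits = ""
        for ch in field:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        fields.append(int(digits))
    # Pad to three fields so that '1.3' compares equal to '1.3.0'
    fields += [0] * (3 - len(fields))
    return tuple(fields)


def pandas_downgrade_advised(scanpy_version, pandas_version):
    """True when scanpy < 1.9.3 and pandas is outside >=1.3.0,<2.0."""
    old_scanpy = parse_version(scanpy_version) < (1, 9, 3)
    pandas_ok = (1, 3, 0) <= parse_version(pandas_version) < (2, 0, 0)
    return old_scanpy and not pandas_ok
```

For the packages installed in the current environment, the version strings can be obtained with `importlib.metadata.version("scanpy")` and `importlib.metadata.version("pandas")`.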

Installation

PyPI

pip install viper-in-python

Local

git clone https://github.com/alevax/pyviper/
cd pyviper
pip install -e .

Usage

import anndata
import pyviper

# Load the sample gene expression signature
# (transposed so that rows are samples and columns are genes)
ges = anndata.read_text("test/unit_tests/test_1/test_1_inputs/LNCaPWT_gExpr_GES.tsv").T

# Load the network
network = pyviper.load.msigdb_regulon("h")

# Translate the sample data from Ensembl IDs to gene names
pyviper.pp.translate(ges, desired_format="human_symbol")

# Keep only the network targets that are present in the sample data
network.filter_targets(ges.var_names)

# Compute regulon activities with aREA
activity = pyviper.viper(gex_data=ges, interactome=network, enrichment="area")
print(activity.to_df())

# Compute regulon activities with NaRnEA
activity = pyviper.viper(gex_data=ges, interactome=network, enrichment="narnea", eset_filter=False)
print(activity.to_df())

Tutorials

  1. Analyzing scRNA-seq data at the Protein Activity Level
  2. Inferring Protein Activity from scRNA-seq data from multiple cell populations with the meta-VIPER approach
  3. Generating Metacells for ARACNe3 network generation and VIPER protein activity analysis

Structure and rationale

The main functions available from pyviper are:

  • pyviper.viper: the main function for Virtual Inference of Protein Activity by Enriched Regulon Analysis (VIPER). It supports two enrichment algorithms, aREA and (matrix-)NaRnEA (see below).
  • pyviper.aREA: computes aREA (analytic rank-based enrichment analysis) and meta-aREA.
  • pyviper.NaRnEA: computes matrix-NaRnEA, a vectorized implementation of NaRnEA.
  • pyviper.pp.translate: translates gene identifiers between species (mouse and human) and between Ensembl, Entrez and gene symbols.
  • pyviper.tl.path_enr: computes pathway enrichment.
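To give an intuition for the rank-based enrichment that these functions compute, here is a deliberately simplified sketch using only the standard library. It is not pyviper's implementation of aREA or NaRnEA: it keeps only a sign-aware weighted sum of rank-derived z-scores over a regulator's targets. The names `rank_to_z` and `toy_enrichment`, and the regulon layout (target mapped to a mode of regulation and a weight), are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def rank_to_z(signature):
    """Replace each gene's value by a quantile-normal score based on its rank."""
    ordered = sorted(signature, key=signature.get)  # lowest value first
    n = len(ordered)
    normal = NormalDist()
    # rank / (n + 1) keeps the quantiles strictly inside (0, 1)
    return {gene: normal.inv_cdf((rank + 1) / (n + 1))
            for rank, gene in enumerate(ordered)}


def toy_enrichment(signature, regulon):
    """Score one regulator against a gene expression signature.

    regulon maps target -> (mode_of_regulation, weight), where the mode is
    +1 for activated and -1 for repressed targets (toy layout, assumed here).
    """
    z = rank_to_z(signature)
    targets = [t for t in regulon if t in z]
    total = sum(regulon[t][1] for t in targets)
    if total == 0:
        return 0.0
    weights = {t: regulon[t][1] / total for t in targets}
    raw = sum(regulon[t][0] * weights[t] * z[t] for t in targets)
    # Dividing by the norm of the weights puts the score on a z-like scale
    # (only approximately, under idealized independence assumptions)
    return raw / sqrt(sum(w * w for w in weights.values()))


signature = {"g1": -2.0, "g2": -1.0, "g3": 0.0, "g4": 1.0, "g5": 2.0}
regulon = {"g5": (1, 1.0), "g4": (1, 1.0), "g1": (-1, 1.0)}
score = toy_enrichment(signature, regulon)
```

In this toy signature the activated targets sit at the top of the ranking and the repressed target at the bottom, so the score is positive. Flipping every mode of regulation flips its sign.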

Other notable functions include:

  • pyviper.tl.OncoMatch: computes OncoMatch, an algorithm that assesses the conservation of master regulator (MR) protein activity between two sets of samples (e.g. to validate GEMMs as effective models of human samples).
  • pyviper.pp.stouffer: computes signatures on a cluster-by-cluster basis using a cluster integration method for pathway enrichment.
  • pyviper.pp.viper_similarity: computes the similarity between VIPER signatures.
  • pyviper.pp.repr_metacells: computes representative metacells (e.g. for ARACNe) using our method to maximize unique sample usage and minimize resampling (users can specify depth, percent data usage, etc.).
  • pyviper.pp.repr_subsample: selects a representative subsample of data using our method to ensure a widely distributed sampling.
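As its name suggests, pyviper.pp.stouffer presumably builds on Stouffer's method, which combines k z-scores into one as their sum divided by the square root of k. The sketch below applies that formula per cluster and per feature with the standard library only. The function `stouffer_by_cluster` and its dictionary-based inputs are hypothetical and do not reflect the pyviper signature, which operates on AnnData objects.

```python
from collections import defaultdict
from math import sqrt


def stouffer(z_scores):
    """Combine k z-scores into one: sum(z) / sqrt(k)."""
    z_scores = list(z_scores)
    return sum(z_scores) / sqrt(len(z_scores))


def stouffer_by_cluster(samples, clusters):
    """Integrate z-scores within each cluster.

    samples:  sample name -> {feature: z-score}
    clusters: sample name -> cluster label
    Returns   cluster label -> {feature: integrated z-score}
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for sample, profile in samples.items():
        for feature, z in profile.items():
            grouped[clusters[sample]][feature].append(z)
    return {label: {feature: stouffer(values)
                    for feature, values in features.items()}
            for label, features in grouped.items()}


samples = {
    "cell1": {"TP53": 1.0, "MYC": -2.0},
    "cell2": {"TP53": 3.0, "MYC": -2.0},
    "cell3": {"TP53": -1.0, "MYC": 0.5},
}
clusters = {"cell1": "A", "cell2": "A", "cell3": "B"}
integrated = stouffer_by_cluster(samples, clusters)
```

Cluster "A" holds two cells, so its TP53 score is (1.0 + 3.0) / sqrt(2). Cluster "B" holds a single cell, so its scores are unchanged.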

Additionally, the following submodules are available:

  • pyviper.load: submodule containing several utility functions useful for different analyses, including load_msigdb_regulon, load_TFs, etc.
  • pyviper.pl: submodule containing pyviper wrappers for scanpy plotting.
  • pyviper.tl: submodule containing pyviper wrappers for scanpy data transformation.
  • pyviper.config: submodule allowing users to specify the current species and the file paths for regulators.

Finally, a new Interactome class allows users to load and interrogate ARACNe- and SCENIC-inferred gene regulatory networks.
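To illustrate the kind of object such a class wraps, the sketch below models a regulatory network as a list of regulator-target edges. `ToyInteractome` and `Edge` are hypothetical stand-ins, not pyviper's Interactome, and the edge fields (regulator, target, mode of regulation, likelihood) are an assumption about what a network table holds. The `filter_targets` method only mirrors the in-place call shown in the Usage section.

```python
from collections import namedtuple

# One regulatory interaction (field names are assumed for this toy example)
Edge = namedtuple("Edge", ["regulator", "target", "mor", "likelihood"])


class ToyInteractome:
    """Minimal in-memory network of regulator -> target edges."""

    def __init__(self, edges):
        self.edges = list(edges)

    def regulators(self):
        """Sorted names of all regulators that still have targets."""
        return sorted({e.regulator for e in self.edges})

    def targets_of(self, regulator):
        """Targets of one regulator, in their original order."""
        return [e.target for e in self.edges if e.regulator == regulator]

    def filter_targets(self, allowed):
        """Drop, in place, every edge whose target is not in `allowed`."""
        allowed = set(allowed)
        self.edges = [e for e in self.edges if e.target in allowed]


net = ToyInteractome([
    Edge("TF1", "g1", 1.0, 0.9),
    Edge("TF1", "g2", -1.0, 0.8),
    Edge("TF2", "g3", 1.0, 0.7),
])
# Keep only targets measured in a (hypothetical) expression matrix
net.filter_targets(["g1", "g3"])
```

After filtering, TF1 retains only g1 as a target, while TF2 is unaffected because g3 is among the measured genes.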

Contact

Please report any issues that you experience through this repository's "Issues" page.

For any other information or queries, please write to Alessandro Vasciaveo ([email protected]).

License

pyviper is distributed under the MIT License (see LICENSE).

Citation

If you use pyVIPER in your publication, please cite our work:

Wang, A.L.E., Lin, Z., Zanella, L., Vlahos, L., Girotto, M.A., Zafar, A., ... & Vasciaveo, A. (2024). pyVIPER: A fast and scalable Python package for rank-based enrichment analysis of single-cell RNASeq data. bioRxiv, 2024-08. doi: https://doi.org/10.1101/2024.08.25.609585.

Manuscript in review
