Characterizing the extracellular matrix transcriptome of endometriosis

https://link.springer.com/article/10.1007/s43032-023-01359-w

Setup

Prerequisites

Jupyter Notebook
Python 3.7+
R 4.2+

1. Install dependencies (R and Python packages)

The following R packages are installed automatically by the script, install_r_packages.r:

affy
sva
readr
dplyr
Biobase
BiocGenerics
BiocParallel
genefilter
hgu133plus2cdf
jsonlite
org.Hs.eg.db
stringi
tibble
limma
yaml
ggrepel
devtools
IRkernel
clusterProfiler

Some of the listed R packages may require additional system dependencies.

If R is set up to install packages system-wide (rather than to a personal user library), either run the install script as an administrator/superuser or install the packages listed above manually. Note that IRkernel is installed with devtools::install_github('IRkernel/IRkernel').

Setup: Run the following commands at the command line:

git clone https://github.com/fogg-lab/characterizing-ecm-transcriptome-of-endometriosis.git
cd characterizing-ecm-transcriptome-of-endometriosis
pip install -r requirements.txt
Rscript install_r_packages.r
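Before running the commands above, it can help to confirm that the prerequisites are actually reachable from the shell. The following is a minimal sketch and not part of the repository: the function name missing_prerequisites and the list of executable names are assumptions made for illustration. It only checks the Python version and whether each tool is on PATH; it does not verify the R 4.2+ requirement.

```python
import shutil
import sys

MIN_PYTHON = (3, 7)  # taken from the Prerequisites list
# Executable names are assumptions; adjust them for your platform.
REQUIRED_TOOLS = ("git", "pip", "Rscript", "jupyter")


def missing_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python 3.7+ is required")
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append(f"'{tool}' was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

If the script prints nothing, the four tools were found and the interpreter is new enough; the R version still has to be checked separately, for example with Rscript --version.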

2. Prepare data for analysis

Run the Jupyter notebook, data_prep/prep.ipynb
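The notebooks in this workflow can also be executed without opening the Jupyter interface, which is convenient on a remote machine. The sketch below is one possible approach, assuming nbconvert is installed alongside Jupyter; the helper names nbconvert_command and run_notebook are hypothetical and do not come from the repository. nbconvert runs the notebook with the kernel recorded in its metadata, so a notebook that uses the R kernel needs IRkernel to be registered first.

```python
import subprocess


def nbconvert_command(notebook_path):
    """Build a command that executes a notebook and saves the outputs in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",   # keep the result as a notebook
        "--execute",          # run every cell from top to bottom
        "--inplace",          # overwrite the input file with the executed copy
        notebook_path,
    ]


def run_notebook(notebook_path):
    """Run the notebook; raises CalledProcessError if the command fails."""
    subprocess.run(nbconvert_command(notebook_path), check=True)
```

For example, run_notebook("data_prep/prep.ipynb") called from the repository root would execute the data preparation step; the same helper applies to the other notebooks named in this README.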

Unsupervised analysis (hierarchical clustering)

Run the Jupyter notebook, analysis/clustering.ipynb

Condition stratification

Run the script:

cd analysis
python regression_classifier.py

Compile condition stratification results and generate figures

Run the Jupyter notebook, analysis/get_classification_results.ipynb

Enrichment analysis

Run the Jupyter notebook, analysis/enrichment_analysis.ipynb

Differential expression analysis

Run the script, analysis/dgea.R

Usage

Rscript dgea.R <counts_filepath> <coldata_filepath> <config_filepath> [<filter_filepath>] <output_dir>

Example - Performing differential gene expression analysis with a filter list

In this example, we will run the dgea.R script with the following parameters:

counts_filepath: The file all_phases_all_genes_counts.tsv contains count data.
coldata_filepath: The file all_phases_coldata.tsv contains sample conditions, e.g. healthy/endometriosis.
config_filepath: The YAML configuration file dgea_config.yaml is used.
filter_filepath (optional argument): We are using the filter list core_matrisome_genes.json.
output_dir: The results will be written to the dgea_output directory.

The command would be as follows:

Rscript analysis/dgea.R data/all/all_phases_all_genes_counts.tsv data/all/all_phases_coldata.tsv analysis/dgea_config.yaml analysis/core_matrisome_genes.json dgea_output

Command-line arguments (listed in positional order) for dgea.R

-h or -help: Print usage information and exit.
counts_filepath: Path to the file containing count data.
coldata_filepath: Path to the file containing column data.
config_filepath: Path to the YAML file containing configuration settings.
filter_filepath: (Optional) Path to the JSON file containing gene filter list.
output_dir: Directory where the output file will be written.
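Because filter_filepath is optional and sits between the configuration file and the output directory, it is easy to misplace an argument when scripting several runs. The following sketch assembles the call in the positional order listed above; the function dgea_command and its parameter names are hypothetical helpers, not part of the repository, and the default script path assumes the command is run from the repository root.

```python
import subprocess


def dgea_command(counts, coldata, config, output_dir, filter_list=None,
                 script="analysis/dgea.R"):
    """Assemble the Rscript call in the positional order given under Usage."""
    cmd = ["Rscript", script, counts, coldata, config]
    if filter_list is not None:
        cmd.append(filter_list)  # optional gene filter list (JSON)
    cmd.append(output_dir)       # the output directory always comes last
    return cmd


def run_dgea(*args, **kwargs):
    """Run dgea.R; raises CalledProcessError if Rscript exits with an error."""
    subprocess.run(dgea_command(*args, **kwargs), check=True)
```

Calling dgea_command with the files from the example above and filter_list="analysis/core_matrisome_genes.json" reproduces that command as an argument list; omitting filter_list yields the four-argument form without a filter.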
