Software to distinguish between common and experiment-specific gene expression signals
Alexandra J. Lee and Casey S. Greene 2022
University of Pennsylvania
This repository is named after the character Sophie from Hayao Miyazaki's animated film Howl's Moving Castle. In the film, Sophie is a young woman who has been cursed to look like an old woman, which shows that an initial observation can be misleading. Likewise, SOPHIE allows users to identify specific gene expression signatures that can be masked by common background patterns.
SOPHIE was originally described and applied in the generic-expression-patterns repository.
Operating system: Linux
First, set up your local repository:
- Download and install Git Large File Storage (Git LFS).
- Install Miniconda.
- Clone the sophie repository by running the following command in the terminal:
git clone https://github.com/greenelab/sophie.git
Note: Git automatically detects the LFS-tracked files and clones them via HTTP.
- Navigate into the cloned repository by running the following command in the terminal:
cd sophie
- Set up the conda environment by running the following command in the terminal:
bash install.sh
- Navigate to any of the directories below to apply SOPHIE:
Name | Description |
---|---|
pre_model_seen_template | Here we use an existing trained VAE model and simulate a background dataset using a template experiment that is included in the training dataset (i.e. the datasets used to train the VAE model). |
pre_model_unseen_template | Here we use an existing trained VAE model and simulate a background dataset using a template experiment that is not included in the training dataset (i.e. the datasets used to train the VAE model). |
new_model_seen_template | Here we train a new VAE model, then simulate a background dataset using a template experiment that is included in the training dataset (i.e. the datasets used to train the VAE model). |
new_model_unseen_template | Here we train a new VAE model, then simulate a background dataset using a template experiment that is not included in the training dataset (i.e. the datasets used to train the VAE model). |
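
The setup steps above depend on git, Git LFS, and conda being available on your PATH. The following is a minimal sketch of a check you could run before cloning. It is not part of the SOPHIE repository: the function name missing_tools is hypothetical, and the sketch assumes that Git LFS is installed as the git-lfs executable and that Miniconda provides the conda command.

```shell
#!/usr/bin/env bash
# Sketch only: check that the tools used by the setup steps are on PATH.
# `missing_tools` is an illustrative helper, not something SOPHIE provides.

missing_tools() {
    local missing=""
    local tool
    for tool in "$@"; do
        # `command -v` succeeds only if the tool can be found
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Print the missing tools separated by spaces (empty if none are missing)
    echo "${missing# }"
}

# Assumed executable names for the prerequisites listed in the setup steps
result="$(missing_tools git git-lfs conda)"
if [ -n "$result" ]; then
    echo "Missing prerequisites: $result"
else
    echo "All prerequisites found"
fi
```

If any tool is reported as missing, install it before running git clone and bash install.sh.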