Commit

Update protocol
Lael Barlow committed Jan 15, 2025
1 parent 06de6a5 commit d78baf6
Showing 4 changed files with 78 additions and 11 deletions.
2 changes: 1 addition & 1 deletion config/example_genomes.csv
@@ -9,5 +9,5 @@ Cryptococcus_neoformans.gff3, ,0,gzip,https://ftp.ncbi.nlm.nih.gov/genomes/all/G
Tremella_mesenterica.faa, ,0,gzip,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/271/645/GCF_000271645.1_Treme1/GCF_000271645.1_Treme1_protein.faa.gz,
Wallemia_ichthyophaga.faa, ,0,gzip,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/400/465/GCF_000400465.1_Wallemia_ichthyophaga_version_1.0/GCF_000400465.1_Wallemia_ichthyophaga_version_1.0_protein.faa.gz,
Rhodotorula_graminis.faa, ,0,gzip,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/001/329/695/GCF_001329695.1_Rhoba1_1/GCF_001329695.1_Rhoba1_1_protein.faa.gz,
-Saccharomyces_cerevisiae.faa, ,0,gzip,ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/146/045/GCF_000146045.2_R64/GCF_000146045.2_R64_protein.faa.gz,
+Saccharomyces_cerevisiae.faa, ,0,gzip,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/146/045/GCF_000146045.2_R64/GCF_000146045.2_R64_protein.faa.gz,
Rhizophagus_irregularis.faa, ,0,gzip,https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/439/145/GCF_000439145.1_ASM43914v3/GCF_000439145.1_ASM43914v3_protein.faa.gz,
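
The hunk above switches one download URL from `ftp://` to `https://`. A minimal sketch of how remaining non-HTTPS entries could be spotted follows. It assumes the URL is the fifth comma-separated field, as in the rows shown above; `non_https_rows` is a hypothetical helper, not part of AMOEBAE.

```bash
# Print "line-number: URL" for each row whose URL does not use https://.
# Assumption: the URL is the fifth comma-separated field of every data row.
non_https_rows() {
  local csv="$1"
  awk -F',' 'NF >= 5 && $5 ~ /^[a-z]+:\/\// && $5 !~ /^https:\/\// { print NR ": " $5 }' "$csv"
}

# Example usage (file path taken from the diff header above):
# non_https_rows config/example_genomes.csv
```

Rows whose fifth field is not a URL at all (for instance a header row) are skipped, so the output lists only entries that would still be fetched over another scheme.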
55 changes: 55 additions & 0 deletions documentation/drac.md
@@ -0,0 +1,55 @@

# How to install and run AMOEBAE on Digital Research Alliance of Canada (DRAC) clusters

**Note**: This is mostly relevant for AMOEBAE users in Canada. However, this approach can be easily adapted to any high performance computing cluster.

AMOEBAE depends on Conda packages, but installing Conda packages directly on DRAC clusters is not appropriate for those systems. One way around this is to install the dependencies in an Apptainer container by following the steps below.

1. Log on to a DRAC cluster login node, clone the amoebae repository, and navigate into the amoebae directory. Upload your sequence data as needed, using an appropriate method (see [DRAC documentation on data transfers](https://docs.alliancecan.ca/wiki/Transferring_data)). I also recommend doing the subsequent steps in a screen or tmux session so that your work is not interrupted if you get disconnected (see [DRAC documentation on prolonging terminal sessions](https://docs.alliancecan.ca/wiki/Prolonging_terminal_sessions)).

2. Build the container using the `pixi.def` definition file provided in the AMOEBAE repository (see [DRAC Apptainer documentation](https://docs.alliancecan.ca/wiki/Apptainer)):
```bash
module load apptainer
apptainer build --disable-cache pixi.sif pixi.def
```
- This will take a few minutes.

3. To run AMOEBAE workflow steps requiring internet access (`download_queries`
and `download_dbs`), enter a shell session within the
container on the login node (unless using the Cedar cluster specifically):
- Enter the shell session:
```bash
APPTAINER_BIND=''
apptainer shell -C -B $PWD:/root --pwd /root pixi.sif
```
- Now you are in an environment where all the AMOEBAE dependencies are
installed. You can run `snakemake` commands as described in the [workflow
protocol](./workflow_protocol.md), without the need to prefix with `pixi run `.
- Run the rules that need internet access (if and when the workflow protocol calls for them):
```bash
snakemake --cores 1 download_queries
snakemake --cores 1 download_dbs
```
- Exit the shell session:
```bash
exit
```

4. For all other workflow steps, start an interactive session so that your analysis runs on a compute
node. For example (see [DRAC documentation](https://docs.alliancecan.ca/wiki/Running_jobs#Interactive_jobs)):
```bash
salloc --time=1:0:0 --mem-per-cpu=16G --ntasks=1 --account=def-leppard
```

5. Then, enter a shell session within the container:
```bash
module load apptainer
APPTAINER_BIND=''
apptainer shell -C -B $PWD:/root --pwd /root pixi.sif
```

6. Now run the remaining workflow steps, as described in the [workflow protocol](./workflow_protocol.md):
```bash
snakemake get_ref_seqs
...
```
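
Steps 4 to 6 above are interactive. As a rough, non-authoritative sketch, the same steps could be wrapped in a batch script submitted with `sbatch`. It assumes that `apptainer exec` accepts the same `-C`, `-B`, and `--pwd` options used with `apptainer shell` in step 5, and that the `#SBATCH` values mirror the `salloc` example in step 4. The `run_rule` helper, the `RUN` variable, and the `--cores 1` setting are illustrative assumptions, not part of AMOEBAE.

```bash
#!/bin/bash
#SBATCH --time=1:0:0
#SBATCH --mem-per-cpu=16G
#SBATCH --ntasks=1
#SBATCH --account=def-leppard

# The account name above is copied from step 4; replace it with your own.

# Hypothetical helper: run one Snakemake rule inside the container.
# By default it only prints the command; set RUN=1 to execute it.
run_rule() {
  set -- apptainer exec -C -B "$PWD:/root" --pwd /root pixi.sif \
    snakemake --cores 1 "$1"
  if [ "${RUN:-0}" = "1" ]; then
    module load apptainer
    export APPTAINER_BIND=''
    "$@"
  else
    echo "$*"
  fi
}

run_rule get_ref_seqs
```

Printing the command by default makes it possible to check the bind path and rule name on the login node before submitting the script with `RUN=1` set in the job environment.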
17 changes: 7 additions & 10 deletions documentation/workflow_protocol.md
@@ -4,21 +4,14 @@
## Requirements

The following setup procedure should work on most Linux or MacOS
-computers. Please ensure that you have sufficient storage, as this protocol
+systems. Please ensure that you have sufficient storage, as this protocol
will generate files totalling ~30GB or more in size.

Most users seem to run the AMOEBAE workflow either on their personal computers
or on servers without job schedulers (PBS, SLURM, etc.), so the instructions
below will not be directly applicable to running AMOEBAE on a high-performance
computing (HPC) cluster. If that is something you need to do, please refer to
the documentation on the [Snakemake
website](https://snakemake.readthedocs.io/en/stable/) and consult with your
system administrator(s) as necessary. Otherwise, just follow the installation
instructions below.


## Installation

### Local servers and personal computers

These instructions are for setting up and running AMOEBAE via the
[Snakemake](https://snakemake.readthedocs.io/en/stable/) command-line
interface, which is well-documented and provides the flexibility to run AMOEBAE
@@ -75,6 +68,10 @@ installing dependencies on Apple Silicon MacOS systems.
the Conda environment created using Pixi, and no longer use the
`--use-conda` option. For example, `pixi run snakemake`.

### High Performance Computing Clusters

See the [instructions for using AMOEBAE on Digital Research Alliance of Canada clusters](./drac.md).

## Running the workflow

With example (default) input files, this workflow should take between 30 and 60
15 changes: 15 additions & 0 deletions pixi.def
@@ -0,0 +1,15 @@
Bootstrap: docker
From: ghcr.io/prefix-dev/pixi:latest

%post
cd /root
cp pixi.toml pixi.lock /
cd /
pixi install --locked --environment default
pixi global install -c conda-forge -c bioconda blast muscle=3.8 exonerate

%environment
export PIXI_HOME=/
export PIXI_CACHE_DIR=/.cache/rattler/cache
export RATTLER_AUTH_FILE=/.rattler/credentials.json
cd / && source <(pixi shell-hook --environment default)
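
After building an image from this definition file, one way to confirm that the installed programs are reachable is to check `PATH` from inside the container shell. The sketch below is a hedged example: `missing_tools` is a hypothetical helper, and the executable names (`snakemake`, `blastp`, `muscle`, `exonerate`) are assumptions based on the packages named in `%post`.

```bash
# Print the name of each requested executable that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example usage inside the container shell (tool names are assumptions):
# missing_tools snakemake blastp muscle exonerate
```

No output means that every listed executable was found.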
