Praline-Cashmere computes sequence alignments using Cashmere and MCL. This document describes how to install it on the DAS-5 and how to install it locally.
Since the Python installation on the DAS-5 is too old, we use the Conda
package manager to install a newer Python. Download the 64-bit bash installer
from https://conda.io/miniconda.html and run it on the DAS-5. Accept the
license and choose a convenient path for the installation. We ignore the
warning about PYTHONPATH; in this case it was set by the CUDA package. We let
the installer change the PATH in ~/.bashrc and then source this file for the
change to take effect.
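After sourcing ~/.bashrc it can be worth confirming that `python` now resolves to the Conda installation rather than the system interpreter. The sketch below is one way to check this. It relies on the fact that a Conda installation keeps its package metadata in a `conda-meta` directory under its prefix; the helper name `is_conda_prefix` is made up for this example and is not part of Praline-Cashmere.

```python
import os
import sys


def is_conda_prefix(prefix):
    """Return True if the given prefix looks like a Conda installation.

    Conda stores package metadata in <prefix>/conda-meta, so the presence
    of that directory is used as the indicator here.
    """
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


if __name__ == "__main__":
    # Report which interpreter is actually running this script.
    print("interpreter: %s" % sys.executable)
    print("version: %s" % sys.version.split()[0])
    print("conda installation: %s" % is_conda_prefix(sys.prefix))
```

If the last line reports False, the PATH change from ~/.bashrc has probably not taken effect in the current shell.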
We need the following Python packages:
conda install numpy
git clone praline.git
cd praline
git checkout cashmere
Then install Praline from the praline directory:
python setup.py install
The MA-PRALINE repository contains the motif-aware part of Praline.
git clone MA-PRALINE.git
cd MA-PRALINE
git checkout cashmere
Then we can run:
python setup.py install
Clone the repository https://github.com/ManyCore-NLeSC/whole-genome-tool. This
contains the Python client code that preprocesses the sequences to annotate
them with motifs.
Clone this repository (Praline-Cashmere) and run Gradle in it, which creates the installation directory:
./gradlew installDist
Export the following two variables:
export PRALINE_CASHMERE_DIR=/path/to/Praline-cashmere/build/install/praline-cashmere
export CASHMERE_PORT=<choose your own port>
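Both variables are needed by the later steps, so a quick check before starting anything can save some confusion. The following is a minimal sketch of such a check, not part of Praline-Cashmere itself. The function name `check_cashmere_env` is hypothetical, and the only assumptions are the ones stated in this document: the installation directory has a `bin` subdirectory, and the port is an ordinary TCP port number.

```python
import os


def check_cashmere_env(env=None):
    """Return a list of problems found; an empty list means all is well."""
    env = os.environ if env is None else env
    problems = []

    install_dir = env.get("PRALINE_CASHMERE_DIR")
    if not install_dir:
        problems.append("PRALINE_CASHMERE_DIR is not set")
    elif not os.path.isdir(os.path.join(install_dir, "bin")):
        # The start scripts are expected in $PRALINE_CASHMERE_DIR/bin.
        problems.append("no bin directory under " + install_dir)

    port = env.get("CASHMERE_PORT")
    if not port:
        problems.append("CASHMERE_PORT is not set")
    elif not port.isdigit() or not 1 <= int(port) <= 65535:
        problems.append("CASHMERE_PORT is not a valid port number: " + port)

    return problems


if __name__ == "__main__":
    for problem in check_cashmere_env():
        print(problem)
```

Running the script prints nothing when both variables look sane.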
Cashmere uses a server on the head node to coordinate the compute nodes. To
start the server, open a separate terminal, go to the directory
$PRALINE_CASHMERE_DIR/bin and run:
./cashmere-server
Then, in the original terminal (again in $PRALINE_CASHMERE_DIR/bin), run the following to use two TitanX nodes:
./praline-cashmere TitanX=2
To see the other node options, run praline-cashmere without arguments.
Starting Praline-Cashmere creates a file called bowbeforeme in
$PRALINE_CASHMERE_DIR, which lists the hostname of the server node and is read
by the Python script.
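To check by hand that the file was written, something like the sketch below can be used. It is only an illustration: the exact format of bowbeforeme is not described here, so the code assumes the hostname is on the first non-empty line, and the function name `read_server_host` is hypothetical rather than taken from whole-genome-tool.

```python
import os


def read_server_host(install_dir, filename="bowbeforeme"):
    """Return the first non-empty line of the server file, or None.

    Assumption: the hostname of the server node is on the first
    non-empty line of the file.
    """
    path = os.path.join(install_dir, filename)
    if not os.path.isfile(path):
        return None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line:
                return line
    return None


if __name__ == "__main__":
    install_dir = os.environ.get("PRALINE_CASHMERE_DIR", ".")
    print(read_server_host(install_dir))
```

A result of None means the file does not exist yet or is empty, which suggests that Praline-Cashmere has not been started.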
In the whole-genome-tool directory, run:
python wgt.py input/test.json
(or another file describing the job instead of test.json)
To disable the cluster, set use_our_stuff to False in manager.py.
The standard Python on the DAS-5 is version 2, whereas on a more up-to-date
machine it is likely to be Python 3. Ultimately, we are going to use the
whole-genome-tool library, which is written in Python 2, to drive the
computation, so we will install all the Python packages with Python 2.
We need the following Python (2) packages:
numpy
requests
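Whether these packages are visible to the interpreter that will run the client can be checked with a short script. This is a sketch for convenience only; `missing_packages` is a hypothetical helper, and the script should be run with the same interpreter that will later run wgt.py (python2 in this local setup).

```python
import importlib


def missing_packages(names):
    """Return the names from the given list that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # The two packages listed above.
    print(missing_packages(["numpy", "requests"]))
```

An empty list means both packages can be imported.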
git clone praline.git
cd praline
git checkout cashmere
Then install Praline from the praline directory:
python2 setup.py install --user
The MA-PRALINE repository contains the motif-aware part of Praline.
git clone MA-PRALINE.git
cd MA-PRALINE
git checkout cashmere
Then we can run:
python2 setup.py install --user
Clone the repository https://github.com/ManyCore-NLeSC/whole-genome-tool. This
contains the Python client code that preprocesses the sequences to annotate
them with motifs.
Clone this repository (Praline-Cashmere) and run Gradle in it, which creates the installation directory:
./gradlew installDist
Export the following two variables:
export PRALINE_CASHMERE_DIR=/path/to/Praline-cashmere/build/install/praline-cashmere
export CASHMERE_PORT=<choose your own port>
Go to the directory $PRALINE_CASHMERE_DIR/bin and run:
./praline-cashmere.local
Starting Praline-Cashmere creates a file called bowbeforeme in
$PRALINE_CASHMERE_DIR, which lists the hostname of the server node and is read
by the Python script.
In the whole-genome-tool directory, run:
python2 wgt.py input/test.json
(or another file describing the job instead of test.json)