DNMTools is a set of tools for analyzing DNA methylation data from high-throughput sequencing experiments, especially whole genome bisulfite sequencing (WGBS), but also reduced representation bisulfite sequencing (RRBS). These tools focus on overcoming the computing challenges imposed by the scale of genome-wide DNA methylation data, which usually arise in the early stages of data analysis.
The documentation for DNMTools can be found here. But if you want to install from source and you are reading this on GitHub or in a source tree you unpacked, then keep reading. And if you are in a terminal, sorry for all the formatting.
- A recent compiler. Most users will be building and installing this software with GCC. We require a compiler that supports C++17, so we recommend using at least GCC 8 (released in 2018). There are still many systems that install a very old version of GCC by default, so if you have problems with building this software, that might be the first thing to check.
- The GNU Scientific Library. It can be installed using apt on Linux (Ubuntu, Debian), using brew on macOS, or from source available here.
- The HTSlib library. This can be installed through brew on macOS, through apt on Linux (Ubuntu, Debian), or from source downloadable here.
All of the above can also be installed using conda. If you use conda for any of these dependencies, it is easiest to get all of them through conda, even if you are building dnmtools from the source repo.
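Since an old default compiler is the first thing to check, here is a minimal sketch of that check. The helper names version_ge and check_gxx are hypothetical (they are not part of DNMTools), and the comparison assumes your sort supports the -V option. If g++ on your system is really clang (as on macOS), the number reported is not a GCC version, so treat the result with care.

```shell
# Hypothetical helper: succeed if version $1 is at least version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: report whether the default g++ meets the GCC 8
# recommendation given above.
check_gxx() {
  if ! command -v g++ >/dev/null 2>&1; then
    echo "g++ not found on PATH"
    return 1
  fi
  gxx_version="$(g++ -dumpversion)"
  if version_ge "$gxx_version" 8; then
    echo "g++ $gxx_version looks recent enough"
  else
    echo "g++ $gxx_version is older than GCC 8; C++17 support may be incomplete"
    return 1
  fi
}

# Print the result without aborting the shell if the check fails.
check_gxx || true
```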
- Download dnmtools-1.4.2.tar.gz.
- Unpack the archive:
tar -zxvf dnmtools-1.4.2.tar.gz
- Move into the dnmtools directory and create a build directory:
cd dnmtools-1.4.2 && mkdir build && cd build
- Run the configuration script:
../configure
If you do not want to install DNMTools system-wide, or if you do not have admin privileges, specify a prefix directory:
../configure --prefix=/some/reasonable/place
If you installed HTSlib yourself in some non-standard directory, you must specify the location like this:
../configure CPPFLAGS='-I /path/to/htslib/headers' \
LDFLAGS='-L/path/to/htslib/lib'
Depending on how you obtained HTSlib, the headers may not be in a directory at the same depth as the library file.
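Because the header and library directories may not sit side by side, it can help to confirm both before running configure. The following is a rough sketch, not part of DNMTools: check_htslib_paths is a hypothetical helper, and it assumes the usual HTSlib layout where headers live in an htslib subdirectory (as in htslib/sam.h) and the library file is named libhts.so, libhts.a or libhts.dylib.

```shell
# Hypothetical helper: check that two directories look like an HTSlib install.
# $1: directory you would pass with -I (should contain htslib/sam.h)
# $2: directory you would pass with -L (should contain a libhts library file)
check_htslib_paths() {
  htslib_status=0
  if [ ! -f "$1/htslib/sam.h" ]; then
    echo "no htslib/sam.h under $1"
    htslib_status=1
  fi
  if [ ! -e "$2/libhts.so" ] && [ ! -e "$2/libhts.a" ] && [ ! -e "$2/libhts.dylib" ]; then
    echo "no libhts library under $2"
    htslib_status=1
  fi
  return "$htslib_status"
}

# Example (paths are placeholders, as in the configure command above):
#   check_htslib_paths /path/to/htslib/headers /path/to/htslib/lib
```

If the function prints nothing and returns success, the same two directories should be reasonable values for CPPFLAGS and LDFLAGS.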
If you are still in the build directory, run make to compile the tools, and then make install to install them:
make && make install
If your HTSlib (or some other library) is not installed system-wide, then you might need to update your library path:
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/path/to/htslib/lib
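One detail about the export above: if LD_LIBRARY_PATH is unset or empty, the new value starts with a colon, and the dynamic loader treats an empty entry as the current directory. A minimal sketch that avoids the empty entry, using a hypothetical helper named append_lib_path (not part of DNMTools):

```shell
# Hypothetical helper: append directory $1 to LD_LIBRARY_PATH.
# The ${VAR:+...} expansion adds the old value and a colon only when the
# variable is already set and non-empty, so no empty entry is created.
append_lib_path() {
  export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}$1"
}

# Example (placeholder path, as above):
#   append_lib_path /path/to/htslib/lib
```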
To test if everything was successful, simply run dnmtools without any arguments and you should see the list of available commands:
dnmtools
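If the shell reports that dnmtools cannot be found, the usual cause after a --prefix install is that the bin directory under that prefix is not on your PATH. A small sketch of a check, where find_tool is a hypothetical helper and not something DNMTools provides:

```shell
# Hypothetical helper: report where a command lives, or hint at the likely
# cause when it is missing from PATH.
find_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found at $(command -v "$1")"
  else
    echo "$1 not on PATH; if you used --prefix, try adding <prefix>/bin to PATH"
    return 1
  fi
}

# Print the result without aborting the shell if dnmtools is missing.
find_tool dnmtools || true
```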
Not recommended, but if you want to do it this way, we assume you know what you are doing. We strongly recommend using DNMTools through the latest stable release under the releases section on GitHub. Developers who wish to work on the latest commits, which are unstable, can compile the source using a Makefile left in the root of the source tree. If HTSlib and other libraries are available system-wide, compile by running:
make
This functionality will probably be removed soon, and if you want to build the code this way, you should know what you are doing and be able to make it work yourself.
Read the documentation for usage of individual tools within DNMTools.
The docker images of dnmtools are accessible through GitHub Container registry. These are light-weight (~30 MB) images that let you run dnmtools without worrying about the dependencies.
To pull the image for the latest version, run:
docker pull ghcr.io/smithlabcode/dnmtools
To test the image installation, run:
docker run ghcr.io/smithlabcode/dnmtools
You should see the help page of dnmtools.
For simpler reference, you can re-tag the installed image as follows, but note that you would have to re-tag the image whenever you pull an image for a new version.
docker tag ghcr.io/smithlabcode/dnmtools:latest dnmtools:latest
You can also install the image for a particular version by running:
docker pull ghcr.io/smithlabcode/dnmtools:v[VERSION NUMBER] #(e.g. v1.4.2)
Not all versions have corresponding images; you can find available images here.
To run a command in the container (assuming you tagged the image as above), use:
docker run -v /path/to/data:/data -w /data \
dnmtools [DNMTOOLS COMMAND] [OPTIONS] [ARGUMENTS]
In the above command, replace /path/to/data with the path to the directory you want to mount, and it will be mounted as the /data directory in the container.
For example, if your genome data genome.fa is located in ./genome_data, you can execute abismalidx by running:
docker run -v ./genome_data:/data -w /data \
dnmtools abismalidx -v -t 4 genome.fa genome.idx
In the above command, -w /data specifies the working directory in the container, so the output genome.idx is saved in the /data directory, which corresponds to the ./genome_data directory on the host machine. If you want to specify a different output directory, mount it as well and use a command like the one below:
docker run -v ./genome_data:/data -w /data \
-v ./genome_index:/output \
dnmtools abismalidx -v -t 4 genome.fa /output/genome.idx
When you need to access multiple directories, it might be useful to use the option -v ./:/app -w /app, which mounts the current directory to the /app directory in the container and also sets it as the working directory. You can then specify paths in the same way you would from the working directory on the host machine. For example:
docker run -v ./:/app -w /app \
dnmtools abismal -i genome_index/genome.idx -v -t 4 \
-o mapped_reads/output.sam \
reads/reads_1.fq reads/reads_2.fq
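If you run many commands this way, the mount options can be wrapped in a shell function. This is only a sketch under a few assumptions: the image is tagged dnmtools as described above, the function name dnmtools_docker and the DRY_RUN variable are hypothetical, and it uses "$PWD" instead of ./ because some Docker versions expect an absolute host path for -v. The --rm flag, which removes the container after it exits, is an addition that the commands above do not use.

```shell
# Hypothetical wrapper: run a dnmtools command in the container with the
# current directory mounted at /app and used as the working directory.
# Set DRY_RUN=1 to print the docker command instead of running it.
dnmtools_docker() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo docker run --rm -v "$PWD:/app" -w /app dnmtools "$@"
  else
    docker run --rm -v "$PWD:/app" -w /app dnmtools "$@"
  fi
}

# Example:
#   dnmtools_docker abismalidx -v -t 4 genome_data/genome.fa genome_index/genome.idx
```

With this in place, paths on the command line are written relative to the directory you run the function from, just as in the -v ./:/app example.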
Run the following commands to test the installation and usage of the docker image of dnmtools.
docker pull ghcr.io/smithlabcode/dnmtools:latest
docker tag ghcr.io/smithlabcode/dnmtools:latest dnmtools:latest
# Clone the repo to access test data
git clone git@github.com:smithlabcode/dnmtools.git
cd dnmtools
# Run containers and save outputs in artifacts directory
mkdir artifacts
docker run -v ./:/app -w /app \
dnmtools abismalidx -v -t 1 data/tRex1.fa artifacts/tRex1.idx
docker run -v ./:/app -w /app \
dnmtools simreads -seed 1 -o artifacts/simreads -n 10000 \
-m 0.01 -b 0.98 data/tRex1.fa
docker run -v ./:/app -w /app \
dnmtools abismal -v -t 1 -i artifacts/tRex1.idx artifacts/simreads_{1,2}.fq
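After the commands above finish, a quick way to confirm that they produced something is to check that the expected files exist and are non-empty. The sketch below uses a hypothetical helper, check_nonempty. The file names are taken from the commands above, on the assumption that simreads writes artifacts/simreads_1.fq and artifacts/simreads_2.fq.

```shell
# Hypothetical helper: succeed only if every listed file exists and is
# non-empty; print the name of each file that fails the check.
check_nonempty() {
  nonempty_status=0
  for nonempty_file in "$@"; do
    if [ ! -s "$nonempty_file" ]; then
      echo "missing or empty: $nonempty_file"
      nonempty_status=1
    fi
  done
  return "$nonempty_status"
}

# Example, run from the root of the cloned repo:
#   check_nonempty artifacts/tRex1.idx artifacts/simreads_1.fq artifacts/simreads_2.fq
```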
Andrew D. Smith [email protected]
Guilherme de Sena Brandine [email protected]
Copyright (C) 2022-2023 Andrew D. Smith and Guilherme de Sena Brandine
Authors of DNMTools: Andrew D. Smith and Guilherme de Sena Brandine
Essential contributors: Ben Decato, Meng Zhou, Liz Ji, Terence Li, Jenny Qu, Qiang Song and Fang Fang
This is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.