Testing Your Installation
informME is distributed with a small but comprehensive "toy model" intended for testing and debugging your local installation and for familiarizing yourself with the tool. The reference genome consists of five chromosomes of length 10 kb each. The WGBS reads were simulated so that the resulting mean methylation level is known in advance and the cancer sample exhibits genome-wide hypomethylation. Processing the entire toy example takes roughly 15 minutes. Once informME has been installed through install.sh, follow the steps below to test the installation with the toy example:
- If on a server that uses modules to load dependencies, load MATLAB and SAMtools:
module load matlab
module load samtools
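Before moving on, it can help to confirm that both tools are actually reachable from the current shell. The following is a minimal sketch, not part of informME: the function name check_deps is hypothetical, and it assumes the executables are named matlab and samtools on your system.

```shell
# check_deps CMD...: report whether each named command is available on PATH.
# Hypothetical helper for illustration; it is not shipped with informME.
check_deps() {
    rc=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found: $cmd"
        else
            echo "missing: $cmd" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example (assumes the executables are called matlab and samtools):
#   check_deps matlab samtools
```

The function returns a non-zero status if any command is missing, so it can be used as a guard before running the steps below.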
- Reference Genome. Run the following:
cd informME/src/bash_src/parseBamFile/fastaToCpg/main
./main.sh
and then run ls -lthr to check that a file CpGlocationChrX.mat with a size of approximately 3.2K has been created for each of the five chromosomes:
total 76K
-rw-rw-r-- 1 usr usr 49K Apr 20 17:23 toy_genome.fa
-rwxrwxr-x 1 usr usr 1.1K Jun 5 15:43 main.sh
-rw-rw-r-- 1 usr usr 3.2K Jun 5 15:50 CpGlocationChr1.mat
-rw-rw-r-- 1 usr usr 3.2K Jun 5 15:50 CpGlocationChr2.mat
-rw-rw-r-- 1 usr usr 3.1K Jun 5 15:50 CpGlocationChr3.mat
-rw-rw-r-- 1 usr usr 3.1K Jun 5 15:50 CpGlocationChr4.mat
-rw-rw-r-- 1 usr usr 3.1K Jun 5 15:50 CpGlocationChr5.mat
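Instead of reading the listing by eye, the same check can be scripted. This is a rough sketch under the assumption that the files are named CpGlocationChr1.mat through CpGlocationChr5.mat, as in the listing above; the function name check_cpg_files is hypothetical. It only tests that each file exists and is non-empty, not that its size matches exactly.

```shell
# check_cpg_files [DIR]: verify that CpGlocationChr1..5.mat exist and are
# non-empty in DIR (defaults to the current directory).
# Hypothetical helper for illustration; it is not shipped with informME.
check_cpg_files() {
    dir=${1:-.}
    rc=0
    for i in 1 2 3 4 5; do
        f="$dir/CpGlocationChr$i.mat"
        if [ -s "$f" ]; then
            echo "ok: $f"
        else
            echo "missing or empty: $f" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example, run from the fastaToCpg/main directory:
#   check_cpg_files .
```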
- Generate input matrices by running the following:
cd informME/src/bash_src/parseBamFile/getMatrices/main
./main.sh
and then run ls -lthrR out/ to check that files toy_normal_pe_matrices.mat and toy_cancer_pe_matrices.mat with a size of approximately 70K have been created for each of the five chromosomes:
out/:
total 20K
drwxrwxr-x 2 usr usr 4.0K Jun 5 15:53 chr5
drwxrwxr-x 2 usr usr 4.0K Jun 5 15:53 chr4
drwxrwxr-x 2 usr usr 4.0K Jun 5 15:53 chr3
drwxrwxr-x 2 usr usr 4.0K Jun 5 15:52 chr2
drwxrwxr-x 2 usr usr 4.0K Jun 5 15:52 chr1
out/chr5:
total 144K
-rw-rw-r-- 1 usr usr 69K Jun 5 15:53 toy_cancer_pe_matrices.mat
-rw-rw-r-- 1 usr usr 69K Jun 5 15:52 toy_normal_pe_matrices.mat
out/chr4:
total 136K
-rw-rw-r-- 1 usr usr 67K Jun 5 15:53 toy_cancer_pe_matrices.mat
-rw-rw-r-- 1 usr usr 67K Jun 5 15:52 toy_normal_pe_matrices.mat
out/chr3:
total 144K
-rw-rw-r-- 1 usr usr 70K Jun 5 15:53 toy_cancer_pe_matrices.mat
-rw-rw-r-- 1 usr usr 70K Jun 5 15:52 toy_normal_pe_matrices.mat
out/chr2:
total 144K
-rw-rw-r-- 1 usr usr 70K Jun 5 15:52 toy_cancer_pe_matrices.mat
-rw-rw-r-- 1 usr usr 69K Jun 5 15:52 toy_normal_pe_matrices.mat
out/chr1:
total 144K
-rw-rw-r-- 1 usr usr 72K Jun 5 15:52 toy_cancer_pe_matrices.mat
-rw-rw-r-- 1 usr usr 72K Jun 5 15:51 toy_normal_pe_matrices.mat
- Run informME using the following:
cd informME/src/bash_src/informME_run/main
./main.sh
and then run ls -lthrR out/ to check that analysis files for the normal, cancer, and pooled models, each with a size of approximately 68K, have been created for each of the five chromosomes:
out/:
total 20K
drwxrwxr-x 2 usr usr 4.0K Jun 5 16:01 chr5
drwxrwxr-x 2 usr usr 4.0K Jun 5 16:01 chr4
drwxrwxr-x 2 usr usr 4.0K Jun 5 16:00 chr3
drwxrwxr-x 2 usr usr 4.0K Jun 5 16:00 chr2
drwxrwxr-x 2 usr usr 4.0K Jun 5 15:59 chr1
out/chr5:
total 204K
-rw-rw-r-- 1 usr usr 68K Jun 5 16:01 toy_pooled_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:59 toy_cancer_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:56 toy_normal_analysis.mat
out/chr4:
total 204K
-rw-rw-r-- 1 usr usr 68K Jun 5 16:01 toy_pooled_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:58 toy_cancer_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:56 toy_normal_analysis.mat
out/chr3:
total 204K
-rw-rw-r-- 1 usr usr 68K Jun 5 16:00 toy_pooled_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:57 toy_cancer_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:55 toy_normal_analysis.mat
out/chr2:
total 204K
-rw-rw-r-- 1 usr usr 68K Jun 5 16:00 toy_pooled_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:57 toy_cancer_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:55 toy_normal_analysis.mat
out/chr1:
total 216K
-rw-rw-r-- 1 usr usr 69K Jun 5 15:59 toy_pooled_analysis.mat
-rw-rw-r-- 1 usr usr 69K Jun 5 15:56 toy_cancer_analysis.mat
-rw-rw-r-- 1 usr usr 69K Jun 5 15:54 toy_normal_analysis.mat
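The per-chromosome layout used in this step and the previous one (out/chrN/toy_<phenotype>_<suffix>.mat) lends itself to a scripted check. The sketch below assumes exactly that naming pattern, as seen in the listings above; the function name check_chr_outputs and its argument order are hypothetical. It checks only for existence and non-zero size.

```shell
# check_chr_outputs OUTDIR SUFFIX PHENOTYPE...: for chr1..chr5 under OUTDIR,
# verify that a non-empty toy_<PHENOTYPE>_<SUFFIX>.mat exists for every
# phenotype given. Hypothetical helper; it is not shipped with informME.
check_chr_outputs() {
    outdir=$1
    suffix=$2
    shift 2
    rc=0
    for i in 1 2 3 4 5; do
        for pheno in "$@"; do
            f="$outdir/chr$i/toy_${pheno}_${suffix}.mat"
            if [ -s "$f" ]; then
                echo "ok: $f"
            else
                echo "missing or empty: $f" >&2
                rc=1
            fi
        done
    done
    return "$rc"
}

# Examples, each run from the corresponding main directory:
#   check_chr_outputs out pe_matrices normal cancer
#   check_chr_outputs out analysis normal cancer pooled
```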
- Obtain bedGraph output for single analysis and check that the mean methylation level is approximately 0.8 for normal and 0.5 for cancer by looking at files MML-toy_normal.bed and MML-toy_cancer.bed, respectively:
cd informME/src/bash_src/analysis/singleAnalysis/singleMethAnalysisToBed/main
./main.sh
awk 'NR>1{total+=$4; n++} END{if(n>0) print total/n}' out/MML-toy_normal.bed
awk 'NR>1{total+=$4; n++} END{if(n>0) print total/n}' out/MML-toy_cancer.bed
You should also run ls -lthr out/ to see the following files with similar file sizes:
total 236K
-rw-rw-r-- 1 usr usr 113 Jun 5 16:02 VAR-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 TURN-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 RDE-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 NME-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 MSI-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 MML-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 METH-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 ESI-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 ENTR-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 CAP-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.7K Jun 5 16:02 VAR-toy_cancer.bed
-rw-rw-r-- 1 usr usr 8.1K Jun 5 16:02 TURN-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 RDE-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 NME-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 MSI-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 MML-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 METH-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 ESI-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 ENTR-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 CAP-toy_cancer.bed
-rw-rw-r-- 1 usr usr 161 Jun 5 16:02 VAR-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 TURN-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 RDE-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 NME-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 MSI-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 MML-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 METH-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 ESI-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 ENTR-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 CAP-toy_pooled.bed
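The mean computation in this step can be wrapped in a small function so that it is easy to apply to several tracks. This is a sketch only: bed_mean is a hypothetical name, and it assumes, as the commands above do, that the first line of each file is a header and that the value of interest is in column 4.

```shell
# bed_mean FILE: print the mean of column 4 of FILE, skipping the first
# (header) line. Prints NaN if the file has no data rows.
# Hypothetical helper for illustration; it is not shipped with informME.
bed_mean() {
    awk 'NR > 1 { total += $4; n++ }
         END    { if (n > 0) printf "%.4f\n", total / n; else print "NaN" }' "$1"
}

# Example, run from the singleMethAnalysisToBed/main directory:
#   for s in normal cancer pooled; do
#       printf '%s\t' "$s"; bed_mean "out/MML-toy_$s.bed"
#   done
```

Dividing by the number of data rows, rather than by the total line count, keeps the header line from biasing the mean on small files.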
- Obtain bedGraph output for differential analysis and check that the mean JSD is approximately 0.68 by looking at file JSD-toy_normal-VS-toy_cancer.bed:
cd informME/src/bash_src/analysis/diffAnalysis/diffMethAnalysisToBed/main
./main.sh
awk 'NR>1{total+=$4; n++} END{if(n>0) print total/n}' out/JSD-toy_normal-VS-toy_cancer.bed
You should also run ls -lthr out/ to see the following files with similar file sizes:
total 80K
-rw-rw-r-- 1 usr usr 8.0K Jun 5 16:05 JSD-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:05 dRDE-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 8.3K Jun 5 16:05 dNME-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 8.0K Jun 5 16:05 DMU-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:05 dMSI-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 8.0K Jun 5 16:05 dMML-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 8.3K Jun 5 16:05 DEU-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:05 dESI-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 8.0K Jun 5 16:05 dCAP-toy_normal-VS-toy_cancer.bed
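To turn the "approximately 0.68" check into a pass/fail test, the mean can be compared against the expected value with a tolerance. The sketch below is illustrative: check_mean is a hypothetical name, the tolerance of 0.05 in the example is an arbitrary assumption rather than a value from the informME documentation, and the file is assumed to have a one-line header with the value in column 4.

```shell
# check_mean FILE EXPECTED TOL: succeed (exit status 0) if the mean of
# column 4 of FILE, with the first line skipped, lies within TOL of EXPECTED.
# Returns 1 if it does not, and 2 if the file has no data rows.
# Hypothetical helper for illustration; it is not shipped with informME.
check_mean() {
    awk -v expected="$2" -v tol="$3" '
        NR > 1 { total += $4; n++ }
        END {
            if (n == 0) { print "no data rows"; exit 2 }
            mean = total / n
            diff = mean - expected
            if (diff < 0) diff = -diff
            printf "mean=%.4f expected=%s tol=%s\n", mean, expected, tol
            exit (diff <= tol ? 0 : 1)
        }' "$1"
}

# Example (the tolerance 0.05 is an assumed value):
#   check_mean out/JSD-toy_normal-VS-toy_cancer.bed 0.68 0.05 && echo PASS
```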
This concludes the test of the toy model included as part of the repository.
If you use informME, please cite:
[1] Jenkinson, G., Pujadas, E., Goutsias, J., and Feinberg, A.P. (2017), Potential energy landscapes identify the information-theoretic nature of the epigenome, Nature Genetics, 49: 719-729.
[2] Jenkinson, G., Abante, J., Feinberg, A.P., and Goutsias, J. (2018), An information-theoretic approach to the modeling and analysis of whole-genome bisulfite sequencing data, BMC Bioinformatics, 19:87, https://doi.org/10.1186/s12859-018-2086-5.
[3] Jenkinson, G., Abante, J., Koldobskiy, M., Feinberg, A.P., and Goutsias, J. (2019), Ranking genomic features using an information-theoretic measure of epigenetic discordance, BMC Bioinformatics, 20:175, https://doi.org/10.1186/s12859-019-2777-6.