-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
59 lines (44 loc) · 2.92 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
library(ZIMHMM)
library(SummarizedExperiment)
```
# ZIMHMM
ZIMHMM (Zero Inflated Mixed effects Hidden Markov Model) is a package with a peak caller to detect broad enrichment regions from multiple ChIP-seq experimental replicates. `ZIMHMM()` is a multi-sample peak caller that models the zero-inflation of background counts, accounts for replicate-specific differences via a mixed effects model, and ensures that broad regions of enrichment are detected by fitting a hidden Markov model. This package also contains `ZIHMM()`, a fixed effects version of the peak caller. `ZIHMM()` can be used in single- and multi-sample settings. The function `controlPeaks()` provides parameters that control the Expectation-Maximization (EM) algorithm from the presented peak callers. Please refer to their documentation (e.g. `?ZIMHMM::ZIMHMM`) for details. The paper for ZIMHMM was published in the Biometrics journal. [Click here](https://onlinelibrary.wiley.com/doi/abs/10.1111/biom.13083) to read it.
## Installation
You can install the released version of ZIMHMM from this repository with:
``` r
devtools::install_github("plbaldoni/ZIMHMM")
```
## Example
The package contains an example dataset with H3K36me3 ChIP-seq (and input control) read counts from chromosome 19 of three technical replicates of Huvec cell line (ENCODE Consortium). Reads were tabulated using 500bp non-overlapping windows. An example of ZIMHMM is shown below.
```{r example,eval=T,echo=T,message=F,warning=F,results='hide',fig.keep='all'}
# Loading example dataset
data(Huvec)
# ChIP and Control read counts, as well as the model offset.
ChIP = assay(Huvec,'ChIP')
Control = log(assay(Huvec,'Control')+1)
offset = matrix(0,nrow = nrow(ChIP),ncol = ncol(ChIP))
# Calling peaks
peakcall = ZIMHMM(ChIP = ChIP,Control = Control,offset = offset,random = 'intercept',control = controlPeaks())
```
## Post-processing
The current implementation of `ZIMHMM` contains two functions for data visualization. The first function is `plotPeaks`, which plots the ChIP-seq read counts, the called peaks from `ZIMHMM`, and the associated window-based posterior probabilities of enrichment. The input data should be formatted as a `RangedSummarizedExperiment` object (such as the example dataset Huvec).
```{r plots,eval=T,echo=T,message=F,warning=F,results='hide',fig.keep='all'}
# Plotting peak calls
plotPeaks(output = peakcall,chr = 'chr19',ranges = c(16898093,17040483),data = Huvec)
```
The second function is `makeBed`, which outputs a BED file that can be used for visualization elsewhere (such as the UCSC Genome Browser - https://genome.ucsc.edu/).
```{r bed,eval=F,echo=T}
# Exporting peak calls
makeBed(peakcall, data = Huvec,file = './HuvecH3K36me3.bed')
```