---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# immunoeasy
<!-- badges: start -->
<!-- badges: end -->
The goal of immunoeasy is to make immunologists' lives easier with clear, simple functions.
## Installation
You can install immunoeasy from [GitHub](https://github.com/itamuria/immunoeasy) with:
``` r
# library(devtools)
# devtools::install_github("itamuria/immunoeasy")
library(remotes)
remotes::install_github("itamuria/immunoeasy")
```
## Obtain information
To retrieve information about an Ensembl gene ID, use the `ens2symbol` function. For example, to look up ENSG00000000003 and ENSG00000184389:
```{r example}
library(immunoeasy)
library(knitr)
library(dplyr)
library(kableExtra)
## basic example code
example_ensg <- ens2symbol(ens_ids = c("ENSG00000000003","ENSG00000184389"))
example_ensg %>%
kable() %>%
kable_styling()
```
### From VCF to different analyses
Variant callers create VCF files. These files have a defined structure and are usually the starting point for downstream analysis. From a VCF file we can go:
- From vcf to excel
- From vcf to potential neoantigen selection
```{r vcf2x}
```
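Since the chunk above is still empty, here is a minimal sketch of what reading a VCF into a table can look like in base R. It is an illustration under stated assumptions, not the immunoeasy implementation: the VCF lines are made-up example data and the object names (`vcf_lines`, `vcf_table`) are hypothetical.

```r
# Made-up VCF content: "##" lines are metadata, the "#CHROM" line names the columns
vcf_lines <- c(
  "##fileformat=VCFv4.2",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
  "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=30",
  "chr2\t67890\t.\tC\tT\t99\tPASS\tDP=45"
)

# Drop the metadata lines and strip the leading "#" from the header line
body <- vcf_lines[!grepl("^##", vcf_lines)]
body[1] <- sub("^#", "", body[1])

# Read everything as character so that alleles such as "T" are not parsed as TRUE
vcf_table <- read.table(text = body, header = TRUE, sep = "\t", quote = "",
                        comment.char = "", colClasses = "character")
vcf_table$POS <- as.integer(vcf_table$POS)

# The table could then be exported to Excel, for example with:
# openxlsx::write.xlsx(vcf_table, "variants.xlsx")
print(vcf_table)
```

For a real file, the same steps would start from `readLines()` on the VCF path instead of a hand-written character vector.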
### From counts to FPKM from htseq files
If we have the htseq count file, we can obtain the FPKM values and quartiles. `mfl_num` is the average size of the reads; in this case it is 569.
```{r count2fpkm_htseq}
repmis::source_data("https://github.com/itamuria/immunoeasy/blob/master/data/immunoeasy_counts.RData?raw=true")
names(count_list)
# Write each element of count_list to "<name>.txt" so the counts2fpkm_* functions can read it
for (i in seq_along(count_list)) {
  out_file <- paste0(names(count_list)[i], ".txt")
  if (i == 4) {
    # Fourth element: tab-separated, no row names
    write.table(count_list[[i]], file = out_file, row.names = FALSE, sep = "\t")
  } else if (i == 3) {
    # Third element: keep the row names
    write.table(count_list[[i]], file = out_file, row.names = TRUE)
  } else {
    write.table(count_list[[i]], file = out_file, row.names = FALSE)
  }
}
biomaRt::biomartCacheClear()
mfl_num_z <- 569
#htseq_fpkm <- counts2fpkm_htseq (filename = "htseq_counts.txt", mfl_num = c(mfl_num_z))
```
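To make the role of `mfl_num` more concrete, here is a minimal sketch of one common FPKM definition, in which the mean fragment length shortens each gene to an effective length. This is an assumption about the general technique, not a description of what `counts2fpkm_htseq` computes internally; `fpkm_sketch`, the gene names, the counts and the gene lengths are all hypothetical.

```r
# Hypothetical helper, not part of immunoeasy.
# FPKM = counts * 1e9 / (effective length * total counts),
# with effective length = gene length - mean fragment length + 1.
fpkm_sketch <- function(counts, gene_length, mfl) {
  eff_len <- gene_length - mfl + 1
  stopifnot(all(eff_len > 0))  # genes shorter than the fragment length need separate handling
  counts * 1e9 / (eff_len * sum(counts))
}

# Made-up counts and gene lengths for three genes
counts      <- c(geneA = 100,  geneB = 300,  geneC = 600)
gene_length <- c(geneA = 1568, geneB = 2568, geneC = 4568)

fpkm <- fpkm_sketch(counts, gene_length, mfl = 569)
print(fpkm)

# Quartiles of the resulting FPKM values
print(quantile(fpkm, probs = c(0.25, 0.5, 0.75)))
```

In this toy example geneB and geneC end up with the same FPKM even though geneC has twice the counts, because its effective length is also twice as long.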
### From counts to FPKM from quant3 files
If we have the quant count file, we can obtain the FPKM values and quartiles in the same way. `mfl_num` is again the average size of the reads (569, stored in `mfl_num_z` above).
```{r count2fpkm_quant3}
quant_fpkm <- counts2fpkm_quant (filename = "quant_counts.txt", mfl_num = c(mfl_num_z))
```
### From counts to FPKM from subread files
If we have the subread count file, we can obtain the FPKM values and quartiles in the same way. `mfl_num` is again the average size of the reads (569, stored in `mfl_num_z` above).
```{r count2fpkm_subread}
subread_fpkm <- counts2fpkm_subread (filename = "subread_counts.txt", mfl_num = c(mfl_num_z))
```
### From counts to FPKM from cuff files
If we have the cuff file, we can obtain the FPKM values and quartiles. Note that this call takes `previous_clean = TRUE` instead of `mfl_num`.
```{r count2fpkm_cuff}
cuff_fpkm <- counts2fpkm_cuff (filename = "cufflink_fpkm.txt", previous_clean = TRUE)
# htseq_fpkm only exists if the commented-out counts2fpkm_htseq() call above has been run
save(htseq_fpkm, quant_fpkm, subread_fpkm,cuff_fpkm, file="fpkm.RData")
```
### Load the selected genes
```{r}
# selected_genes <- openxlsx::read.xlsx("data/Pac19_four_together4.xlsx")
# save(selected_genes, file = "Selected_genes.RData")
repmis::source_data("https://github.com/itamuria/immunoeasy/blob/master/data/Selected_genes.RData?raw=true")
```
### From variant callers to how many
In this case the function takes an Excel file with several columns and counts how many variant callers found each mutation. At least four columns are needed: chromosome, position, gene name and variant caller. We also need to specify the names of the variant callers that were used. To carry along more information, such as the VAF or other columns, pass their column indices as well.
```{r}
# Example_VariantCallers_PerMutation <- openxlsx::read.xlsx("Example_VariantCallers_PerMutation.xlsx")
# save(Example_VariantCallers_PerMutation, file = "Example_VariantCallers_PerMutation.RData")
repmis::source_data("https://github.com/itamuria/immunoeasy/blob/master/data/Example_VariantCallers_PerMutation.RData?raw=true")
openxlsx::write.xlsx(Example_VariantCallers_PerMutation, "Example_VariantCallers_PerMutation.xlsx")
Example_VariantCallers_PerMutation %>% kable() %>% kable_styling()
# Minimal call: only the four required columns, given as column indices
howmany <- varcall2HowMany(
  filename = "Example_VariantCallers_PerMutation.xlsx",
  chr_pos = 1, position = 2, gen_name = 3, varian_caller = 12,
  VAF = NA, others = NULL,
  var_cal_4 = c("mutect38", "somaticsniper", "strelka", "varscan")
)
# Same call, also keeping the VAF column and the extra columns 4 and 6 to 11
howmany2 <- varcall2HowMany(
  filename = "Example_VariantCallers_PerMutation.xlsx",
  chr_pos = 1, position = 2, gen_name = 3, varian_caller = 12,
  VAF = 5, others = c(4, 6:11),
  var_cal_4 = c("mutect38", "somaticsniper", "strelka", "varscan")
)
```
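The counting idea described above can be sketched in base R as follows. This is only an illustration of the technique, assuming one row per mutation and caller; the data frame `calls`, its column names and its values are hypothetical, and `varcall2HowMany` may work differently internally.

```r
# Made-up long-format table: one row per (mutation, variant caller) pair
calls <- data.frame(
  chr    = c("chr1", "chr1", "chr1", "chr2", "chr2"),
  pos    = c(100, 100, 100, 200, 200),
  gene   = c("GENE_A", "GENE_A", "GENE_A", "GENE_B", "GENE_B"),
  caller = c("mutect38", "strelka", "varscan", "mutect38", "somaticsniper"),
  stringsAsFactors = FALSE
)

# Count the distinct callers that reported each chromosome/position/gene combination
how_many <- aggregate(caller ~ chr + pos + gene, data = calls,
                      FUN = function(x) length(unique(x)))
names(how_many)[names(how_many) == "caller"] <- "n_callers"
print(how_many)

# Presence/absence matrix: mutations in rows, callers in columns
presence <- table(paste(calls$chr, calls$pos, sep = ":"), calls$caller) > 0
print(presence)
```

A mutation found by more callers is usually treated as more reliable, so `n_callers` can be used as a filter before neoantigen selection.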
### Merge all FPKM values together
```{r}
repmis::source_data("https://github.com/itamuria/immunoeasy/blob/master/data/fpkm.RData?raw=true")
final_dataframe <- four_counter2summary (semi_subread = subread_fpkm, semi_cuff = cuff_fpkm,
semi_quant = quant_fpkm, semi_htseq = htseq_fpkm,
ngenes = selected_genes$Symbol,
export_excel_name = "20200207_four_together_counts_pac5_10619.xlsx",
save_final = TRUE, dif_cuartiles = FALSE)
```
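As a rough illustration of what merging per-tool FPKM tables by gene symbol involves, the sketch below joins four small data frames with base R. The column names, gene symbols and values are hypothetical, and `four_counter2summary` may combine its inputs differently.

```r
# Hypothetical per-tool tables; real immunoeasy outputs may have other columns
subread <- data.frame(Symbol = c("GENE_A", "GENE_B"), subread = c(1.5, 10.0))
cuff    <- data.frame(Symbol = c("GENE_A", "GENE_B"), cuff    = c(1.7, 9.5))
quant   <- data.frame(Symbol = c("GENE_A", "GENE_C"), quant   = c(1.4, 3.2))
htseq   <- data.frame(Symbol = c("GENE_A", "GENE_B"), htseq   = c(1.6, 10.2))

# Full outer join on Symbol: genes missing from a tool get NA in that tool's column
merged <- Reduce(function(x, y) merge(x, y, by = "Symbol", all = TRUE),
                 list(subread, cuff, quant, htseq))

# Keep only the genes of interest (the role played by ngenes above)
wanted <- c("GENE_A", "GENE_C")
merged_selected <- merged[merged$Symbol %in% wanted, ]
print(merged_selected)
```

Using `all = TRUE` keeps genes that only some tools report, which makes disagreements between quantifiers visible as `NA` values rather than silently dropping rows.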