README.Rmd

---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# immunoeasy

<!-- badges: start -->
<!-- badges: end -->

The goal of immunoeasy is to make immunologist life easier with clear functions. 

## Installation

You can install the released version of immunoeasy from [Github](https://github.com/itamuria/immunoeasy) with:

``` r

# library(devtools)
# devtools::install_github("itamuria/immunoeasy")
library(remotes)
remotes::install_github("itamuria/immunoeasy")
```

## Obtain information

If we want to know information about an ensemble id we can use the ens2symbol function. For example if we want to know the information about ENSG00000000003 we should do in the next way:


```{r example}
library(immunoeasy)
library(knitr)
library(dplyr)
library(kableExtra)
## basic example code

example_ensg <- ens2symbol(ens_ids = c("ENSG00000000003","ENSG00000184389"))

example_ensg %>% 
    kable() %>%
    kable_styling()


```

### From vcf to different analysis

Variant callers create vcf files. This files have a defined structure and usually it is used for downstream analysis. Here we can do:

- From vcf to excel
- From vcf to potential neoantigen selection

```{r vcf2x}


```


### From counts to fpkm from htseq files

If we have the htseq file and we want to obtain the fpmk and cuartiles. mfl_number is the average size of the reads. In this case it is 569.

```{r count2fpkm_htseq}

repmis::source_data("https://github.com/itamuria/immunoeasy/blob/master/data/immunoeasy_counts.RData?raw=true")

names(count_list)

for(c in 1:length(count_list))
{
  if(c == 4) 
  {
    write.table(count_list[[c]], file = paste0(names(count_list)[c],".txt"),row.names = FALSE, sep = "\t")
  } else if(c == 3)  
    {
    write.table(count_list[[c]], file = paste0(names(count_list)[c],".txt"), row.names = TRUE)
    }
  else {
    write.table(count_list[[c]], file = paste0(names(count_list)[c],".txt"),row.names = FALSE)
  }
}

biomaRt::biomartCacheClear() 
mfl_num_z <- 569
#htseq_fpkm <- counts2fpkm_htseq (filename = "htseq_counts.txt", mfl_num = c(mfl_num_z))

```

### From counts to fpkm from quant3 files

If we have the htseq file and we want to obtain the fpmk and cuartiles. mfl_number is the average size of the reads. In this case it is 569.

```{r count2fpkm_quant3}


quant_fpkm <- counts2fpkm_quant (filename = "quant_counts.txt", mfl_num = c(mfl_num_z))


```

### From counts to fpkm from subread files

If we have the htseq file and we want to obtain the fpmk and cuartiles. mfl_number is the average size of the reads. In this case it is 569.

```{r count2fpkm_subread}

subread_fpkm <- counts2fpkm_subread (filename = "subread_counts.txt", mfl_num = c(mfl_num_z))


```


### From counts to fpkm from cuff files

If we have the htseq file and we want to obtain the fpmk and cuartiles. mfl_number is the average size of the reads. In this case it is 569.

```{r count2fpkm_cuff, previous_clean = TRUE}

cuff_fpkm <- counts2fpkm_cuff (filename = "cufflink_fpkm.txt", previous_clean = TRUE)

save(htseq_fpkm, quant_fpkm, subread_fpkm,cuff_fpkm, file="fpkm.RData")

```

### From Variant caller to how many

```{r}
# selected_genes <- openxlsx::read.xlsx("data/Pac19_four_together4.xlsx")
# save(selected_genes, file = "Selected_genes.RData")

repmis::source_data("https://github.com/itamuria/immunoeasy/blob/master/data/Selected_genes.RData?raw=true")


```
### From variant callers to how many

In this case the function take an excel with several columns and count in how many variant callers are found the mutations. At least we need 4 columns: chromosome, position, gen_name and variant caller. Furthermore, we need to specify the names of the used variant callers. If we want to include more information as VAF and others we should include it. 


```{r}

# Example_VariantCallers_PerMutation <- openxlsx::read.xlsx("Example_VariantCallers_PerMutation.xlsx")
# save(Example_VariantCallers_PerMutation, file = "Example_VariantCallers_PerMutation.RData")

repmis::source_data("https://github.com/itamuria/immunoeasy/blob/master/data/Example_VariantCallers_PerMutation.RData?raw=true")

openxlsx::write.xlsx(Example_VariantCallers_PerMutation, "Example_VariantCallers_PerMutation.xlsx")

Example_VariantCallers_PerMutation %>% kable() %>% kable_styling()

howmany <- varcall2HowMany (filename = "Example_VariantCallers_PerMutation.xlsx", chr_pos = 1, position = 2, gen_name = 3, varian_caller = 12, VAF = NA, others = NULL,var_cal_4 = c("mutect38","somaticsniper", "strelka","varscan"))

howmany2 <- varcall2HowMany (filename = "Example_VariantCallers_PerMutation.xlsx", chr_pos = 1, position = 2, gen_name = 3, varian_caller = 12, VAF = 5, others = c(4,6:11),var_cal_4 = c("mutect38","somaticsniper", "strelka","varscan"))  
  
```

### Merge all fpkm together

```{r}

repmis::source_data("https://github.com/itamuria/immunoeasy/blob/master/data/fpkm.RData?raw=true")

final_dataframe <- four_counter2summary (semi_subread = subread_fpkm, semi_cuff = cuff_fpkm,
                      semi_quant = quant_fpkm, semi_htseq = htseq_fpkm,
                                ngenes = selected_genes$Symbol,
                                export_excel_name = "20200207_four_together_counts_pac5_10619.xlsx",
                                save_final = TRUE, dif_cuartiles = FALSE)
```