From 229736c4816c3db088e8a3d4dea00bbe6b469c5a Mon Sep 17 00:00:00 2001 From: Emma Rand Date: Tue, 24 Oct 2023 13:48:53 +0100 Subject: [PATCH 1/2] added frog heatmap to omics 3 workshop --- omics/week-5/Rplot001.jpg | Bin 0 -> 4227 bytes omics/week-5/workshop.qmd | 23 ++++++++++++++++++++--- 2 files changed, 20 insertions(+), 3 deletions(-) create mode 100644 omics/week-5/Rplot001.jpg diff --git a/omics/week-5/Rplot001.jpg b/omics/week-5/Rplot001.jpg new file mode 100644 index 0000000000000000000000000000000000000000..fd0b397d529171826b48cace63d93852a41d06bd GIT binary patch literal 4227 [binary image data omitted] select(starts_with("log2_")) |> as.matrix() ``` +🎬 Set the rownames to the Xenbase gene symbols: ```{r} rownames(mat) <- s30_results_sig0.01$xenbase_gene_symbol ``` +You might want to view the matrix by clicking on it in the environment pane. + + +🎬 Load the **`heatmaply`** package: ```{r} library(heatmaply) ``` +We need to tell the clustering algorithm how many clusters to create. We will set the number of clusters for the treatments to be 2 and the number of clusters for the genes to be the same since it makes sense to see what clusters of genes correlate with the treatments. 
+ +🎬 Set the number of clusters for the treatments and genes: + ```{r} n_treatment_clusters <- 2 n_gene_clusters <- 2 ``` + +🎬 Create the heatmap: ```{r} #| fig-height: 8 heatmaply(mat, scale = "row", - hide_colorbar = TRUE, k_col = n_treatment_clusters, k_row = n_gene_clusters, - label_names = c("Gene", "Sample", "Expression (normalised, log2)"), fontsize_row = 7, fontsize_col = 10, labCol = str_remove(colnames(mat), pattern = "log2_"), labRow = rownames(mat), heatmap_layers = theme(axis.line = element_blank())) ``` + +On the vertical axis are genes which are differentially expressed at the 0.01 level. On the horizontal axis are samples. We can see that the FGF-treated samples cluster together and the control samples cluster together. We can also see two clusters of genes: one (the pink cluster) contains genes upregulated (more yellow) in the FGF-treated samples and the other (the blue cluster) contains genes downregulated (more blue) in the FGF-treated samples. + +The heatmap will open in the viewer pane (rather than the plot pane) because it is HTML. You can "Show in a new window" to see it in a larger format. You can also zoom in and out and pan around the heatmap and download it as a PNG. You might feel the colour bar is not adding much to the plot. You can remove it by setting `hide_colorbar = TRUE` in the `heatmaply()` function. 
+ ## Visualise all the results with a volcano plot colour the points if padj \< 0.05 and log2FoldChange \> 1 From 2cb0568e11766e522f918799206b18a4b99b805d Mon Sep 17 00:00:00 2001 From: Emma Rand Date: Tue, 24 Oct 2023 16:09:12 +0100 Subject: [PATCH 2/2] added the heatmap for the mouse data to omics 3 --- omics/week-5/workshop.qmd | 37 ++++++++++++++++++++++++++----------- 1 file changed, 26 insertions(+), 11 deletions(-) diff --git a/omics/week-5/workshop.qmd b/omics/week-5/workshop.qmd index 28c8ba9..ed7d378 100644 --- a/omics/week-5/workshop.qmd +++ b/omics/week-5/workshop.qmd @@ -600,7 +600,7 @@ prog_hspc_results <- read_csv("results/prog_hspc_results.csv") ``` 🎬 Remind yourself what is in the rows and columns and the structure of -the dataframes (perhaps using `glimpse()`) +the dataframe (perhaps using `glimpse()`) ```{r} #| include: false @@ -855,12 +855,12 @@ ggsave("figures/prog_hspc-pca.png", ## Visualise the expression of the most significant genes using a heatmap -```{r} -library(heatmaply) -``` +A heatmap is a common way to visualise gene expression data. Often people will create heatmaps with thousands of genes but it can be more informative to use a subset along with clustering methods. We will use the genes which are significant at the 0.01 level. + +We are going to create an interactive heatmap with the **`heatmaply`** [@heatmaply] package. **`heatmaply`** takes a matrix as input so we need to convert a dataframe of the log~2~ values to a matrix. We will also set the rownames to the gene names. -we will use the most significant genes on a random subset of the cells -since \~1500 columns is a lot + +🎬 Convert a dataframe of the log~2~ values to a matrix. 
I have used `sample()` to select 70 random columns so the heatmap is generated quickly: ```{r} mat <- prog_hspc_results_sig0.01 |> @@ -869,32 +869,47 @@ mat <- prog_hspc_results_sig0.01 |> as.matrix() ``` + +🎬 Set the row names to the gene names: + ```{r} rownames(mat) <- prog_hspc_results_sig0.01$external_gene_name ``` +You might want to view the matrix by clicking on it in the environment pane. + +🎬 Load the **`heatmaply`** package: +```{r} +library(heatmaply) +``` + +We need to tell the clustering algorithm how many clusters to create. We will set the number of clusters for the cell types to be 2 and the number of clusters for the genes to be the same since it makes sense to see what clusters of genes correlate with the cell types. + ```{r} n_cell_clusters <- 2 n_gene_clusters <- 2 ``` + +🎬 Create the heatmap: + ```{r} heatmaply(mat, scale = "row", - hide_colorbar = TRUE, k_col = n_cell_clusters, k_row = n_gene_clusters, - label_names = c("Gene", "Cell id", "Expression (normalised, log2)"), fontsize_row = 7, fontsize_col = 10, labCol = colnames(mat), labRow = rownames(mat), heatmap_layers = theme(axis.line = element_blank())) ``` -will take a few mins to run, and longer to appear in the viewer -separation is not as strong as for the frog data run a few times to see -different subset +It will take a minute to run and display. On the vertical axis are genes which are differentially expressed at the 0.01 level. On the horizontal axis are cells. We can see that cells of the same type don't cluster that well together. We can also see two clusters of genes, but the pattern of genes is not as clear as it was for the frogs and the correspondence with the cell clusters is not as strong. + +The heatmap will open in the viewer pane (rather than the plot pane) because it is HTML. You can "Show in a new window" to see it in a larger format. You can also zoom in and out and pan around the heatmap and download it as a PNG. 
You might feel the colour bar is not adding much to the plot. You can remove it by setting `hide_colorbar = TRUE` in the `heatmaply()` function. + +Using all the cells is worth doing, but it will take a while to generate the heatmap and then show it in the viewer, so do it when you're ready for a coffee break. ## Visualise all the results with a volcano plot