first half of slides added
3mmaRand committed Oct 23, 2023
1 parent ea01e25 commit e91cb72
Showing 8 changed files with 1,229 additions and 26 deletions.
Binary file added _site/omics/week-5/images/Xenbase-Logo-Medium.png
Binary file added _site/omics/week-5/meta/xenbase_info.xlsx
Binary file not shown.
958 changes: 958 additions & 0 deletions _site/omics/week-5/study_before_workshop.html

Large diffs are not rendered by default.

161 changes: 161 additions & 0 deletions _site/search.json
@@ -1417,5 +1417,166 @@
"Week 4: Statistical Analysis",
"Prepare!"
]
},
{
"objectID": "omics/week-5/study_before_workshop.html#overview",
"href": "omics/week-5/study_before_workshop.html#overview",
"title": "Independent Study to prepare for workshop",
"section": "Overview",
"text": "Overview\nIn these slides we will:\n\n\nCheck where you are\n\nlearn some concepts used in omics visualisation\n\nPrincipal Component Analysis (PCA)\nVolcano plots\nHeatmaps\n\n\nFind out what packages to install before the workshop"
},
{
"objectID": "omics/week-5/study_before_workshop.html#what-we-did-in-omics-2-statistical-analysis",
"href": "omics/week-5/study_before_workshop.html#what-we-did-in-omics-2-statistical-analysis",
"title": "Independent Study to prepare for workshop",
"section": "What we did in Omics 2: Statistical Analysis",
"text": "What we did in Omics 2: Statistical Analysis\n\n\ncarried out differential expression analysis\nfound genes not expressed at all, or expressed in one group only\nSaved results files"
},
{
"objectID": "omics/week-5/study_before_workshop.html#where-should-you-be-1",
"href": "omics/week-5/study_before_workshop.html#where-should-you-be-1",
"title": "Independent Study to prepare for workshop",
"section": "Where should you be?",
"text": "Where should you be?\nAfter the Omics 2: 👋 Statistical Analysis Workshop including:\n\n🤗 Look after future you! and\nthe Independent Study to consolidate, you should have:"
},
{
"objectID": "omics/week-5/study_before_workshop.html#frogs",
"href": "omics/week-5/study_before_workshop.html#frogs",
"title": "Independent Study to prepare for workshop",
"section": "🐸 Frogs",
"text": "🐸 Frogs\n\n\nAn RStudio Project called frogs-88H which contains:\n\nRaw data (S14, S20 and S30)\nProcessed data (s30_filtered.csv, s30_summary_gene.csv, s30_summary_gene_filtered.csv, s30_summary_samp.csv and equivalents for S14 OR S20)\nResults files (s30_fgf_only.csv, S30_normalised_counts.csv, S30_results.csv and equivalents for S14 OR S20)\n\nTwo scripts called cont-fgf-s30.R and either cont-fgf-s20.R OR cont-fgf-s14.R\n\n\n\n\n\nFiles should be organised into folders. Code should be well commented and easy to read."
},
{
"objectID": "omics/week-5/study_before_workshop.html#mice",
"href": "omics/week-5/study_before_workshop.html#mice",
"title": "Independent Study to prepare for workshop",
"section": "🐭 Mice",
"text": "🐭 Mice\n\n\nAn RStudio Project called mice-88H which contains\n\nRaw data (hspc, prog, lthsc)\nProcessed data (hspc_summary_gene.csv, hspc_summary_samp.csv, prog_summary_gene.csv, prog_summary_samp.csv, lthsc_summary_gene.csv, lthsc_summary_samp.csv)\n\n\nResults files (prog_hspc_results.csv and an equivalent for lthsc vs prog or hspc vs lthsc)\nTwo scripts called hspc-prog.R and either hspc-lthsc.R OR prog-lthsc.R\n\n\n\nFiles should be organised into folders. Code should be well commented and easy to read."
},
{
"objectID": "omics/week-5/study_before_workshop.html#section",
"href": "omics/week-5/study_before_workshop.html#section",
"title": "Independent Study to prepare for workshop",
"section": "🍂",
"text": "🍂\nEither of the other examples."
},
{
"objectID": "omics/week-5/study_before_workshop.html#if-you-do-not-have-those",
"href": "omics/week-5/study_before_workshop.html#if-you-do-not-have-those",
"title": "Independent Study to prepare for workshop",
"section": "If you do not have those",
"text": "If you do not have those\nGo through:\n\nOmics 2: Statistical Analysis including:\n🤗 Look after future you! and\nthe Independent Study to consolidate"
},
{
"objectID": "omics/week-5/study_before_workshop.html#examine-the-results-files-1",
"href": "omics/week-5/study_before_workshop.html#examine-the-results-files-1",
"title": "Independent Study to prepare for workshop",
"section": "Examine the results files",
"text": "Examine the results files\nRemind yourself of the key columns you have in the results files:\n\na fold change, logged to base 2\nan unadjusted p-value\na p value adjusted for multiple testing (FDR or padj)\na gene id"
},
{
"objectID": "omics/week-5/study_before_workshop.html#frogs-1",
"href": "omics/week-5/study_before_workshop.html#frogs-1",
"title": "Independent Study to prepare for workshop",
"section": "🐸 Frogs",
"text": "🐸 Frogs\n\n\nRows: 10,136\nColumns: 7\n$ baseMean <dbl> 237.553928, 531.565700, 86.392830, 49.813502, 419.9983…\n$ log2FoldChange <dbl> 0.096601855, -0.089588528, -0.192811203, -0.008858703,…\n$ lfcSE <dbl> 0.2079396, 0.1557384, 0.3253216, 0.4342614, 0.1685420,…\n$ stat <dbl> 0.46456683, -0.57525007, -0.59267874, -0.02039947, -0.…\n$ pvalue <dbl> 0.64224169, 0.56512218, 0.55339617, 0.98372471, 0.8699…\n$ padj <dbl> 0.9998970, 0.9998970, 0.9998970, 0.9998970, 0.9998970,…\n$ xenbase_gene_id <chr> \"XB-GENE-1000007\", \"XB-GENE-1000023\", \"XB-GENE-1000062…\n\n\n\n\nbaseMean is the mean of the normalised counts for the gene across all samples\n\nlfcSE standard error of the fold change\n\nstat is the test statistic (the Wald statistic)"
},
{
"objectID": "omics/week-5/study_before_workshop.html#mice-1",
"href": "omics/week-5/study_before_workshop.html#mice-1",
"title": "Independent Study to prepare for workshop",
"section": "🐭 Mice",
"text": "🐭 Mice\n\n\nRows: 280\nColumns: 6\n$ Top <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,…\n$ p.value <dbl> 7.038138e-117, 4.736622e-90, 1.832630e-88, 4.211954e-7…\n$ FDR <dbl> 1.970679e-114, 6.631271e-88, 1.710455e-86, 2.948368e-7…\n$ summary.logFC <dbl> 1.596910, 3.035165, 3.261056, -2.146491, -3.056730, 3.…\n$ logFC.hspc <dbl> 1.596910, 3.035165, 3.261056, -2.146491, -3.056730, 3.…\n$ ensembl_gene_id <chr> \"ENSMUSG00000028639\", \"ENSMUSG00000024053\", \"ENSMUSG00…\n\n\n\nTop is the rank of the gene ordered by the p value (smallest first)"
},
{
"objectID": "omics/week-5/study_before_workshop.html#from-xenbase",
"href": "omics/week-5/study_before_workshop.html#from-xenbase",
"title": "Independent Study to prepare for workshop",
"section": "from xenbase",
"text": "from xenbase\n\nxenbase logo\nXenbase (http://www.xenbase.org/, RRID:SCR_003280)\nXenbase is a model organism database that provides genomic, molecular, and developmental biology information about Xenopus laevis and Xenopus tropicalis. Xenbase is funded by the National Institutes of Health (NIH) and the National Science Foundation (NSF).\nOur data give the Xenbase gene id, so we are using Xenbase to get the information. A lot of the information would also be in the NCBI."
},
{
"objectID": "omics/week-5/study_before_workshop.html#from-the-ncbi",
"href": "omics/week-5/study_before_workshop.html#from-the-ncbi",
"title": "Independent Study to prepare for workshop",
"section": "from the ncbi",
"text": "from the ncbi\nbiomart is a package that allows you to get information from the ncbi database such as gene names and descriptions"
},
{
"objectID": "omics/week-5/study_before_workshop.html#plots-purpose",
"href": "omics/week-5/study_before_workshop.html#plots-purpose",
"title": "Independent Study to prepare for workshop",
"section": "plots purpose",
"text": "plots purpose\ndimension reduction"
},
{
"objectID": "omics/week-5/study_before_workshop.html#pca",
"href": "omics/week-5/study_before_workshop.html#pca",
"title": "Independent Study to prepare for workshop",
"section": "pca",
"text": "pca\nlots of variables"
},
{
"objectID": "omics/week-5/study_before_workshop.html#tsne",
"href": "omics/week-5/study_before_workshop.html#tsne",
"title": "Independent Study to prepare for workshop",
"section": "tsne",
"text": "tsne\nlots of variables and lots of observations"
},
{
"objectID": "omics/week-5/study_before_workshop.html#normalising-before-plotting",
"href": "omics/week-5/study_before_workshop.html#normalising-before-plotting",
"title": "Independent Study to prepare for workshop",
"section": "normalising before plotting",
"text": "normalising before plotting\nlog\nnormalisation: regularised log (rlog) is a method to reduce bias from low count genes. https://hbctraining.github.io/DGE_workshop_salmon_online/lessons/03_DGE_QC_analysis.html gives a good explanation of the regularized log transform (rlog)\nThe rlog transformation of the normalized counts is only necessary for these visualization methods during this quality assessment. It is not used for DE because DESeq2 takes care of that\nin the workshop we will just log transform\n\nThe 🐭 mouse data have been normalised to simplify the analysis for you; the 🐸 frog data have not but the DE method will do this for you."
},
{
"objectID": "omics/week-5/study_before_workshop.html#packages-to-install-before-the-workshop",
"href": "omics/week-5/study_before_workshop.html#packages-to-install-before-the-workshop",
"title": "Independent Study to prepare for workshop",
"section": "Packages to install before the workshop",
"text": "Packages to install before the workshop\nheatmaply and ggrepel from CRAN in the normal way:\n\ninstall.packages(\"heatmaply\")\ninstall.packages(\"ggrepel\")\n\nbiomaRt from Bioconductor using BiocManager:\n\nBiocManager::install(\"biomaRt\")"
},
{
"objectID": "omics/week-5/study_before_workshop.html#workshops-1",
"href": "omics/week-5/study_before_workshop.html#workshops-1",
"title": "Independent Study to prepare for workshop",
"section": "Workshops",
"text": "Workshops\n\nOmics 1: Hello data Getting to know the data. Checking the distributions of values\nOmics 2: Statistical Analysis Identifying which genes are differentially expressed between treatments.\nOmics 3: Visualising and Interpreting. PCA, Volcano plots and heatmaps to visualise results. Interpreting the results and finding out more about genes of interest."
},
{
"objectID": "omics/week-5/study_before_workshop.html#references",
"href": "omics/week-5/study_before_workshop.html#references",
"title": "Independent Study to prepare for workshop",
"section": "References",
"text": "References\n\n\n🔗 About Omics 3: Visualising and Interpreting"
},
{
"objectID": "omics/week-5/study_before_workshop.html#adding-gene-information-1",
"href": "omics/week-5/study_before_workshop.html#adding-gene-information-1",
"title": "Independent Study to prepare for workshop",
"section": "Adding gene information",
"text": "Adding gene information\n\n\nThe gene id is difficult to interpret in plots/tables\nTherefore we need to add information such as the gene name and a description to the results\nFor the 🐸 Frog data information comes from xenbase\nFor the 🐭 Mice data information comes from Ensembl"
},
{
"objectID": "omics/week-5/study_before_workshop.html#xenbase",
"href": "omics/week-5/study_before_workshop.html#xenbase",
"title": "Independent Study to prepare for workshop",
"section": "🐸 Xenbase",
"text": "🐸 Xenbase\n\nxenbase logoXenbase\nXenbase is a model organism database that provides genomic, molecular, and developmental biology information about Xenopus laevis and Xenopus tropicalis.\nIt took me some time to find the information you need."
},
{
"objectID": "omics/week-5/study_before_workshop.html#xenbase-1",
"href": "omics/week-5/study_before_workshop.html#xenbase-1",
"title": "Independent Study to prepare for workshop",
"section": "🐸 Xenbase",
"text": "🐸 Xenbase\n\n\nI got the information from the Xenbase information pages under Data Reports | Gene Information\nThis is listed: Xenbase Gene Product Information [readme] gzipped gpi (tab separated)\nClick on the readme link to see the file format and columns\nI downloaded xenbase.gpi.gz, unzipped it, removed header lines and the Xenopus tropicalis (taxon:8364) entries and saved it as xenbase_info.xlsx\nIn the workshop you will merge this information with the results file"
},
{
"objectID": "omics/week-5/study_before_workshop.html#ensembl",
"href": "omics/week-5/study_before_workshop.html#ensembl",
"title": "Independent Study to prepare for workshop",
"section": "🐭 Ensembl",
"text": "🐭 Ensembl\nbiomaRt is a Bioconductor package that allows you to get information such as gene names and descriptions from Ensembl through BioMart"
}
]
2 changes: 1 addition & 1 deletion _site/site_libs/quarto-html/quarto-html.min.css

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion omics/week-5/overview.qmd
@@ -5,7 +5,7 @@ toc: true
toc-location: right
---

This week we cover how to visualise and interpret the results of your differential expression analysis. The independent study will allow you to check you have what you should have following the [Omics 2: Statistical Analysis workshop](../week-4/workshop.html) and [Consolidation study](../week-4/study_after_workshop.html). It will also summarise the methods and plots we will go through in the workshop. In the workshop, we will learn how to conduct a Principal Component Analysis (PCA) and plot the results as well as how to create a nicely formatted Volcano plot and heatmap. We will also consider three factors that help us choose an interesting/important gene: the absolute expression, the fold change and the adjusted p-value.
This week we cover how to visualise and interpret the results of your differential expression analysis. The independent study will allow you to check you have what you should have following the [Omics 2: Statistical Analysis workshop](../week-4/workshop.html) and [Consolidation study](../week-4/study_after_workshop.html). It will also summarise the methods and plots we will go through in the workshop. In the workshop, we will learn how to conduct a Principal Component Analysis (PCA) and plot the results as well as how to create a nicely formatted Volcano plot and heatmap.

We suggest you sit together with your group in the workshop.

130 changes: 107 additions & 23 deletions omics/week-5/study_before_workshop.qmd
@@ -15,16 +15,25 @@ editor:
wrap: 72
---

```{r}
#| include: false
library(tidyverse)
```


## Overview

In these slides we will:

::: incremental
- Check where you are

- learn some concepts
- learn some concepts used in omics visualisation

-
- Principal Component Analysis (PCA)
- Volcano plots
- Heatmaps

- Find out what packages to install before the workshop
:::
@@ -34,13 +43,14 @@ In these slides we will:
## What we did in Omics 2: Statistical Analysis

::: incremental
::: {style="font-size: 90%;"}
-

-

- Saved files .
:::
- carried out differential expression analysis

- found genes not expressed at all, or expressed in one group only

- Saved results files

:::

## Where should you be?
@@ -56,13 +66,16 @@ Workshop](../week-4/workshop.html) including:

## 🐸 Frogs

::: {style="font-size: 90%;"}
::: {style="font-size: 70%;"}

- An RStudio Project called `frogs-88H` which contains:
- Raw data (S14, S20 and S30)
- Processed data (`s30_filtered.csv`, `s30_summary_gene.csv`,
`s30_summary_gene_filtered.csv`, `s30_summary_samp.csv` and
equivalents for S14 *OR* S20)
- Two scripts called `cont-fgf-s30.R` and `cont-fgf-s20.R` *OR*
- Results files (`s30_fgf_only.csv`, `S30_normalised_counts.csv`, `S30_results.csv` and
equivalents for S14 *OR* S20)
- Two scripts called `cont-fgf-s30.R` and either `cont-fgf-s20.R` *OR*
`cont-fgf-s14.R`
:::

@@ -71,12 +84,18 @@ easy to read.

## 🐭 Mice

::: {style="font-size: 70%;"}

- An RStudio Project called `mice-88H` which contains
- Raw data (hspc, prog, lthsc)
- Processed data (`hspc_summary_gene.csv`,
`hspc_summary_samp.csv`, `prog_summary_gene.csv`,
`prog_summary_samp.csv`)
- One script called `hspc-prog.R`
`prog_summary_samp.csv`, `lthsc_summary_gene.csv`,
`lthsc_summary_samp.csv`)
- Results files (`prog_hspc_results.csv` and an equivalent for lthsc vs prog or hspc vs lthsc)
- Two scripts called `hspc-prog.R` and either `hspc-lthsc.R` *OR*
`prog-lthsc.R`
:::

Files should be organised into folders. Code should be well commented and
easy to read.
@@ -101,33 +120,98 @@ Go through:

## Examine the results files

Remind yourself of the key columns you have in the results files:

- a fold change, logged to base 2
- an unadjusted p-value
- a p value adjusted for multiple testing (`FDR` or `padj`)
- a gene id
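
A minimal sketch of how these columns are typically used together, assuming
DESeq2-style column names (`log2FoldChange`, `padj`) as in the frog results.
The toy values, the gene ids and the cut-offs (adjusted p of 0.05 and a
two-fold change) are illustrative assumptions, not values taken from the
workshop.

```r
library(dplyr)

# Toy stand-in for a results file; ids and values are invented
results <- tibble(
  xenbase_gene_id = c("gene-a", "gene-b", "gene-c", "gene-d"),
  log2FoldChange  = c(2.5, -0.1, -1.8, 3.0),
  padj            = c(0.001, 0.90, 0.03, NA)
)

# Keep genes with adjusted p <= 0.05 and at least a two-fold change
# in either direction (|log2 fold change| >= 1).
# filter() drops rows where the condition evaluates to NA.
sig <- results |>
  filter(padj <= 0.05, abs(log2FoldChange) >= 1) |>
  # back-transform the log2 fold change to a plain ratio
  mutate(fold_change = 2^log2FoldChange)

sig
```

For the mouse results the equivalent columns would be `FDR` and
`summary.logFC`.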


## 🐸 Frogs

```{r}
#| echo: false
read_csv("results/s30_results.csv") |> glimpse()
```
- `baseMean` is the mean of the normalised counts for the gene across
all samples
- `lfcSE` is the standard error of the log2 fold change
- `stat` is the test statistic (the Wald statistic)




## 🐭 Mice
```{r}
#| echo: false
read_csv("results/prog_hspc_results.csv") |> glimpse()
```

- `Top` is the rank of the gene ordered by the p value (smallest first)




# Adding gene information

## from xenbase
## Adding gene information

::: incremental

- The gene id is difficult to interpret in plots/tables

- Therefore we need to add information such as the gene name and a description to the results

- For the 🐸 Frog data information comes from xenbase

![xenbase logo](images/Xenbase-Logo-Medium.png){width="700"}
- For the 🐭 Mice data information comes from Ensembl

:::

## 🐸 Xenbase

Xenbase (http://www.xenbase.org/, RRID:SCR_003280)

Xenbase is a model organism database that provides genomic, molecular,
and developmental biology information about Xenopus laevis and Xenopus
tropicalis. Xenbase is funded by the National Institutes of Health
(NIH) and the National Science Foundation (NSF).
![xenbase logo](images/Xenbase-Logo-Medium.png){width="800"}

Our data give the Xenbase gene id, so we are using Xenbase to get the
information. A lot of the information would also be in the NCBI.

## from the ncbi
[Xenbase](http://www.xenbase.org/) is a model organism database that provides genomic, molecular, and developmental biology information about *Xenopus laevis* and *Xenopus tropicalis*.

biomart is a package that allows you to get information from the ncbi
database such as gene names and descriptions
. . .

It took me some time to find the information you need.


## 🐸 Xenbase

::: incremental

- I got the information from the [Xenbase information pages](https://www.xenbase.org/xenbase/static-xenbase/ftpDatafiles.jsp) under Data Reports | Gene Information

- This is listed: Xenbase Gene Product Information [readme] [gzipped gpi (tab separated)](https://download.xenbase.org/xenbase/GenePageReports/xenbase.gpi.gz)

- Click on the readme link to see the file format and columns

- I downloaded [xenbase.gpi.gz](https://download.xenbase.org/xenbase/GenePageReports/xenbase.gpi.gz), unzipped it, removed header lines and the *Xenopus tropicalis* (taxon:8364) entries and saved it as [xenbase_info.xlsx](meta/xenbase_info.xlsx)

- In the workshop you will merge this information with the results file
:::
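
The merge mentioned in the last bullet could look something like the sketch
below. It assumes both tables share a `xenbase_gene_id` column; the
`gene_name` and `description` columns, the ids and all the values are
hypothetical, and the real column names in `xenbase_info.xlsx` may differ.

```r
library(dplyr)

# Toy results table; ids and values are invented
results <- tibble(
  xenbase_gene_id = c("XB-GENE-A", "XB-GENE-B", "XB-GENE-C"),
  log2FoldChange  = c(1.2, -0.4, 2.1),
  padj            = c(0.01, 0.80, 0.002)
)

# Toy gene information table; column names are hypothetical
gene_info <- tibble(
  xenbase_gene_id = c("XB-GENE-A", "XB-GENE-B"),
  gene_name       = c("name-a", "name-b"),
  description     = c("description a", "description b")
)

# left_join() keeps every row of the results and adds the matching
# gene information; genes with no match get NA in the new columns
annotated <- results |>
  left_join(gene_info, by = "xenbase_gene_id")

annotated
```

A left join is used so that no result is lost when a gene id has no entry in
the information table.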

## 🐭 Ensembl

::: incremental

- [Ensembl](https://www.ensembl.org/index.html) creates, integrates and distributes reference datasets and analysis tools that enable genomics

- [BioMart](https://grch37.ensembl.org/info/data/biomart/index.html) provides access to these large datasets

- **`biomaRt`** is a Bioconductor package that gives you programmatic access to BioMart.

- In the workshop you will use this package to get information you can merge with the results file
:::
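
A rough sketch of what a **`biomaRt`** query can look like, assuming the
package is installed and you have an internet connection. The helper name
`get_gene_info` is hypothetical, and the choice of attributes is an
assumption about what will be useful; the workshop may do this differently.

```r
# Hypothetical helper: look up names and descriptions for mouse
# Ensembl gene ids. Nothing is downloaded until the function is called.
get_gene_info <- function(ids) {
  # connect to the Ensembl BioMart and select the mouse gene dataset
  ensembl <- biomaRt::useEnsembl(
    biomart = "ensembl",
    dataset = "mmusculus_gene_ensembl"
  )
  # ask for the gene id, gene name and description of the given ids
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "external_gene_name", "description"),
    filters    = "ensembl_gene_id",
    values     = ids,
    mart       = ensembl
  )
}
```

Calling `get_gene_info(c("ENSMUSG00000028639", "ENSMUSG00000024053"))` should
return a data frame with one row per gene found, which can then be joined to
the results on `ensembl_gene_id`.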



2 changes: 1 addition & 1 deletion omics/week-5/workshop.qmd
@@ -138,7 +138,7 @@ If you click on the readme link you can see information telling you that the fil

🎬 ......

```{bash}
```bash
gunzip xenbase.gpi.gz
less xenbase.gpi
q