diff --git a/core/core.html b/core/core.html
index a09a894..a66936a 100644
--- a/core/core.html
+++ b/core/core.html
@@ -350,7 +350,7 @@

Core Data Analysis

Published
-

28 March, 2024

+

29 March, 2024

diff --git a/core/week-1/overview.html b/core/week-1/overview.html
index 84047fd..d2bd87a 100644
--- a/core/week-1/overview.html
+++ b/core/week-1/overview.html
@@ -346,7 +346,7 @@

Overview

Published
-

28 March, 2024

+

29 March, 2024

diff --git a/core/week-1/study_after_workshop.html b/core/week-1/study_after_workshop.html
index ded915c..6827848 100644
--- a/core/week-1/study_after_workshop.html
+++ b/core/week-1/study_after_workshop.html
@@ -346,7 +346,7 @@

Independent Study to consolidate this week

Published
-

28 March, 2024

+

29 March, 2024

diff --git a/core/week-1/study_before_workshop.html b/core/week-1/study_before_workshop.html
index c14c6ff..4076be2 100644
--- a/core/week-1/study_before_workshop.html
+++ b/core/week-1/study_before_workshop.html
@@ -339,7 +339,7 @@

Independent Study to prepare for workshop

Published
-

28 March, 2024

+

29 March, 2024

diff --git a/core/week-1/workshop.html b/core/week-1/workshop.html
index 31bb733..cb32db9 100644
--- a/core/week-1/workshop.html
+++ b/core/week-1/workshop.html
@@ -400,7 +400,7 @@

Workshop

Published
-

28 March, 2024

+

29 March, 2024

diff --git a/core/week-11/overview.html b/core/week-11/overview.html
index e5cb601..d1805c0 100644
--- a/core/week-11/overview.html
+++ b/core/week-11/overview.html
@@ -346,7 +346,7 @@

Overview

Published
-

28 March, 2024

+

29 March, 2024

diff --git a/core/week-11/study_after_workshop.html b/core/week-11/study_after_workshop.html
index 5b47511..4525d86 100644
--- a/core/week-11/study_after_workshop.html
+++ b/core/week-11/study_after_workshop.html
@@ -339,7 +339,7 @@

Independent Study to consolidate this week

Published
-

28 March, 2024

+

29 March, 2024

diff --git a/core/week-11/study_before_workshop.html b/core/week-11/study_before_workshop.html
index c93adb1..3a48ebd 100644
--- a/core/week-11/study_before_workshop.html
+++ b/core/week-11/study_before_workshop.html
@@ -388,7 +388,7 @@

Independent Study to prepare for workshop

-

28 March, 2024

+

29 March, 2024

Module assessment

diff --git a/core/week-11/workshop.html b/core/week-11/workshop.html
index 862b2a3..4bcafbc 100644
--- a/core/week-11/workshop.html
+++ b/core/week-11/workshop.html
@@ -387,7 +387,7 @@

Workshop

Published
-

28 March, 2024

+

29 March, 2024

@@ -414,7 +414,7 @@

Workshop

  • how to insert special characters and equations
  • Exercise

    🎬 The example RStudio project containing this code here: chaffinch. You can download the project as a zip file from there but there is some code that will do that automatically for you. Since this is an RStudio Project, do not run the code from inside a project. You may want to navigate to a particular directory or edit the destdir:

    -
    usethis::use_course(url = "3mmaRand/chaffinch", destdir = ".")
    +
    usethis::use_course(url = "3mmaRand/chaffinch", destdir = ".")

    You can agree to deleting the zip. You should find RStudio restarts and you have a new project called chaffinch-xxxxxx. The xxxxxx is a commit reference - you do not need to worry about that, it is just a way to tell you which version of the repo you downloaded. You can now run the code in the project.

    🎬 Make an outline of your compendium. This could be a sketch on paper or slide or from the mindmap software you usually use. Or it could be a skeleton of folders and files on your computer.
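    If you prefer to sketch the skeleton directly from the terminal, `mkdir -p` and `touch` will do it. This is only an illustration - the folder and file names below are hypothetical and you should substitute your own structure:

    ```shell
    # Hypothetical compendium skeleton - adapt the names to your project
    mkdir -p compendium/data-raw compendium/data-processed compendium/figures
    touch compendium/README.md compendium/analysis.qmd
    ls compendium
    ```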

    🎬 Make a start on a quarto doc.

    diff --git a/core/week-2/overview.html b/core/week-2/overview.html
    index 708a51e..6aa0309 100644
    --- a/core/week-2/overview.html
    +++ b/core/week-2/overview.html
    @@ -337,7 +337,7 @@

    Overview

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/core/week-2/study_after_workshop.html b/core/week-2/study_after_workshop.html
    index fb73328..e55886d 100644
    --- a/core/week-2/study_after_workshop.html
    +++ b/core/week-2/study_after_workshop.html
    @@ -337,7 +337,7 @@

    Independent Study to consolidate this week

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/core/week-2/study_before_workshop.html b/core/week-2/study_before_workshop.html
    index 1833937..ca2742c 100644
    --- a/core/week-2/study_before_workshop.html
    +++ b/core/week-2/study_before_workshop.html
    @@ -448,7 +448,7 @@
    -

    28 March, 2024

    +

    29 March, 2024

    Overview

    Using the usethis package

    Otherwise

    If you want the project directory elsewhere, you will need to give the relative path, e.g.

    -
    usethis::create_project("../Documents/bananas")
    +
    usethis::create_project("../Documents/bananas")

    Using the usethis package

    The output will look like this and a new RStudio session will start.

    @@ -639,7 +639,7 @@

    🎬 We can add a README with:

    -
    usethis::use_readme_md()
    +
    usethis::use_readme_md()
    diff --git a/core/week-2/workshop.html b/core/week-2/workshop.html
    index 7d255c1..d3448ea 100644
    --- a/core/week-2/workshop.html
    +++ b/core/week-2/workshop.html
    @@ -414,7 +414,7 @@

    Workshop

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    @@ -502,16 +502,16 @@

    Workshop

    ls -l
    total 128
    -drwxr-xr-x 2 runner docker  4096 Mar 28 17:29 data
    -drwxr-xr-x 2 runner docker  4096 Mar 28 17:29 images
    --rw-r--r-- 1 runner docker  1597 Mar 28 17:29 overview.qmd
    --rw-r--r-- 1 runner docker   184 Mar 28 17:29 study_after_workshop.qmd
    --rw-r--r-- 1 runner docker  4807 Mar 28 17:29 study_before_workshop.ipynb
    --rw-r--r-- 1 runner docker 13029 Mar 28 17:29 study_before_workshop.qmd
    --rw-r--r-- 1 runner docker 58063 Mar 28 17:29 workshop.html
    --rw-r--r-- 1 runner docker  8550 Mar 28 17:29 workshop.qmd
    --rw-r--r-- 1 runner docker  8564 Mar 28 17:32 workshop.rmarkdown
    -drwxr-xr-x 3 runner docker  4096 Mar 28 17:29 workshop_files
    +drwxr-xr-x 2 runner docker  4096 Mar 29 15:16 data
    +drwxr-xr-x 2 runner docker  4096 Mar 29 15:16 images
    +-rw-r--r-- 1 runner docker  1597 Mar 29 15:16 overview.qmd
    +-rw-r--r-- 1 runner docker   184 Mar 29 15:16 study_after_workshop.qmd
    +-rw-r--r-- 1 runner docker  4807 Mar 29 15:16 study_before_workshop.ipynb
    +-rw-r--r-- 1 runner docker 13029 Mar 29 15:16 study_before_workshop.qmd
    +-rw-r--r-- 1 runner docker 58063 Mar 29 15:16 workshop.html
    +-rw-r--r-- 1 runner docker  8550 Mar 29 15:16 workshop.qmd
    +-rw-r--r-- 1 runner docker  8564 Mar 29 15:18 workshop.rmarkdown
    +drwxr-xr-x 3 runner docker  4096 Mar 29 15:16 workshop_files

    You can use more than one option at once. The -h option stands for “human readable” and makes the file sizes easier to understand for humans:

    @@ -519,16 +519,16 @@

    Workshop

    ls -hl
    total 128K
    -drwxr-xr-x 2 runner docker 4.0K Mar 28 17:29 data
    -drwxr-xr-x 2 runner docker 4.0K Mar 28 17:29 images
    --rw-r--r-- 1 runner docker 1.6K Mar 28 17:29 overview.qmd
    --rw-r--r-- 1 runner docker  184 Mar 28 17:29 study_after_workshop.qmd
    --rw-r--r-- 1 runner docker 4.7K Mar 28 17:29 study_before_workshop.ipynb
    --rw-r--r-- 1 runner docker  13K Mar 28 17:29 study_before_workshop.qmd
    --rw-r--r-- 1 runner docker  57K Mar 28 17:29 workshop.html
    --rw-r--r-- 1 runner docker 8.4K Mar 28 17:29 workshop.qmd
    --rw-r--r-- 1 runner docker 8.4K Mar 28 17:32 workshop.rmarkdown
    -drwxr-xr-x 3 runner docker 4.0K Mar 28 17:29 workshop_files
    +drwxr-xr-x 2 runner docker 4.0K Mar 29 15:16 data
    +drwxr-xr-x 2 runner docker 4.0K Mar 29 15:16 images
    +-rw-r--r-- 1 runner docker 1.6K Mar 29 15:16 overview.qmd
    +-rw-r--r-- 1 runner docker  184 Mar 29 15:16 study_after_workshop.qmd
    +-rw-r--r-- 1 runner docker 4.7K Mar 29 15:16 study_before_workshop.ipynb
    +-rw-r--r-- 1 runner docker  13K Mar 29 15:16 study_before_workshop.qmd
    +-rw-r--r-- 1 runner docker  57K Mar 29 15:16 workshop.html
    +-rw-r--r-- 1 runner docker 8.4K Mar 29 15:16 workshop.qmd
    +-rw-r--r-- 1 runner docker 8.4K Mar 29 15:18 workshop.rmarkdown
    +drwxr-xr-x 3 runner docker 4.0K Mar 29 15:16 workshop_files

    The -a option stands for “all” and shows us all the files, including hidden files.

    @@ -536,18 +536,18 @@

    Workshop

    ls -alh
    total 136K
    -drwxr-xr-x 5 runner docker 4.0K Mar 28 17:32 .
    -drwxr-xr-x 6 runner docker 4.0K Mar 28 17:29 ..
    -drwxr-xr-x 2 runner docker 4.0K Mar 28 17:29 data
    -drwxr-xr-x 2 runner docker 4.0K Mar 28 17:29 images
    --rw-r--r-- 1 runner docker 1.6K Mar 28 17:29 overview.qmd
    --rw-r--r-- 1 runner docker  184 Mar 28 17:29 study_after_workshop.qmd
    --rw-r--r-- 1 runner docker 4.7K Mar 28 17:29 study_before_workshop.ipynb
    --rw-r--r-- 1 runner docker  13K Mar 28 17:29 study_before_workshop.qmd
    --rw-r--r-- 1 runner docker  57K Mar 28 17:29 workshop.html
    --rw-r--r-- 1 runner docker 8.4K Mar 28 17:29 workshop.qmd
    --rw-r--r-- 1 runner docker 8.4K Mar 28 17:32 workshop.rmarkdown
    -drwxr-xr-x 3 runner docker 4.0K Mar 28 17:29 workshop_files
    +drwxr-xr-x 5 runner docker 4.0K Mar 29 15:18 .
    +drwxr-xr-x 6 runner docker 4.0K Mar 29 15:16 ..
    +drwxr-xr-x 2 runner docker 4.0K Mar 29 15:16 data
    +drwxr-xr-x 2 runner docker 4.0K Mar 29 15:16 images
    +-rw-r--r-- 1 runner docker 1.6K Mar 29 15:16 overview.qmd
    +-rw-r--r-- 1 runner docker  184 Mar 29 15:16 study_after_workshop.qmd
    +-rw-r--r-- 1 runner docker 4.7K Mar 29 15:16 study_before_workshop.ipynb
    +-rw-r--r-- 1 runner docker  13K Mar 29 15:16 study_before_workshop.qmd
    +-rw-r--r-- 1 runner docker  57K Mar 29 15:16 workshop.html
    +-rw-r--r-- 1 runner docker 8.4K Mar 29 15:16 workshop.qmd
    +-rw-r--r-- 1 runner docker 8.4K Mar 29 15:18 workshop.rmarkdown
    +drwxr-xr-x 3 runner docker 4.0K Mar 29 15:16 workshop_files

    You can move about with the cd command, which stands for “change directory”. You can use it to move into a directory by specifying the path to the directory:
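    For example, a minimal sketch of moving down into a directory and back up again (the directory name demo_dir is hypothetical):

    ```shell
    # Move into a directory, check where we are, then move back up
    mkdir -p demo_dir   # hypothetical directory for the demonstration
    cd demo_dir
    pwd                 # the path now ends in demo_dir
    cd ..               # .. refers to the parent directory
    pwd                 # back where we started
    ```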

    diff --git a/core/week-6/overview.html b/core/week-6/overview.html
    index 57919bc..ddd48e4 100644
    --- a/core/week-6/overview.html
    +++ b/core/week-6/overview.html
    @@ -345,7 +345,7 @@

    Overview

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/core/week-6/study_after_workshop.html b/core/week-6/study_after_workshop.html
    index 6432afe..03b40c6 100644
    --- a/core/week-6/study_after_workshop.html
    +++ b/core/week-6/study_after_workshop.html
    @@ -339,7 +339,7 @@

    Independent Study to consolidate this week

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/core/week-6/study_before_workshop.html b/core/week-6/study_before_workshop.html
    index 5a74746..a18da86 100644
    --- a/core/week-6/study_before_workshop.html
    +++ b/core/week-6/study_before_workshop.html
    @@ -339,7 +339,7 @@

    Independent Study to prepare for workshop

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/core/week-6/workshop.html b/core/week-6/workshop.html
    index 8e92fe4..1cde31b 100644
    --- a/core/week-6/workshop.html
    +++ b/core/week-6/workshop.html
    @@ -375,7 +375,7 @@

    Workshop

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/images/images.html b/images/images.html
    index bdb6150..6456b51 100644
    --- a/images/images.html
    +++ b/images/images.html
    @@ -218,7 +218,7 @@

    Image Data Analysis for Group Project

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    @@ -332,7 +332,7 @@

    Image Data Analysis for Group Project

    You now have a dataframe with all the tracking data which is relatively easy to summarise and plot using tools you know.

    There is an example RStudio project containing this code here: tips. You can also download the project as a zip file from there but there is some code that will do that automatically for you. Since this is an RStudio Project, do not run the code from inside a project. You may want to navigate to a particular directory or edit the destdir:

    -
    usethis::use_course(url = "3mmaRand/tips", destdir = ".")
    +
    usethis::use_course(url = "3mmaRand/tips", destdir = ".")

    You can agree to deleting the zip. You should find RStudio restarts and you have a new project called tips-xxxxxx. The xxxxxx is a commit reference - you do not need to worry about that, it is just a way to tell you which version of the repo you downloaded. You can now run the code in the project.

    diff --git a/index.html b/index.html
    index 9a0ad63..6f86eb2 100644
    --- a/index.html
    +++ b/index.html
    @@ -181,7 +181,7 @@

    Data Analysis for the Group Research Project

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/omics/kelly/workshop.html b/omics/kelly/workshop.html
    index db388b5..089166b 100644
    --- a/omics/kelly/workshop.html
    +++ b/omics/kelly/workshop.html
    @@ -382,7 +382,7 @@

    Workflow for VFA analysis

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    @@ -505,11 +505,16 @@

    Workflow for VFA analysis

    View vfa_delta to check it looks like vfa_cummul

    🎬 Join molecular weight to data and calculate g/l (mutate to convert to g/l * 0.001 * MW):

    +
    +
    vfa_delta <- vfa_delta |> 
    +  left_join(mol_wt, by = "vfa") |>
    +  mutate(conc_g_l = conc_mM * 0.001 * mw)
    +

    3. Calculate the percent representation of each VFA

    by mM and by weight

    🎬 Add a column which is the percent representation of each VFA for mM and g/l:

    -
    vfa_cummul <- vfa_cummul |> 
    +
    vfa_cummul <- vfa_cummul |> 
       group_by(sample_replicate, time_day) |> 
       mutate(percent_conc_g_l = conc_g_l / sum(conc_g_l) * 100,
              percent_conc_mM = conc_mM / sum(conc_mM) * 100)
    @@ -517,7 +522,7 @@

    Workflow for VFA analysis

    Graphs for info so far

    🎬 Make summary data for graphing

    -
    vfa_cummul_summary <- vfa_cummul |> 
    +
    vfa_cummul_summary <- vfa_cummul |> 
       group_by(treatment, time_day, vfa) |> 
       summarise(mean_g_l = mean(conc_g_l),
                 se_g_l = sd(conc_g_l)/sqrt(length(conc_g_l)),
    @@ -526,7 +531,7 @@ 

    Workflow for VFA analysis

    ungroup()
    -
    vfa_delta_summary <- vfa_delta |> 
    +
    vfa_delta_summary <- vfa_delta |> 
       group_by(treatment, time_day, vfa) |> 
       summarise(mean_g_l = mean(conc_g_l),
                 se_g_l = sd(conc_g_l)/sqrt(length(conc_g_l)),
    @@ -536,7 +541,7 @@ 

    Workflow for VFA analysis

    🎬 Graph the cumulative data, grams per litre:

    -
    vfa_cummul_summary |> 
    +
    vfa_cummul_summary |> 
       ggplot(aes(x = time_day, colour = vfa)) +
       geom_line(aes(y = mean_g_l), 
                 linewidth = 1) +
    @@ -560,7 +565,7 @@ 

    Workflow for VFA analysis

    🎬 Graph the change data, grams per litre:

    -
    vfa_delta_summary |> 
    +
    vfa_delta_summary |> 
       ggplot(aes(x = time_day, colour = vfa)) +
       geom_line(aes(y = mean_g_l), 
                 linewidth = 1) +
    @@ -584,7 +589,7 @@ 

    Workflow for VFA analysis

    🎬 Graph the mean percent representation of each VFA in g/l. Note geom_col() will plot proportions if we set position = "fill"

    -
    vfa_cummul_summary |> 
    +
    vfa_cummul_summary |> 
       ggplot(aes(x = time_day, y = mean_g_l, fill = vfa)) +
       geom_col(position = "fill") +
       scale_fill_viridis_d(name = NULL) +
    @@ -605,7 +610,7 @@ 

    Workflow for VFA analysis

    However, PCA expects a matrix with samples in rows and VFA, the variables, in columns. We will need to select the columns we need and pivot wider. Then convert to a matrix.

    🎬

    -
    vfa_cummul_pca <- vfa_cummul |> 
    +
    vfa_cummul_pca <- vfa_cummul |> 
       select(sample_replicate, 
              treatment, 
              replicate, 
    @@ -616,7 +621,7 @@ 

    Workflow for VFA analysis

    values_from = conc_g_l)
    -
    mat <- vfa_cummul_pca |> 
    +
    mat <- vfa_cummul_pca |> 
       ungroup() |>
       select(-sample_replicate, 
              -treatment, 
    @@ -626,13 +631,13 @@ 

    Workflow for VFA analysis

    🎬 Perform PCA on the matrix:

    -
    pca <- mat |>
    +
    pca <- mat |>
       prcomp(scale. = TRUE, 
              rank. = 4) 

    The scale. argument tells prcomp() to scale the data to have a mean of 0 and a standard deviation of 1. The rank. argument tells prcomp() to only calculate the first 4 principal components. This is useful for visualisation as we can only plot in 2 or 3 dimensions. We can see the results of the PCA by viewing the summary() of the pca object.

    -
    summary(pca)
    +
    summary(pca)
    Importance of first k=4 (out of 8) components:
                               PC1    PC2     PC3     PC4
    @@ -644,7 +649,7 @@ 

    Workflow for VFA analysis

    The Proportion of Variance tells us how much of the variance is explained by each component. We can see that the first component explains 0.7798 of the variance, the second 0.1018, and the third 0.07597. Together the first three components explain nearly 96% of the total variance in the data. Plotting PC1 against PC2 will capture about 78% of the variance which is likely much better than we would get plotting any two VFA against each other. To plot the PC1 against PC2 we will need to extract the PC1 and PC2 score from the pca object and add labels for the samples.

    🎬 Create a dataframe of the PC1 and PC2 scores which are in pca$x and add the sample information from vfa_cummul_pca:

    -
    pca_labelled <- data.frame(pca$x,
    +
    pca_labelled <- data.frame(pca$x,
                                sample_replicate = vfa_cummul_pca$sample_replicate,
                                treatment = vfa_cummul_pca$treatment,
                                replicate = vfa_cummul_pca$replicate,
    @@ -1281,7 +1286,7 @@ 

    Workflow for VFA analysis

    🎬 Plot PC1 against PC2 and colour by time and shape by treatment:

    -
    pca_labelled |> 
    +
    pca_labelled |> 
       ggplot(aes(x = PC1, y = PC2, 
                  colour = factor(time_day),
                  shape = treatment)) +
    @@ -1300,7 +1305,7 @@ 

    Workflow for VFA analysis

    🎬 Plot PC1 against PC2 and colour by time and facet treatment:

    -
    pca_labelled |> 
    +
    pca_labelled |> 
       ggplot(aes(x = PC1, y = PC2, colour = factor(time_day))) +
       geom_point(size = 3) +
       scale_colour_viridis_d(end = 0.95, begin = 0.15,
    @@ -1319,23 +1324,23 @@ 

    Workflow for VFA analysis

    We are going to create an interactive heatmap with the heatmaply (Galili et al. 2017) package. heatmaply takes a matrix as input so we can use mat

    🎬 Set the rownames to the sample id which is a combination of sample_replicate and time_day:

    -
    rownames(mat) <- interaction(vfa_cummul_pca$sample_replicate, 
    +
    rownames(mat) <- interaction(vfa_cummul_pca$sample_replicate, 
                                  vfa_cummul_pca$time_day)

    You might want to view the matrix by clicking on it in the environment pane.

    🎬 Load the heatmaply package:

    We need to tell the clustering algorithm how many clusters to create. We will set the number of clusters for the treatments to be 2 and the number of clusters for the vfa to be the same since it makes sense to see what clusters of genes correlate with the treatments.

    🎬 Set the number of clusters for the treatments and vfa:

    -
    n_treatment_clusters <- 2
    +
    n_treatment_clusters <- 2
     n_vfa_clusters <- 2

    🎬 Create the heatmap:

    -
    heatmaply(mat, 
    +
    heatmaply(mat, 
               scale = "column",
               k_col = n_vfa_clusters,
               k_row = n_treatment_clusters,
    @@ -1344,8 +1349,8 @@ 

    Workflow for VFA analysis

    labRow = rownames(mat), heatmap_layers = theme(axis.line = element_blank()))
    -
    -
    +
    +

    The heatmap will open in the viewer pane (rather than the plot pane) because it is html. You can “Show in a new window” to see it in a larger format. You can also zoom in and out and pan around the heatmap and download it as a png. You might feel the colour bar is not adding much to the plot. You can remove it by setting hide_colorbar = TRUE in the heatmaply() function.

    diff --git a/omics/omics.html b/omics/omics.html
    index 5547cdc..7adcce7 100644
    --- a/omics/omics.html
    +++ b/omics/omics.html
    @@ -351,7 +351,7 @@

    Omics Data Analysis for Group Project

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/omics/semester-2/workshop.html b/omics/semester-2/workshop.html
    index 667050d..1f471da 100644
    --- a/omics/semester-2/workshop.html
    +++ b/omics/semester-2/workshop.html
    @@ -226,7 +226,7 @@

    Workshop

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/omics/week-3/overview.html b/omics/week-3/overview.html
    index 99f2128..0404729 100644
    --- a/omics/week-3/overview.html
    +++ b/omics/week-3/overview.html
    @@ -328,7 +328,7 @@

    Overview

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/omics/week-3/study_after_workshop.html b/omics/week-3/study_after_workshop.html
    index 50a7a38..0588f96 100644
    --- a/omics/week-3/study_after_workshop.html
    +++ b/omics/week-3/study_after_workshop.html
    @@ -321,7 +321,7 @@

    Independent Study to consolidate this week

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/omics/week-3/study_before_workshop.html b/omics/week-3/study_before_workshop.html
    index e15cb16..9d87f53 100644
    --- a/omics/week-3/study_before_workshop.html
    +++ b/omics/week-3/study_before_workshop.html
    @@ -388,7 +388,7 @@

    Independent Study to prepare for workshop

    -

    28 March, 2024

    +

    29 March, 2024

    Overview

    diff --git a/omics/week-3/workshop.html b/omics/week-3/workshop.html
    index c3ce1b9..83288e0 100644
    --- a/omics/week-3/workshop.html
    +++ b/omics/week-3/workshop.html
    @@ -416,7 +416,7 @@

    Workshop

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/omics/week-4/overview.html b/omics/week-4/overview.html
    index 9e73544..43475d6 100644
    --- a/omics/week-4/overview.html
    +++ b/omics/week-4/overview.html
    @@ -349,7 +349,7 @@

    Overview

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/omics/week-4/study_after_workshop.html b/omics/week-4/study_after_workshop.html
    index 8495598..dc9ee3d 100644
    --- a/omics/week-4/study_after_workshop.html
    +++ b/omics/week-4/study_after_workshop.html
    @@ -321,7 +321,7 @@

    Independent Study to consolidate this week

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/omics/week-4/study_before_workshop.html b/omics/week-4/study_before_workshop.html
    index 629d6e4..7daf985 100644
    --- a/omics/week-4/study_before_workshop.html
    +++ b/omics/week-4/study_before_workshop.html
    @@ -446,7 +446,7 @@
    -

    28 March, 2024

    +

    29 March, 2024

    Overview

    In these slides we will:

    diff --git a/omics/week-4/workshop.html b/omics/week-4/workshop.html
    index e17fd12..210679e 100644
    --- a/omics/week-4/workshop.html
    +++ b/omics/week-4/workshop.html
    @@ -389,7 +389,7 @@

    Workshop

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/omics/week-5/figures/prog-hspc-volcano.png b/omics/week-5/figures/prog-hspc-volcano.png
    index 43a9d01..76fea74 100644
    Binary files a/omics/week-5/figures/prog-hspc-volcano.png and b/omics/week-5/figures/prog-hspc-volcano.png differ
    diff --git a/omics/week-5/overview.html b/omics/week-5/overview.html
    index eb7bdaf..fe78ad1 100644
    --- a/omics/week-5/overview.html
    +++ b/omics/week-5/overview.html
    @@ -329,7 +329,7 @@

    Overview

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/omics/week-5/study_after_workshop.html b/omics/week-5/study_after_workshop.html
    index 5afb330..54ad694 100644
    --- a/omics/week-5/study_after_workshop.html
    +++ b/omics/week-5/study_after_workshop.html
    @@ -321,7 +321,7 @@

    Independent Study to consolidate this week

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    diff --git a/omics/week-5/study_before_workshop.html b/omics/week-5/study_before_workshop.html
    index d25dc2f..6d4b81a 100644
    --- a/omics/week-5/study_before_workshop.html
    +++ b/omics/week-5/study_before_workshop.html
    @@ -446,7 +446,7 @@
    -

    28 March, 2024

    +

    29 March, 2024

    Overview

    In these slides we will:

    diff --git a/omics/week-5/workshop.html b/omics/week-5/workshop.html
    index 0de381b..484eaa9 100644
    --- a/omics/week-5/workshop.html
    +++ b/omics/week-5/workshop.html
    @@ -394,7 +394,7 @@

    Workshop

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    @@ -739,8 +739,8 @@

    Workshop

    labRow = rownames(mat), heatmap_layers = theme(axis.line = element_blank()))
    -
    -
    +
    +

    On the vertical axis are genes which are differentially expressed at the 0.01 level. On the horizontal axis are samples. We can see that the FGF-treated samples cluster together and the control samples cluster together. We can also see two clusters of genes; one of these shows genes upregulated (more yellow) in the FGF-treated samples (the pink cluster) and the other shows genes down regulated (more blue, the blue cluster) in the FGF-treated samples.

    @@ -1060,8 +1060,8 @@

    Workshop

    labRow = rownames(mat), heatmap_layers = theme(axis.line = element_blank()))
    -
    -
    +
    +

    It will take a minute to run and display. On the vertical axis are genes which are differentially expressed at the 0.01 level. On the horizontal axis are cells. We can see that cells of the same type don’t cluster that well together. We can also see two clusters of genes but the pattern of genes is not as clear as it was for the frogs and the correspondence with the cell clusters is not as strong.

    diff --git a/omics/week-5/workshop_files/figure-html/unnamed-chunk-33-1.png b/omics/week-5/workshop_files/figure-html/unnamed-chunk-33-1.png index c5cc85a..ffe6f0f 100644 Binary files a/omics/week-5/workshop_files/figure-html/unnamed-chunk-33-1.png and b/omics/week-5/workshop_files/figure-html/unnamed-chunk-33-1.png differ diff --git a/omics/week-5/workshop_files/figure-html/unnamed-chunk-65-1.png b/omics/week-5/workshop_files/figure-html/unnamed-chunk-65-1.png index 01878d4..2aadb57 100644 Binary files a/omics/week-5/workshop_files/figure-html/unnamed-chunk-65-1.png and b/omics/week-5/workshop_files/figure-html/unnamed-chunk-65-1.png differ diff --git a/search.json b/search.json index c081a37..304444a 100644 --- a/search.json +++ b/search.json @@ -2016,7 +2016,7 @@ "href": "core/week-2/workshop.html#rstudio-terminal", "title": "Workshop", "section": "RStudio terminal", - "text": "RStudio terminal\nThe RStudio terminal is a convenient interface to the shell without leaving RStudio. It is useful for running commands that are not available in R. For example, you can use it to run other programs like fasqc, git, ftp, ssh\nNavigating your file system\nSeveral commands are frequently used to create, inspect, rename, and delete files and directories.\n$\nThe dollar sign is the prompt (like > on the R console), which shows us that the shell is waiting for input.\nYou can find out where you are using the pwd command, which stands for “print working directory”.\n\npwd\n\n/home/runner/work/BIO00088H-data/BIO00088H-data/core/week-2\n\n\nYou can find out what you can see with ls which stands for “list”.\n\nls\n\ndata\nimages\noverview.qmd\nstudy_after_workshop.qmd\nstudy_before_workshop.ipynb\nstudy_before_workshop.qmd\nworkshop.html\nworkshop.qmd\nworkshop.rmarkdown\nworkshop_files\n\n\nYou might have noticed that unlike R, the commands do not have brackets after them. Instead, options (or switches) are given after the command. 
For example, we can modify the ls command to give us more information with the -l option, which stands for “long”.\n\nls -l\n\ntotal 128\ndrwxr-xr-x 2 runner docker 4096 Mar 28 17:29 data\ndrwxr-xr-x 2 runner docker 4096 Mar 28 17:29 images\n-rw-r--r-- 1 runner docker 1597 Mar 28 17:29 overview.qmd\n-rw-r--r-- 1 runner docker 184 Mar 28 17:29 study_after_workshop.qmd\n-rw-r--r-- 1 runner docker 4807 Mar 28 17:29 study_before_workshop.ipynb\n-rw-r--r-- 1 runner docker 13029 Mar 28 17:29 study_before_workshop.qmd\n-rw-r--r-- 1 runner docker 58063 Mar 28 17:29 workshop.html\n-rw-r--r-- 1 runner docker 8550 Mar 28 17:29 workshop.qmd\n-rw-r--r-- 1 runner docker 8564 Mar 28 17:32 workshop.rmarkdown\ndrwxr-xr-x 3 runner docker 4096 Mar 28 17:29 workshop_files\n\n\nYou can use more than one option at once. The -h option stands for “human readable” and makes the file sizes easier to understand for humans:\n\nls -hl\n\ntotal 128K\ndrwxr-xr-x 2 runner docker 4.0K Mar 28 17:29 data\ndrwxr-xr-x 2 runner docker 4.0K Mar 28 17:29 images\n-rw-r--r-- 1 runner docker 1.6K Mar 28 17:29 overview.qmd\n-rw-r--r-- 1 runner docker 184 Mar 28 17:29 study_after_workshop.qmd\n-rw-r--r-- 1 runner docker 4.7K Mar 28 17:29 study_before_workshop.ipynb\n-rw-r--r-- 1 runner docker 13K Mar 28 17:29 study_before_workshop.qmd\n-rw-r--r-- 1 runner docker 57K Mar 28 17:29 workshop.html\n-rw-r--r-- 1 runner docker 8.4K Mar 28 17:29 workshop.qmd\n-rw-r--r-- 1 runner docker 8.4K Mar 28 17:32 workshop.rmarkdown\ndrwxr-xr-x 3 runner docker 4.0K Mar 28 17:29 workshop_files\n\n\nThe -a option stands for “all” and shows us all the files, including hidden files.\n\nls -alh\n\ntotal 136K\ndrwxr-xr-x 5 runner docker 4.0K Mar 28 17:32 .\ndrwxr-xr-x 6 runner docker 4.0K Mar 28 17:29 ..\ndrwxr-xr-x 2 runner docker 4.0K Mar 28 17:29 data\ndrwxr-xr-x 2 runner docker 4.0K Mar 28 17:29 images\n-rw-r--r-- 1 runner docker 1.6K Mar 28 17:29 overview.qmd\n-rw-r--r-- 1 runner docker 184 Mar 28 17:29 
study_after_workshop.qmd\n-rw-r--r-- 1 runner docker 4.7K Mar 28 17:29 study_before_workshop.ipynb\n-rw-r--r-- 1 runner docker 13K Mar 28 17:29 study_before_workshop.qmd\n-rw-r--r-- 1 runner docker 57K Mar 28 17:29 workshop.html\n-rw-r--r-- 1 runner docker 8.4K Mar 28 17:29 workshop.qmd\n-rw-r--r-- 1 runner docker 8.4K Mar 28 17:32 workshop.rmarkdown\ndrwxr-xr-x 3 runner docker 4.0K Mar 28 17:29 workshop_files\n\n\nYou can move about with the cd command, which stands for “change directory”. You can use it to move into a directory by specifying the path to the directory:\n\ncd data\npwd\ncd ..\npwd\ncd data\npwd\n\n/home/runner/work/BIO00088H-data/BIO00088H-data/core/week-2/data\n/home/runner/work/BIO00088H-data/BIO00088H-data/core/week-2\n/home/runner/work/BIO00088H-data/BIO00088H-data/core/week-2/data\n\n\nhead 1cq2.pdb\nHEADER OXYGEN STORAGE/TRANSPORT 04-AUG-99 1CQ2 \nTITLE NEUTRON STRUCTURE OF FULLY DEUTERATED SPERM WHALE MYOGLOBIN AT 2.0 \nTITLE 2 ANGSTROM \nCOMPND MOL_ID: 1; \nCOMPND 2 MOLECULE: MYOGLOBIN; \nCOMPND 3 CHAIN: A; \nCOMPND 4 ENGINEERED: YES; \nCOMPND 5 OTHER_DETAILS: PROTEIN IS FULLY DEUTERATED \nSOURCE MOL_ID: 1; \nSOURCE 2 ORGANISM_SCIENTIFIC: PHYSETER CATODON; \nhead -20 data/1cq2.pdb\nHEADER OXYGEN STORAGE/TRANSPORT 04-AUG-99 1CQ2 \nTITLE NEUTRON STRUCTURE OF FULLY DEUTERATED SPERM WHALE MYOGLOBIN AT 2.0 \nTITLE 2 ANGSTROM \nCOMPND MOL_ID: 1; \nCOMPND 2 MOLECULE: MYOGLOBIN; \nCOMPND 3 CHAIN: A; \nCOMPND 4 ENGINEERED: YES; \nCOMPND 5 OTHER_DETAILS: PROTEIN IS FULLY DEUTERATED \nSOURCE MOL_ID: 1; \nSOURCE 2 ORGANISM_SCIENTIFIC: PHYSETER CATODON; \nSOURCE 3 ORGANISM_COMMON: SPERM WHALE; \nSOURCE 4 ORGANISM_TAXID: 9755; \nSOURCE 5 EXPRESSION_SYSTEM: ESCHERICHIA COLI; \nSOURCE 6 EXPRESSION_SYSTEM_TAXID: 562; \nSOURCE 7 EXPRESSION_SYSTEM_VECTOR_TYPE: PLASMID; \nSOURCE 8 EXPRESSION_SYSTEM_PLASMID: PET15A \nKEYWDS HELICAL, GLOBULAR, ALL-HYDROGEN CONTAINING STRUCTURE, OXYGEN STORAGE- \nKEYWDS 2 TRANSPORT COMPLEX \nEXPDTA NEUTRON DIFFRACTION \nAUTHOR 
F.SHU,V.RAMAKRISHNAN,B.P.SCHOENBORN \nless 1cq2.pdb\nless is a program that displays the contents of a file, one page at a time. It is useful for viewing large files because it does not load the whole file into memory before displaying it. Instead, it reads and displays a few lines at a time. You can navigate forward through the file with the spacebar, and backwards with the b key. Press q to quit.\nA wildcard is a character that can be used as a substitute for any of a class of characters in a search, The most common wildcard characters are the asterisk (*) and the question mark (?).\nls *.csv\ncp stands for “copy”. You can copy a file from one directory to another by giving cp the path to the file you want to copy and the path to the destination directory.\ncp 1cq2.pdb copy_of_1cq2.pdb\ncp 1cq2.pdb ../copy_of_1cq2.pdb\ncp 1cq2.pdb ../bob.txt\nTo delete a file use the rm command, which stands for “remove”.\nrm ../bob.txt\nbut be careful because the file will be gone forever. There is no “are you sure?” or undo.\nTo move a file from one directory to another, use the mv command. mv works like cp except that it also deletes the original file.\nmv ../copy_of_1cq2.pdb .\nMake a directory\nmkdir mynewdir", + "text": "RStudio terminal\nThe RStudio terminal is a convenient interface to the shell without leaving RStudio. It is useful for running commands that are not available in R. 
For example, you can use it to run other programs like fastqc, git, ftp, ssh\nNavigating your file system\nSeveral commands are frequently used to create, inspect, rename, and delete files and directories.\n$\nThe dollar sign is the prompt (like > on the R console), which shows us that the shell is waiting for input.\nYou can find out where you are using the pwd command, which stands for “print working directory”.\n\npwd\n\n/home/runner/work/BIO00088H-data/BIO00088H-data/core/week-2\n\n\nYou can find out what you can see with ls which stands for “list”.\n\nls\n\ndata\nimages\noverview.qmd\nstudy_after_workshop.qmd\nstudy_before_workshop.ipynb\nstudy_before_workshop.qmd\nworkshop.html\nworkshop.qmd\nworkshop.rmarkdown\nworkshop_files\n\n\nYou might have noticed that unlike R, the commands do not have brackets after them. Instead, options (or switches) are given after the command. For example, we can modify the ls command to give us more information with the -l option, which stands for “long”.\n\nls -l\n\ntotal 128\ndrwxr-xr-x 2 runner docker 4096 Mar 29 15:16 data\ndrwxr-xr-x 2 runner docker 4096 Mar 29 15:16 images\n-rw-r--r-- 1 runner docker 1597 Mar 29 15:16 overview.qmd\n-rw-r--r-- 1 runner docker 184 Mar 29 15:16 study_after_workshop.qmd\n-rw-r--r-- 1 runner docker 4807 Mar 29 15:16 study_before_workshop.ipynb\n-rw-r--r-- 1 runner docker 13029 Mar 29 15:16 study_before_workshop.qmd\n-rw-r--r-- 1 runner docker 58063 Mar 29 15:16 workshop.html\n-rw-r--r-- 1 runner docker 8550 Mar 29 15:16 workshop.qmd\n-rw-r--r-- 1 runner docker 8564 Mar 29 15:18 workshop.rmarkdown\ndrwxr-xr-x 3 runner docker 4096 Mar 29 15:16 workshop_files\n\n\nYou can use more than one option at once. 
The -h option stands for “human readable” and makes the file sizes easier to understand for humans:\n\nls -hl\n\ntotal 128K\ndrwxr-xr-x 2 runner docker 4.0K Mar 29 15:16 data\ndrwxr-xr-x 2 runner docker 4.0K Mar 29 15:16 images\n-rw-r--r-- 1 runner docker 1.6K Mar 29 15:16 overview.qmd\n-rw-r--r-- 1 runner docker 184 Mar 29 15:16 study_after_workshop.qmd\n-rw-r--r-- 1 runner docker 4.7K Mar 29 15:16 study_before_workshop.ipynb\n-rw-r--r-- 1 runner docker 13K Mar 29 15:16 study_before_workshop.qmd\n-rw-r--r-- 1 runner docker 57K Mar 29 15:16 workshop.html\n-rw-r--r-- 1 runner docker 8.4K Mar 29 15:16 workshop.qmd\n-rw-r--r-- 1 runner docker 8.4K Mar 29 15:18 workshop.rmarkdown\ndrwxr-xr-x 3 runner docker 4.0K Mar 29 15:16 workshop_files\n\n\nThe -a option stands for “all” and shows us all the files, including hidden files.\n\nls -alh\n\ntotal 136K\ndrwxr-xr-x 5 runner docker 4.0K Mar 29 15:18 .\ndrwxr-xr-x 6 runner docker 4.0K Mar 29 15:16 ..\ndrwxr-xr-x 2 runner docker 4.0K Mar 29 15:16 data\ndrwxr-xr-x 2 runner docker 4.0K Mar 29 15:16 images\n-rw-r--r-- 1 runner docker 1.6K Mar 29 15:16 overview.qmd\n-rw-r--r-- 1 runner docker 184 Mar 29 15:16 study_after_workshop.qmd\n-rw-r--r-- 1 runner docker 4.7K Mar 29 15:16 study_before_workshop.ipynb\n-rw-r--r-- 1 runner docker 13K Mar 29 15:16 study_before_workshop.qmd\n-rw-r--r-- 1 runner docker 57K Mar 29 15:16 workshop.html\n-rw-r--r-- 1 runner docker 8.4K Mar 29 15:16 workshop.qmd\n-rw-r--r-- 1 runner docker 8.4K Mar 29 15:18 workshop.rmarkdown\ndrwxr-xr-x 3 runner docker 4.0K Mar 29 15:16 workshop_files\n\n\nYou can move about with the cd command, which stands for “change directory”. 
You can use it to move into a directory by specifying the path to the directory:\n\ncd data\npwd\ncd ..\npwd\ncd data\npwd\n\n/home/runner/work/BIO00088H-data/BIO00088H-data/core/week-2/data\n/home/runner/work/BIO00088H-data/BIO00088H-data/core/week-2\n/home/runner/work/BIO00088H-data/BIO00088H-data/core/week-2/data\n\n\nhead 1cq2.pdb\nHEADER OXYGEN STORAGE/TRANSPORT 04-AUG-99 1CQ2 \nTITLE NEUTRON STRUCTURE OF FULLY DEUTERATED SPERM WHALE MYOGLOBIN AT 2.0 \nTITLE 2 ANGSTROM \nCOMPND MOL_ID: 1; \nCOMPND 2 MOLECULE: MYOGLOBIN; \nCOMPND 3 CHAIN: A; \nCOMPND 4 ENGINEERED: YES; \nCOMPND 5 OTHER_DETAILS: PROTEIN IS FULLY DEUTERATED \nSOURCE MOL_ID: 1; \nSOURCE 2 ORGANISM_SCIENTIFIC: PHYSETER CATODON; \nhead -20 data/1cq2.pdb\nHEADER OXYGEN STORAGE/TRANSPORT 04-AUG-99 1CQ2 \nTITLE NEUTRON STRUCTURE OF FULLY DEUTERATED SPERM WHALE MYOGLOBIN AT 2.0 \nTITLE 2 ANGSTROM \nCOMPND MOL_ID: 1; \nCOMPND 2 MOLECULE: MYOGLOBIN; \nCOMPND 3 CHAIN: A; \nCOMPND 4 ENGINEERED: YES; \nCOMPND 5 OTHER_DETAILS: PROTEIN IS FULLY DEUTERATED \nSOURCE MOL_ID: 1; \nSOURCE 2 ORGANISM_SCIENTIFIC: PHYSETER CATODON; \nSOURCE 3 ORGANISM_COMMON: SPERM WHALE; \nSOURCE 4 ORGANISM_TAXID: 9755; \nSOURCE 5 EXPRESSION_SYSTEM: ESCHERICHIA COLI; \nSOURCE 6 EXPRESSION_SYSTEM_TAXID: 562; \nSOURCE 7 EXPRESSION_SYSTEM_VECTOR_TYPE: PLASMID; \nSOURCE 8 EXPRESSION_SYSTEM_PLASMID: PET15A \nKEYWDS HELICAL, GLOBULAR, ALL-HYDROGEN CONTAINING STRUCTURE, OXYGEN STORAGE- \nKEYWDS 2 TRANSPORT COMPLEX \nEXPDTA NEUTRON DIFFRACTION \nAUTHOR F.SHU,V.RAMAKRISHNAN,B.P.SCHOENBORN \nless 1cq2.pdb\nless is a program that displays the contents of a file, one page at a time. It is useful for viewing large files because it does not load the whole file into memory before displaying it. Instead, it reads and displays a few lines at a time. You can navigate forward through the file with the spacebar, and backwards with the b key. 
Press q to quit.\nA wildcard is a character that can be used as a substitute for any of a class of characters in a search. The most common wildcard characters are the asterisk (*) and the question mark (?).\nls *.csv\ncp stands for “copy”. You can copy a file from one directory to another by giving cp the path to the file you want to copy and the path to the destination directory.\ncp 1cq2.pdb copy_of_1cq2.pdb\ncp 1cq2.pdb ../copy_of_1cq2.pdb\ncp 1cq2.pdb ../bob.txt\nTo delete a file, use the rm command, which stands for “remove”.\nrm ../bob.txt\nbut be careful because the file will be gone forever. There is no “are you sure?” or undo.\nTo move a file from one directory to another, use the mv command. mv works like cp except that it also deletes the original file.\nmv ../copy_of_1cq2.pdb .\nMake a directory\nmkdir mynewdir", "crumbs": [ "Core", "Week 2: Workflow tips", @@ -2468,7 +2468,7 @@ "href": "omics/kelly/workshop.html", "title": "Workflow for VFA analysis", "section": "", - "text": "I have some data and information from Kelly. I have interpreted it and written some code to do the calculations.\nHowever, Kelly hasn’t had a chance to look at it yet so I am providing the exact information and data he supplied along with my suggested workflow based on my interpretation of the data and info.\n\nThe file is a CSV file, with some notes on top and the data in the following order, post notes and headers. Please note that all chemical data is in millimolar. There are 62 rows of actual data.\nSample Name – Replicate, Time (days), Acetate, Propanoate, Isobutyrate, Butyrate, Isopentanoate, Pentanoate, Isohexanoate, Hexanoate\nThe students should be able to transform the data from mM to mg/L, and to g/L. To do this they only need to multiply the molecular weight of the compound (listed in the notes in the file) by the concentration in mM to get mg/L. Obviously to get g/L they will just divide by 1000. 
They should be able to graph the VFA concentrations with time.\nThey should also be able to do a simple flux measurement, which is the change in VFA concentration over a period of time, divided by weight or volume of material. In this case it might be equal to == Delta(Acetate at 3 days - Acetate at 1 day)/Delta (3days - 1day)/50 mls sludge. This would provide a final flux with the units of mg acetate per ml sludge per day. Let me know if this isn’t clear.\nPerhaps more importantly they should be able to graph and extract the reaction rate, assuming a first order chemical/biological reaction and an exponential falloff rate. I found this as a starting point (https://martinlab.chem.umass.edu/r-fitting-data/) , but I assume Emma has something much more effective already in the pipeline.\n\nI created these two data files from the original.\n\n8 VFA in mM for 60 samples vfa.csv. There were 63 rows of data in the original file. There were no time 0 for one treatment and all values were zero for the other treatment so I removed those.\n\nTwo treatments: straw (CN10) and water (NC)\n10 time points: 1, 3, 5, 9, 11, 13, 16, 18, 20, 22\nthree replicates per treatment per time point\n2 x 10 x 3 = 60 groups\n8 VFA with concentration in mM (millimolar): acetate, propanoate, isobutyrate, butyrate, isopentanoate, pentanoate, isohexanoate, hexanoate\n\n\nMolecular weights for each VFA in grams per mole mol_wt.txt VFAs from AD vials\n\nWe need to:\n\nCalculate Change in VFA g/l with time\nRecalculate the data into grams per litre - convert to molar: 1 millimolar to molar = 0.001 molar - multiply by the molecular weight of each VFA\nCalculate the percent representation of each VFA, by mM and by weight\nCalculate the flux (change in VFA concentration over a period of time, divided by weight or volume of material) of each VFA, by mM and by weight\nGraph and extract the reaction rate, assuming a first order chemical/biological reaction and an exponential falloff rate\n\n🎬 Start RStudio 
from the Start menu\n🎬 Make an RStudio project. Be deliberate about where you create it so that it is a good place for you\n🎬 Use the Files pane to make new folders for the data. I suggest data-raw and data-processed\n🎬 Make a new script called analysis.R to carry out the rest of the work.\n🎬 Load tidyverse (Wickham et al. 2019) for importing, summarising, plotting and filtering.\n\nlibrary(tidyverse)\n\n\n🎬 Save the files to data-raw. Open them and examine them. You may want to use Excel for the csv file.\n🎬 Answer the following questions:\n\nWhat is in the rows and columns of each file?\nHow many rows and columns are there in each file?\nHow are the data organised ?\n\n🎬 Import\n\nvfa_cummul <- read_csv(\"data-raw/vfa.csv\") |> janitor::clean_names()\n\n🎬 Split treatment and replicate to separate columns so there is a treatment column:\n\nvfa_cummul <- vfa_cummul |> \n separate(col = sample_replicate, \n into = c(\"treatment\", \"replicate\"), \n sep = \"-\",\n remove = FALSE)\n\nThe provided data is cumulative/absolute. We need to calculate the change in VFA with time. There is a function, lag() that will help us do this. It will take the previous value and subtract it from the current value. We need to do that separately for each sample_replicate so we need to group by sample_replicate first. 
We also need to make sure the data is in the right order so we will arrange by sample_replicate and time_day.\n\n🎬 Create dataframe for the change in VFA\n\nvfa_delta <- vfa_cummul |> \n group_by(sample_replicate) |> \n arrange(sample_replicate, time_day) |>\n mutate(acetate = acetate - lag(acetate),\n propanoate = propanoate - lag(propanoate),\n isobutyrate = isobutyrate - lag(isobutyrate),\n butyrate = butyrate - lag(butyrate),\n isopentanoate = isopentanoate - lag(isopentanoate),\n pentanoate = pentanoate - lag(pentanoate),\n isohexanoate = isohexanoate - lag(isohexanoate),\n hexanoate = hexanoate - lag(hexanoate))\n\nNow we have two dataframes, one for the cumulative data and one for the change in VFA.\n\nTo make conversions from mM to g/l we need to do mM * 0.001 * MW. We will import the molecular weight data, pivot the VFA data to long format and join the molecular weight data to the VFA data. Then we can calculate the g/l. We will do this for both the cumulative and delta dataframes.\n🎬 import molecular weight data\n\nmol_wt <- read_table(\"data-raw/mol_wt.txt\") |>\n mutate(vfa = tolower(vfa))\n\n🎬 Pivot the cumulative data to long format:\n\nvfa_cummul <- vfa_cummul |> \n pivot_longer(cols = -c(sample_replicate,\n treatment, \n replicate,\n time_day),\n values_to = \"conc_mM\",\n names_to = \"vfa\") \n\nView vfa_cummul to check you understand what you have done.\n🎬 Join molecular weight to data and calculate g/l (mutate to convert to g/l * 0.001 * MW):\n\nvfa_cummul <- vfa_cummul |> \n left_join(mol_wt, by = \"vfa\") |>\n mutate(conc_g_l = conc_mM * 0.001 * mw)\n\nView vfa_cummul to check you understand what you have done.\nRepeat for the delta data.\n🎬 Pivot the change data, delta_vfa to long format:\n\nvfa_delta <- vfa_delta |> \n pivot_longer(cols = -c(sample_replicate,\n treatment, \n replicate,\n time_day),\n values_to = \"conc_mM\",\n names_to = \"vfa\") \n\nView vfa_delta to check it looks like vfa_cummul\n🎬 Join molecular weight to data and 
calculate g/l (mutate to convert to g/l * 0.001 * MW):\n\nby mM and by weight\n🎬 Add a column which is the percent representation of each VFA for mM and g/l:\n\nvfa_cummul <- vfa_cummul |> \n group_by(sample_replicate, time_day) |> \n mutate(percent_conc_g_l = conc_g_l / sum(conc_g_l) * 100,\n percent_conc_mM = conc_mM / sum(conc_mM) * 100)\n\n\n🎬 Make summary data for graphing\n\nvfa_cummul_summary <- vfa_cummul |> \n group_by(treatment, time_day, vfa) |> \n summarise(mean_g_l = mean(conc_g_l),\n se_g_l = sd(conc_g_l)/sqrt(length(conc_g_l)),\n mean_mM = mean(conc_mM),\n se_mM = sd(conc_mM)/sqrt(length(conc_mM))) |> \n ungroup()\n\n\nvfa_delta_summary <- vfa_delta |> \n group_by(treatment, time_day, vfa) |> \n summarise(mean_g_l = mean(conc_g_l),\n se_g_l = sd(conc_g_l)/sqrt(length(conc_g_l)),\n mean_mM = mean(conc_mM),\n se_mM = sd(conc_mM)/sqrt(length(conc_mM))) |> \n ungroup()\n\n🎬 Graph the cumulative data, grams per litre:\n\nvfa_cummul_summary |> \n ggplot(aes(x = time_day, colour = vfa)) +\n geom_line(aes(y = mean_g_l), \n linewidth = 1) +\n geom_errorbar(aes(ymin = mean_g_l - se_g_l,\n ymax = mean_g_l + se_g_l),\n width = 0.5, \n show.legend = F,\n linewidth = 1) +\n scale_color_viridis_d(name = NULL) +\n scale_x_continuous(name = \"Time (days)\") +\n scale_y_continuous(name = \"Mean VFA concentration (g/l)\") +\n theme_bw() +\n facet_wrap(~treatment) +\n theme(strip.background = element_blank())\n\n\n\n\n\n\n\n🎬 Graph the change data, grams per litre:\n\nvfa_delta_summary |> \n ggplot(aes(x = time_day, colour = vfa)) +\n geom_line(aes(y = mean_g_l), \n linewidth = 1) +\n geom_errorbar(aes(ymin = mean_g_l - se_g_l,\n ymax = mean_g_l + se_g_l),\n width = 0.5, \n show.legend = F,\n linewidth = 1) +\n scale_color_viridis_d(name = NULL) +\n scale_x_continuous(name = \"Time (days)\") +\n scale_y_continuous(name = \"Mean change in VFA concentration (g/l)\") +\n theme_bw() +\n facet_wrap(~treatment) +\n theme(strip.background = element_blank())\n\n\n\n\n\n\n\n🎬 
Graph the mean percent representation of each VFA g/l. Note geom_col() will plot proportion if we setposition = \"fill\"\n\nvfa_cummul_summary |> \n ggplot(aes(x = time_day, y = mean_g_l, fill = vfa)) +\n geom_col(position = \"fill\") +\n scale_fill_viridis_d(name = NULL) +\n scale_x_continuous(name = \"Time (days)\") +\n scale_y_continuous(name = \"Mean Proportion VFA\") +\n theme_bw() +\n facet_wrap(~treatment) +\n theme(strip.background = element_blank())\n\n\n\n\n\n\n\n\nWe have 8 VFA in our dataset. PCA will allow us to plot our samples in the “VFA” space so we can see if treatments, time or replicate cluster.\nHowever, PCA expects a matrix with samples in rows and VFA, the variables, in columns. We will need to select the columns we need and pivot wider. Then convert to a matrix.\n🎬\n\nvfa_cummul_pca <- vfa_cummul |> \n select(sample_replicate, \n treatment, \n replicate, \n time_day, \n vfa, \n conc_g_l) |> \n pivot_wider(names_from = vfa, \n values_from = conc_g_l)\n\n\nmat <- vfa_cummul_pca |> \n ungroup() |>\n select(-sample_replicate, \n -treatment, \n -replicate, \n -time_day) |> \n as.matrix()\n\n🎬 Perform PCA on the matrix:\n\npca <- mat |>\n prcomp(scale. = TRUE, \n rank. = 4) \n\nThe scale. argument tells prcomp() to scale the data to have a mean of 0 and a standard deviation of 1. The rank. argument tells prcomp() to only calculate the first 4 principal components. This is useful for visualisation as we can only plot in 2 or 3 dimensions. We can see the results of the PCA by viewing the summary() of the pca object.\n\nsummary(pca)\n\nImportance of first k=4 (out of 8) components:\n PC1 PC2 PC3 PC4\nStandard deviation 2.4977 0.9026 0.77959 0.45567\nProportion of Variance 0.7798 0.1018 0.07597 0.02595\nCumulative Proportion 0.7798 0.8816 0.95760 0.98355\n\n\nThe Proportion of Variance tells us how much of the variance is explained by each component. 
We can see that the first component explains 0.7798 of the variance, the second 0.1018, and the third 0.07597. Together the first three components explain nearly 96% of the total variance in the data. Plotting PC1 against PC2 will capture about 78% of the variance which is likely much better than we would get plotting any two VFA against each other. To plot the PC1 against PC2 we will need to extract the PC1 and PC2 score from the pca object and add labels for the samples.\n🎬 Create a dataframe of the PC1 and PC2 scores which are in pca$x and add the sample information from vfa_cummul_pca:\n\npca_labelled <- data.frame(pca$x,\n sample_replicate = vfa_cummul_pca$sample_replicate,\n treatment = vfa_cummul_pca$treatment,\n replicate = vfa_cummul_pca$replicate,\n time_day = vfa_cummul_pca$time_day) \n\nThe dataframe should look like this:\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nPC1\nPC2\nPC3\nPC4\nsample_replicate\ntreatment\nreplicate\ntime_day\n\n\n\n-2.9592362\n0.6710553\n0.0068846\n-0.4453904\nCN10-1\nCN10\n1\n1\n\n\n-2.7153060\n0.7338367\n-0.2856872\n-0.2030110\nCN10-2\nCN10\n2\n1\n\n\n-2.7423102\n0.8246832\n-0.4964249\n-0.1434490\nCN10-3\nCN10\n3\n1\n\n\n-1.1909064\n-1.0360724\n1.1249513\n-0.7360599\nCN10-1\nCN10\n1\n3\n\n\n-1.3831563\n0.9572091\n-1.5561657\n0.0582755\nCN10-2\nCN10\n2\n3\n\n\n-1.1628940\n-0.0865412\n-0.6046780\n-0.1976743\nCN10-3\nCN10\n3\n3\n\n\n-0.2769661\n-0.2221055\n1.1579897\n-0.6079395\nCN10-1\nCN10\n1\n5\n\n\n0.3480962\n0.3612522\n0.5841649\n-0.0612366\nCN10-2\nCN10\n2\n5\n\n\n-0.7281116\n1.6179706\n-0.6430170\n0.0660727\nCN10-3\nCN10\n3\n5\n\n\n0.9333578\n-0.1339061\n1.0870945\n-0.4374103\nCN10-1\nCN10\n1\n9\n\n\n2.0277528\n0.6993342\n0.3850147\n0.0723540\nCN10-2\nCN10\n2\n9\n\n\n1.9931908\n0.5127260\n0.6605782\n0.1841974\nCN10-3\nCN10\n3\n9\n\n\n1.8365692\n-0.4189762\n0.7029015\n-0.3873133\nCN10-1\nCN10\n1\n11\n\n\n2.3313978\n0.3274834\n-0.0135608\n0.0264372\nCN10-2\nCN10\n2\n11\n\n\n1.5833035\n0.9263509\n-0.1909483\n0.1358320\nCN10-3\nCN10\n3\n11\n
\n\n2.8498246\n0.3815854\n-0.4763500\n-0.0280281\nCN10-1\nCN10\n1\n13\n\n\n3.5652461\n-0.0836709\n-0.5948483\n-0.1612809\nCN10-2\nCN10\n2\n13\n\n\n4.1314944\n-1.2254642\n0.2699666\n-0.3152100\nCN10-3\nCN10\n3\n13\n\n\n3.7338024\n-0.6744610\n0.4344639\n-0.3736234\nCN10-1\nCN10\n1\n16\n\n\n3.6748427\n0.5202498\n-0.4333685\n-0.1607235\nCN10-2\nCN10\n2\n16\n\n\n3.9057053\n0.3599520\n-0.3049074\n0.0540037\nCN10-3\nCN10\n3\n16\n\n\n3.4561583\n-0.0996639\n0.4472090\n-0.0185889\nCN10-1\nCN10\n1\n18\n\n\n3.6354729\n0.3809673\n-0.0934957\n0.0018722\nCN10-2\nCN10\n2\n18\n\n\n2.9872250\n0.7890400\n-0.2361098\n-0.1628506\nCN10-3\nCN10\n3\n18\n\n\n3.3562231\n-0.2866224\n0.1331068\n-0.2056366\nCN10-1\nCN10\n1\n20\n\n\n3.2009943\n0.4795967\n-0.2092384\n-0.5962183\nCN10-2\nCN10\n2\n20\n\n\n3.9948127\n0.7772640\n-0.3181372\n0.1218382\nCN10-3\nCN10\n3\n20\n\n\n2.8874207\n0.4554681\n0.3106044\n-0.2220240\nCN10-1\nCN10\n1\n22\n\n\n3.6868864\n0.9681097\n-0.2174166\n-0.2246775\nCN10-2\nCN10\n2\n22\n\n\n4.8689622\n0.5218563\n-0.2906042\n0.3532981\nCN10-3\nCN10\n3\n22\n\n\n-3.8483418\n1.5205541\n-0.8809715\n-0.5306228\nNC-1\nNC\n1\n1\n\n\n-3.7653460\n1.5598499\n-1.0570798\n-0.4075397\nNC-2\nNC\n2\n1\n\n\n-3.8586309\n1.6044929\n-1.0936576\n-0.4292404\nNC-3\nNC\n3\n1\n\n\n-2.6934553\n-0.9198406\n0.7439841\n-0.9881115\nNC-1\nNC\n1\n3\n\n\n-2.5064076\n-1.0856761\n0.6334250\n-0.8999028\nNC-2\nNC\n2\n3\n\n\n-2.4097945\n-1.2731546\n1.1767665\n-0.8715948\nNC-3\nNC\n3\n3\n\n\n-3.0567309\n0.5804906\n-0.1391344\n-0.3701763\nNC-1\nNC\n1\n5\n\n\n-2.3511737\n-0.3692016\n0.7053757\n-0.3284113\nNC-2\nNC\n2\n5\n\n\n-2.6752311\n-0.0637855\n0.4692194\n-0.3841240\nNC-3\nNC\n3\n5\n\n\n-1.2335368\n-0.6717374\n0.2155285\n0.1060486\nNC-1\nNC\n1\n9\n\n\n-1.6550689\n0.1576557\n0.0687658\n0.2750388\nNC-2\nNC\n2\n9\n\n\n-0.8948103\n-0.8171884\n0.8062876\n0.5032756\nNC-3\nNC\n3\n9\n\n\n-1.2512737\n-0.4720993\n0.4071788\n0.4693106\nNC-1\nNC\n1\n11\n\n\n-1.8091407\n0.0552546\n0.0424090\n0.3918222\nNC-2\nNC\n2\n11\n\n\n-2
.4225566\n0.4998948\n-0.1987773\n0.1959282\nNC-3\nNC\n3\n11\n\n\n-0.9193427\n-0.7741826\n0.0918984\n0.5089847\nNC-1\nNC\n1\n13\n\n\n-0.8800183\n-0.7850404\n0.0895146\n0.6050052\nNC-2\nNC\n2\n13\n\n\n-1.3075763\n-0.2525829\n-0.2993318\n0.5874269\nNC-3\nNC\n3\n13\n\n\n-0.9543813\n-0.3170305\n0.0885062\n0.7153071\nNC-1\nNC\n1\n16\n\n\n-0.4303679\n-0.9952374\n0.2038883\n0.8214647\nNC-2\nNC\n2\n16\n\n\n-0.9457300\n-0.7180646\n0.3081282\n0.6563748\nNC-3\nNC\n3\n16\n\n\n-1.3830063\n0.0614677\n-0.2805342\n0.5462137\nNC-1\nNC\n1\n18\n\n\n-0.7960522\n-0.5792768\n-0.0369684\n0.6621526\nNC-2\nNC\n2\n18\n\n\n-1.6822927\n0.1041656\n0.0634251\n0.4337240\nNC-3\nNC\n3\n18\n\n\n-1.3157478\n-0.0835664\n-0.1246253\n0.5599467\nNC-1\nNC\n1\n20\n\n\n-1.7425068\n0.3029227\n-0.0161466\n0.5134360\nNC-2\nNC\n2\n20\n\n\n-1.3970678\n-0.2923056\n0.4324586\n0.4765460\nNC-3\nNC\n3\n20\n\n\n-1.0777451\n-0.1232925\n0.2388682\n0.7585307\nNC-1\nNC\n1\n22\n\n\n0.4851039\n-4.1291445\n-4.0625050\n-0.4582436\nNC-2\nNC\n2\n22\n\n\n-1.0516226\n-0.7228479\n1.0641320\n0.4955951\nNC-3\nNC\n3\n22\n\n\n\n\n\n🎬 Plot PC1 against PC2 and colour by time and shape by treatment:\n\npca_labelled |> \n ggplot(aes(x = PC1, y = PC2, \n colour = factor(time_day),\n shape = treatment)) +\n geom_point(size = 3) +\n scale_colour_viridis_d(end = 0.95, begin = 0.15,\n name = \"Time\") +\n scale_shape_manual(values = c(17, 19),\n name = NULL) +\n theme_classic()\n\n\n\n\n\n\n\n🎬 Plot PC1 against PC2 and colour by time and facet treatment:\n\npca_labelled |> \n ggplot(aes(x = PC1, y = PC2, colour = factor(time_day))) +\n geom_point(size = 3) +\n scale_colour_viridis_d(end = 0.95, begin = 0.15,\n name = \"Time\") +\n facet_wrap(~treatment, ncol = 1) +\n theme_classic()\n\n\n\n\n\n\n\nreplicates are similar at the same time and treatment especially early as we might expect. PC is essentially an axis of time.\n\nWe are going to create an interactive heatmap with the heatmaply (Galili et al. 2017) package. 
heatmaply takes a matrix as input so we can use mat\n🎬 Set the rownames to the sample id whihcih is combination of sample_replicate and time_day:\n\nrownames(mat) <- interaction(vfa_cummul_pca$sample_replicate, \n vfa_cummul_pca$time_day)\n\nYou might want to view the matrix by clicking on it in the environment pane.\n🎬 Load the heatmaply package:\n\nlibrary(heatmaply)\n\nWe need to tell the clustering algorithm how many clusters to create. We will set the number of clusters for the treatments to be 2 and the number of clusters for the vfa to be the same since it makes sense to see what clusters of genes correlate with the treatments.\n🎬 Set the number of clusters for the treatments and vfa:\n\nn_treatment_clusters <- 2\nn_vfa_clusters <- 2\n\n🎬 Create the heatmap:\n\nheatmaply(mat, \n scale = \"column\",\n k_col = n_vfa_clusters,\n k_row = n_treatment_clusters,\n fontsize_row = 7, fontsize_col = 10,\n labCol = colnames(mat),\n labRow = rownames(mat),\n heatmap_layers = theme(axis.line = element_blank()))\n\n\n\n\n\nThe heatmap will open in the viewer pane (rather than the plot pane) because it is html. You can “Show in a new window” to see it in a larger format. You can also zoom in and out and pan around the heatmap and download it as a png. You might feel the colour bars is not adding much to the plot. You can remove it by setting hide_colorbar = TRUE, in the heatmaply() function.\nOne of the NC replicates at time = 22 is very different from the other replicates. The CN10 treatments cluster together at high time points. CN10 samples are more similar to NC samples early on. Most of the VFAs behave similarly with highest values later in the experiment for CN10 but isohexanoate and hexanoate differ. 
The difference might be because isohexanoate is especially low in the NC replicates at time = 1 and hexanoate is especially high in the NC replicate 2 at time = 22\n\nCalculate the flux(change in VFA concentration over a period of time, divided by weight or volume of material) of each VFA, by mM and by weight.\nI’ve requested clarification: for the flux measurements, do they need graphs of the rate of change wrt time? And is the sludge volume going to be a constant for all samples or something they measure and varies by vial?\n\nGraph and extract the reaction rate assuming a first order chemical/biological reaction and an exponential falloff rate\nI’ve requested clarification: for the nonlinear least squares curve fitting, I assume x is time but I’m not clear what the Y variable is - concentration? or change in concentration? or rate of change of concentration?\nPages made with R (R Core Team 2023), Quarto (Allaire et al. 2022), knitr (Xie 2022), kableExtra (Zhu 2021)", + "text": "I have some data and information from Kelly. I have interpreted it and written some code to do the calculations.\nHowever, Kelly hasn’t had a chance to look at it yet so I am providing the exact information and data he supplied along with my suggested workflow based on my interpretation of the data and info.\n\nThe file is a CSV file, with some notes on top and the data in the following order, post notes and headers. Please note that all chemical data is in millimolar. There are 62 rows of actual data.\nSample Name – Replicate, Time (days), Acetate, Propanoate, Isobutyrate, Butyrate, Isopentanoate, Pentanoate, Isohexanoate, Hexanoate\nThe students should be able to transform the data from mM to mg/L, and to g/L. To do this they only need to multiply the molecular weight of the compound (listed in the notes in the file) by the concentration in mM to get mg/L. Obviously to get g/L they will just divide by 1000. 
They should be able to graph the VFA concentrations with time.\nThey should also be able to do a simple flux measurement, which is the change in VFA concentration over a period of time, divided by weight or volume of material. In this case it might be equal to == Delta(Acetate at 3 days - Acetate at 1 day)/Delta (3days - 1day)/50 mls sludge. This would provide a final flux with the units of mg acetate per ml sludge per day. Let me know if this isn’t clear.\nPerhaps more importantly they should be able to graph and extract the reaction rate, assuming a first order chemical/biological reaction and an exponential falloff rate. I found this as a starting point (https://martinlab.chem.umass.edu/r-fitting-data/) , but I assume Emma has something much more effective already in the pipeline.\n\nI created these two data files from the original.\n\n8 VFA in mM for 60 samples vfa.csv. There were 63 rows of data in the original file. There were no time 0 for one treatment and all values were zero for the other treatment so I removed those.\n\nTwo treatments: straw (CN10) and water (NC)\n10 time points: 1, 3, 5, 9, 11, 13, 16, 18, 20, 22\nthree replicates per treatment per time point\n2 x 10 x 3 = 60 groups\n8 VFA with concentration in mM (millimolar): acetate, propanoate, isobutyrate, butyrate, isopentanoate, pentanoate, isohexanoate, hexanoate\n\n\nMolecular weights for each VFA in grams per mole mol_wt.txt VFAs from AD vials\n\nWe need to:\n\nCalculate Change in VFA g/l with time\nRecalculate the data into grams per litre - convert to molar: 1 millimolar to molar = 0.001 molar - multiply by the molecular weight of each VFA\nCalculate the percent representation of each VFA, by mM and by weight\nCalculate the flux (change in VFA concentration over a period of time, divided by weight or volume of material) of each VFA, by mM and by weight\nGraph and extract the reaction rate, assuming a first order chemical/biological reaction and an exponential falloff rate\n\n🎬 Start RStudio 
from the Start menu\n🎬 Make an RStudio project. Be deliberate about where you create it so that it is a good place for you\n🎬 Use the Files pane to make new folders for the data. I suggest data-raw and data-processed\n🎬 Make a new script called analysis.R to carry out the rest of the work.\n🎬 Load tidyverse (Wickham et al. 2019) for importing, summarising, plotting and filtering.\n\nlibrary(tidyverse)\n\n\n🎬 Save the files to data-raw. Open them and examine them. You may want to use Excel for the csv file.\n🎬 Answer the following questions:\n\nWhat is in the rows and columns of each file?\nHow many rows and columns are there in each file?\nHow are the data organised ?\n\n🎬 Import\n\nvfa_cummul <- read_csv(\"data-raw/vfa.csv\") |> janitor::clean_names()\n\n🎬 Split treatment and replicate to separate columns so there is a treatment column:\n\nvfa_cummul <- vfa_cummul |> \n separate(col = sample_replicate, \n into = c(\"treatment\", \"replicate\"), \n sep = \"-\",\n remove = FALSE)\n\nThe provided data is cumulative/absolute. We need to calculate the change in VFA with time. There is a function, lag() that will help us do this. It will take the previous value and subtract it from the current value. We need to do that separately for each sample_replicate so we need to group by sample_replicate first. 
We also need to make sure the data is in the right order so we will arrange by sample_replicate and time_day.\n\n🎬 Create dataframe for the change in VFA\n\nvfa_delta <- vfa_cummul |> \n group_by(sample_replicate) |> \n arrange(sample_replicate, time_day) |>\n mutate(acetate = acetate - lag(acetate),\n propanoate = propanoate - lag(propanoate),\n isobutyrate = isobutyrate - lag(isobutyrate),\n butyrate = butyrate - lag(butyrate),\n isopentanoate = isopentanoate - lag(isopentanoate),\n pentanoate = pentanoate - lag(pentanoate),\n isohexanoate = isohexanoate - lag(isohexanoate),\n hexanoate = hexanoate - lag(hexanoate))\n\nNow we have two dataframes, one for the cumulative data and one for the change in VFA.\n\nTo make conversions from mM to g/l we need to do mM * 0.001 * MW. We will import the molecular weight data, pivot the VFA data to long format and join the molecular weight data to the VFA data. Then we can calculate the g/l. We will do this for both the cumulative and delta dataframes.\n🎬 import molecular weight data\n\nmol_wt <- read_table(\"data-raw/mol_wt.txt\") |>\n mutate(vfa = tolower(vfa))\n\n🎬 Pivot the cumulative data to long format:\n\nvfa_cummul <- vfa_cummul |> \n pivot_longer(cols = -c(sample_replicate,\n treatment, \n replicate,\n time_day),\n values_to = \"conc_mM\",\n names_to = \"vfa\") \n\nView vfa_cummul to check you understand what you have done.\n🎬 Join molecular weight to data and calculate g/l (mutate to convert to g/l * 0.001 * MW):\n\nvfa_cummul <- vfa_cummul |> \n left_join(mol_wt, by = \"vfa\") |>\n mutate(conc_g_l = conc_mM * 0.001 * mw)\n\nView vfa_cummul to check you understand what you have done.\nRepeat for the delta data.\n🎬 Pivot the change data, delta_vfa to long format:\n\nvfa_delta <- vfa_delta |> \n pivot_longer(cols = -c(sample_replicate,\n treatment, \n replicate,\n time_day),\n values_to = \"conc_mM\",\n names_to = \"vfa\") \n\nView vfa_delta to check it looks like vfa_cummul\n🎬 Join molecular weight to data and 
calculate g/l (mutate to convert to g/l * 0.001 * MW):\n\nvfa_delta <- vfa_delta |> \n left_join(mol_wt, by = \"vfa\") |>\n mutate(conc_g_l = conc_mM * 0.001 * mw)\n\n\nby mM and by weight\n🎬 Add a column which is the percent representation of each VFA for mM and g/l:\n\nvfa_cummul <- vfa_cummul |> \n group_by(sample_replicate, time_day) |> \n mutate(percent_conc_g_l = conc_g_l / sum(conc_g_l) * 100,\n percent_conc_mM = conc_mM / sum(conc_mM) * 100)\n\n\n🎬 Make summary data for graphing\n\nvfa_cummul_summary <- vfa_cummul |> \n group_by(treatment, time_day, vfa) |> \n summarise(mean_g_l = mean(conc_g_l),\n se_g_l = sd(conc_g_l)/sqrt(length(conc_g_l)),\n mean_mM = mean(conc_mM),\n se_mM = sd(conc_mM)/sqrt(length(conc_mM))) |> \n ungroup()\n\n\nvfa_delta_summary <- vfa_delta |> \n group_by(treatment, time_day, vfa) |> \n summarise(mean_g_l = mean(conc_g_l),\n se_g_l = sd(conc_g_l)/sqrt(length(conc_g_l)),\n mean_mM = mean(conc_mM),\n se_mM = sd(conc_mM)/sqrt(length(conc_mM))) |> \n ungroup()\n\n🎬 Graph the cumulative data, grams per litre:\n\nvfa_cummul_summary |> \n ggplot(aes(x = time_day, colour = vfa)) +\n geom_line(aes(y = mean_g_l), \n linewidth = 1) +\n geom_errorbar(aes(ymin = mean_g_l - se_g_l,\n ymax = mean_g_l + se_g_l),\n width = 0.5, \n show.legend = F,\n linewidth = 1) +\n scale_color_viridis_d(name = NULL) +\n scale_x_continuous(name = \"Time (days)\") +\n scale_y_continuous(name = \"Mean VFA concentration (g/l)\") +\n theme_bw() +\n facet_wrap(~treatment) +\n theme(strip.background = element_blank())\n\n\n\n\n\n\n\n🎬 Graph the change data, grams per litre:\n\nvfa_delta_summary |> \n ggplot(aes(x = time_day, colour = vfa)) +\n geom_line(aes(y = mean_g_l), \n linewidth = 1) +\n geom_errorbar(aes(ymin = mean_g_l - se_g_l,\n ymax = mean_g_l + se_g_l),\n width = 0.5, \n show.legend = F,\n linewidth = 1) +\n scale_color_viridis_d(name = NULL) +\n scale_x_continuous(name = \"Time (days)\") +\n scale_y_continuous(name = \"Mean change in VFA concentration 
(g/l)\") +\n theme_bw() +\n facet_wrap(~treatment) +\n theme(strip.background = element_blank())\n\n\n\n\n\n\n\n🎬 Graph the mean percent representation of each VFA g/l. Note geom_col() will plot proportion if we setposition = \"fill\"\n\nvfa_cummul_summary |> \n ggplot(aes(x = time_day, y = mean_g_l, fill = vfa)) +\n geom_col(position = \"fill\") +\n scale_fill_viridis_d(name = NULL) +\n scale_x_continuous(name = \"Time (days)\") +\n scale_y_continuous(name = \"Mean Proportion VFA\") +\n theme_bw() +\n facet_wrap(~treatment) +\n theme(strip.background = element_blank())\n\n\n\n\n\n\n\n\nWe have 8 VFA in our dataset. PCA will allow us to plot our samples in the “VFA” space so we can see if treatments, time or replicate cluster.\nHowever, PCA expects a matrix with samples in rows and VFA, the variables, in columns. We will need to select the columns we need and pivot wider. Then convert to a matrix.\n🎬\n\nvfa_cummul_pca <- vfa_cummul |> \n select(sample_replicate, \n treatment, \n replicate, \n time_day, \n vfa, \n conc_g_l) |> \n pivot_wider(names_from = vfa, \n values_from = conc_g_l)\n\n\nmat <- vfa_cummul_pca |> \n ungroup() |>\n select(-sample_replicate, \n -treatment, \n -replicate, \n -time_day) |> \n as.matrix()\n\n🎬 Perform PCA on the matrix:\n\npca <- mat |>\n prcomp(scale. = TRUE, \n rank. = 4) \n\nThe scale. argument tells prcomp() to scale the data to have a mean of 0 and a standard deviation of 1. The rank. argument tells prcomp() to only calculate the first 4 principal components. This is useful for visualisation as we can only plot in 2 or 3 dimensions. 
We can see the results of the PCA by viewing the summary() of the pca object.\n\nsummary(pca)\n\nImportance of first k=4 (out of 8) components:\n PC1 PC2 PC3 PC4\nStandard deviation 2.4977 0.9026 0.77959 0.45567\nProportion of Variance 0.7798 0.1018 0.07597 0.02595\nCumulative Proportion 0.7798 0.8816 0.95760 0.98355\n\n\nThe Proportion of Variance tells us how much of the variance is explained by each component. We can see that the first component explains 0.7798 of the variance, the second 0.1018, and the third 0.07597. Together the first three components explain nearly 96% of the total variance in the data. Plotting PC1 against PC2 will capture about 78% of the variance which is likely much better than we would get plotting any two VFA against each other. To plot the PC1 against PC2 we will need to extract the PC1 and PC2 score from the pca object and add labels for the samples.\n🎬 Create a dataframe of the PC1 and PC2 scores which are in pca$x and add the sample information from vfa_cummul_pca:\n\npca_labelled <- data.frame(pca$x,\n sample_replicate = vfa_cummul_pca$sample_replicate,\n treatment = vfa_cummul_pca$treatment,\n replicate = vfa_cummul_pca$replicate,\n time_day = vfa_cummul_pca$time_day) \n\nThe dataframe should look like 
this:\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nPC1\nPC2\nPC3\nPC4\nsample_replicate\ntreatment\nreplicate\ntime_day\n\n\n\n-2.9592362\n0.6710553\n0.0068846\n-0.4453904\nCN10-1\nCN10\n1\n1\n\n\n-2.7153060\n0.7338367\n-0.2856872\n-0.2030110\nCN10-2\nCN10\n2\n1\n\n\n-2.7423102\n0.8246832\n-0.4964249\n-0.1434490\nCN10-3\nCN10\n3\n1\n\n\n-1.1909064\n-1.0360724\n1.1249513\n-0.7360599\nCN10-1\nCN10\n1\n3\n\n\n-1.3831563\n0.9572091\n-1.5561657\n0.0582755\nCN10-2\nCN10\n2\n3\n\n\n-1.1628940\n-0.0865412\n-0.6046780\n-0.1976743\nCN10-3\nCN10\n3\n3\n\n\n-0.2769661\n-0.2221055\n1.1579897\n-0.6079395\nCN10-1\nCN10\n1\n5\n\n\n0.3480962\n0.3612522\n0.5841649\n-0.0612366\nCN10-2\nCN10\n2\n5\n\n\n-0.7281116\n1.6179706\n-0.6430170\n0.0660727\nCN10-3\nCN10\n3\n5\n\n\n0.9333578\n-0.1339061\n1.0870945\n-0.4374103\nCN10-1\nCN10\n1\n9\n\n\n2.0277528\n0.6993342\n0.3850147\n0.0723540\nCN10-2\nCN10\n2\n9\n\n\n1.9931908\n0.5127260\n0.6605782\n0.1841974\nCN10-3\nCN10\n3\n9\n\n\n1.8365692\n-0.4189762\n0.7029015\n-0.3873133\nCN10-1\nCN10\n1\n11\n\n\n2.3313978\n0.3274834\n-0.0135608\n0.0264372\nCN10-2\nCN10\n2\n11\n\n\n1.5833035\n0.9263509\n-0.1909483\n0.1358320\nCN10-3\nCN10\n3\n11\n\n\n2.8498246\n0.3815854\n-0.4763500\n-0.0280281\nCN10-1\nCN10\n1\n13\n\n\n3.5652461\n-0.0836709\n-0.5948483\n-0.1612809\nCN10-2\nCN10\n2\n13\n\n\n4.1314944\n-1.2254642\n0.2699666\n-0.3152100\nCN10-3\nCN10\n3\n13\n\n\n3.7338024\n-0.6744610\n0.4344639\n-0.3736234\nCN10-1\nCN10\n1\n16\n\n\n3.6748427\n0.5202498\n-0.4333685\n-0.1607235\nCN10-2\nCN10\n2\n16\n\n\n3.9057053\n0.3599520\n-0.3049074\n0.0540037\nCN10-3\nCN10\n3\n16\n\n\n3.4561583\n-0.0996639\n0.4472090\n-0.0185889\nCN10-1\nCN10\n1\n18\n\n\n3.6354729\n0.3809673\n-0.0934957\n0.0018722\nCN10-2\nCN10\n2\n18\n\n\n2.9872250\n0.7890400\n-0.2361098\n-0.1628506\nCN10-3\nCN10\n3\n18\n\n\n3.3562231\n-0.2866224\n0.1331068\n-0.2056366\nCN10-1\nCN10\n1\n20\n\n\n3.2009943\n0.4795967\n-0.2092384\n-0.5962183\nCN10-2\nCN10\n2\n20\n\n\n3.9948127\n0.7772640\n-0.3181372\n0.1218382\nCN10-3\nCN10
\n3\n20\n\n\n2.8874207\n0.4554681\n0.3106044\n-0.2220240\nCN10-1\nCN10\n1\n22\n\n\n3.6868864\n0.9681097\n-0.2174166\n-0.2246775\nCN10-2\nCN10\n2\n22\n\n\n4.8689622\n0.5218563\n-0.2906042\n0.3532981\nCN10-3\nCN10\n3\n22\n\n\n-3.8483418\n1.5205541\n-0.8809715\n-0.5306228\nNC-1\nNC\n1\n1\n\n\n-3.7653460\n1.5598499\n-1.0570798\n-0.4075397\nNC-2\nNC\n2\n1\n\n\n-3.8586309\n1.6044929\n-1.0936576\n-0.4292404\nNC-3\nNC\n3\n1\n\n\n-2.6934553\n-0.9198406\n0.7439841\n-0.9881115\nNC-1\nNC\n1\n3\n\n\n-2.5064076\n-1.0856761\n0.6334250\n-0.8999028\nNC-2\nNC\n2\n3\n\n\n-2.4097945\n-1.2731546\n1.1767665\n-0.8715948\nNC-3\nNC\n3\n3\n\n\n-3.0567309\n0.5804906\n-0.1391344\n-0.3701763\nNC-1\nNC\n1\n5\n\n\n-2.3511737\n-0.3692016\n0.7053757\n-0.3284113\nNC-2\nNC\n2\n5\n\n\n-2.6752311\n-0.0637855\n0.4692194\n-0.3841240\nNC-3\nNC\n3\n5\n\n\n-1.2335368\n-0.6717374\n0.2155285\n0.1060486\nNC-1\nNC\n1\n9\n\n\n-1.6550689\n0.1576557\n0.0687658\n0.2750388\nNC-2\nNC\n2\n9\n\n\n-0.8948103\n-0.8171884\n0.8062876\n0.5032756\nNC-3\nNC\n3\n9\n\n\n-1.2512737\n-0.4720993\n0.4071788\n0.4693106\nNC-1\nNC\n1\n11\n\n\n-1.8091407\n0.0552546\n0.0424090\n0.3918222\nNC-2\nNC\n2\n11\n\n\n-2.4225566\n0.4998948\n-0.1987773\n0.1959282\nNC-3\nNC\n3\n11\n\n\n-0.9193427\n-0.7741826\n0.0918984\n0.5089847\nNC-1\nNC\n1\n13\n\n\n-0.8800183\n-0.7850404\n0.0895146\n0.6050052\nNC-2\nNC\n2\n13\n\n\n-1.3075763\n-0.2525829\n-0.2993318\n0.5874269\nNC-3\nNC\n3\n13\n\n\n-0.9543813\n-0.3170305\n0.0885062\n0.7153071\nNC-1\nNC\n1\n16\n\n\n-0.4303679\n-0.9952374\n0.2038883\n0.8214647\nNC-2\nNC\n2\n16\n\n\n-0.9457300\n-0.7180646\n0.3081282\n0.6563748\nNC-3\nNC\n3\n16\n\n\n-1.3830063\n0.0614677\n-0.2805342\n0.5462137\nNC-1\nNC\n1\n18\n\n\n-0.7960522\n-0.5792768\n-0.0369684\n0.6621526\nNC-2\nNC\n2\n18\n\n\n-1.6822927\n0.1041656\n0.0634251\n0.4337240\nNC-3\nNC\n3\n18\n\n\n-1.3157478\n-0.0835664\n-0.1246253\n0.5599467\nNC-1\nNC\n1\n20\n\n\n-1.7425068\n0.3029227\n-0.0161466\n0.5134360\nNC-2\nNC\n2\n20\n\n\n-1.3970678\n-0.2923056\n0.4324586\n0.
4765460\nNC-3\nNC\n3\n20\n\n\n-1.0777451\n-0.1232925\n0.2388682\n0.7585307\nNC-1\nNC\n1\n22\n\n\n0.4851039\n-4.1291445\n-4.0625050\n-0.4582436\nNC-2\nNC\n2\n22\n\n\n-1.0516226\n-0.7228479\n1.0641320\n0.4955951\nNC-3\nNC\n3\n22\n\n\n\n\n\n🎬 Plot PC1 against PC2 and colour by time and shape by treatment:\n\npca_labelled |> \n ggplot(aes(x = PC1, y = PC2, \n colour = factor(time_day),\n shape = treatment)) +\n geom_point(size = 3) +\n scale_colour_viridis_d(end = 0.95, begin = 0.15,\n name = \"Time\") +\n scale_shape_manual(values = c(17, 19),\n name = NULL) +\n theme_classic()\n\n\n\n\n\n\n\n🎬 Plot PC1 against PC2 and colour by time and facet treatment:\n\npca_labelled |> \n ggplot(aes(x = PC1, y = PC2, colour = factor(time_day))) +\n geom_point(size = 3) +\n scale_colour_viridis_d(end = 0.95, begin = 0.15,\n name = \"Time\") +\n facet_wrap(~treatment, ncol = 1) +\n theme_classic()\n\n\n\n\n\n\n\nReplicates are similar at the same time and treatment, especially early on, as we might expect. PC1 is essentially an axis of time.\n\nWe are going to create an interactive heatmap with the heatmaply (Galili et al. 2017) package. heatmaply takes a matrix as input so we can use mat.\n🎬 Set the rownames to the sample id, which is a combination of sample_replicate and time_day:\n\nrownames(mat) <- interaction(vfa_cummul_pca$sample_replicate, \n vfa_cummul_pca$time_day)\n\nYou might want to view the matrix by clicking on it in the environment pane.\n🎬 Load the heatmaply package:\n\nlibrary(heatmaply)\n\nWe need to tell the clustering algorithm how many clusters to create. 
We will set the number of clusters for the treatments to be 2 and the number of clusters for the vfa to be the same since it makes sense to see what clusters of VFAs correlate with the treatments.\n🎬 Set the number of clusters for the treatments and vfa:\n\nn_treatment_clusters <- 2\nn_vfa_clusters <- 2\n\n🎬 Create the heatmap:\n\nheatmaply(mat, \n scale = \"column\",\n k_col = n_vfa_clusters,\n k_row = n_treatment_clusters,\n fontsize_row = 7, fontsize_col = 10,\n labCol = colnames(mat),\n labRow = rownames(mat),\n heatmap_layers = theme(axis.line = element_blank()))\n\n\n\n\n\nThe heatmap will open in the viewer pane (rather than the plot pane) because it is html. You can “Show in a new window” to see it in a larger format. You can also zoom in and out and pan around the heatmap and download it as a png. You might feel the colour bar is not adding much to the plot. You can remove it by setting hide_colorbar = TRUE, in the heatmaply() function.\nOne of the NC replicates at time = 22 is very different from the other replicates. The CN10 treatments cluster together at high time points. CN10 samples are more similar to NC samples early on. Most of the VFAs behave similarly with highest values later in the experiment for CN10 but isohexanoate and hexanoate differ. The difference might be because isohexanoate is especially low in the NC replicates at time = 1 and hexanoate is especially high in the NC replicate 2 at time = 22.\n\nCalculate the flux (change in VFA concentration over a period of time, divided by weight or volume of material) of each VFA, by mM and by weight.\nI’ve requested clarification: for the flux measurements, do they need graphs of the rate of change with respect to time? 
And is the sludge volume going to be a constant for all samples or something they measure and varies by vial?\n\nGraph and extract the reaction rate assuming a first order chemical/biological reaction and an exponential falloff rate\nI’ve requested clarification: for the nonlinear least squares curve fitting, I assume x is time but I’m not clear what the Y variable is - concentration? or change in concentration? or rate of change of concentration?\nPages made with R (R Core Team 2023), Quarto (Allaire et al. 2022), knitr (Xie 2022), kableExtra (Zhu 2021)", "crumbs": [ "Omics", "Kelly's Project", @@ -2552,7 +2552,7 @@ "href": "omics/kelly/workshop.html#recalculate-the-data-into-grams-per-litre", "title": "Workflow for VFA analysis", "section": "", - "text": "To make conversions from mM to g/l we need to do mM * 0.001 * MW. We will import the molecular weight data, pivot the VFA data to long format and join the molecular weight data to the VFA data. Then we can calculate the g/l. We will do this for both the cumulative and delta dataframes.\n🎬 import molecular weight data\n\nmol_wt <- read_table(\"data-raw/mol_wt.txt\") |>\n mutate(vfa = tolower(vfa))\n\n🎬 Pivot the cumulative data to long format:\n\nvfa_cummul <- vfa_cummul |> \n pivot_longer(cols = -c(sample_replicate,\n treatment, \n replicate,\n time_day),\n values_to = \"conc_mM\",\n names_to = \"vfa\") \n\nView vfa_cummul to check you understand what you have done.\n🎬 Join molecular weight to data and calculate g/l (mutate to convert to g/l * 0.001 * MW):\n\nvfa_cummul <- vfa_cummul |> \n left_join(mol_wt, by = \"vfa\") |>\n mutate(conc_g_l = conc_mM * 0.001 * mw)\n\nView vfa_cummul to check you understand what you have done.\nRepeat for the delta data.\n🎬 Pivot the change data, delta_vfa to long format:\n\nvfa_delta <- vfa_delta |> \n pivot_longer(cols = -c(sample_replicate,\n treatment, \n replicate,\n time_day),\n values_to = \"conc_mM\",\n names_to = \"vfa\") \n\nView vfa_delta to check it looks like 
vfa_cummul\n🎬 Join molecular weight to data and calculate g/l (mutate to convert to g/l * 0.001 * MW):", + "text": "To make conversions from mM to g/l we need to do mM * 0.001 * MW. We will import the molecular weight data, pivot the VFA data to long format and join the molecular weight data to the VFA data. Then we can calculate the g/l. We will do this for both the cumulative and delta dataframes.\n🎬 import molecular weight data\n\nmol_wt <- read_table(\"data-raw/mol_wt.txt\") |>\n mutate(vfa = tolower(vfa))\n\n🎬 Pivot the cumulative data to long format:\n\nvfa_cummul <- vfa_cummul |> \n pivot_longer(cols = -c(sample_replicate,\n treatment, \n replicate,\n time_day),\n values_to = \"conc_mM\",\n names_to = \"vfa\") \n\nView vfa_cummul to check you understand what you have done.\n🎬 Join molecular weight to data and calculate g/l (mutate to convert to g/l * 0.001 * MW):\n\nvfa_cummul <- vfa_cummul |> \n left_join(mol_wt, by = \"vfa\") |>\n mutate(conc_g_l = conc_mM * 0.001 * mw)\n\nView vfa_cummul to check you understand what you have done.\nRepeat for the delta data.\n🎬 Pivot the change data, delta_vfa to long format:\n\nvfa_delta <- vfa_delta |> \n pivot_longer(cols = -c(sample_replicate,\n treatment, \n replicate,\n time_day),\n values_to = \"conc_mM\",\n names_to = \"vfa\") \n\nView vfa_delta to check it looks like vfa_cummul\n🎬 Join molecular weight to data and calculate g/l (mutate to convert to g/l * 0.001 * MW):\n\nvfa_delta <- vfa_delta |> \n left_join(mol_wt, by = \"vfa\") |>\n mutate(conc_g_l = conc_mM * 0.001 * mw)", "crumbs": [ "Omics", "Kelly's Project", diff --git a/structures/structures.html b/structures/structures.html index 4641026..f10f732 100644 --- a/structures/structures.html +++ b/structures/structures.html @@ -217,7 +217,7 @@
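While the clarifications on the flux and curve-fitting tasks are awaited, the two steps could be sketched along these lines in R. Everything specific here is an assumption to be confirmed: `sludge_vol_l` is a hypothetical constant volume per vial, flux is taken as change in concentration per day per litre, and the curve fit assumes concentration is the response, rising to a plateau with first-order kinetics.

```r
# Sketch only - assumes a constant sludge volume per vial (hypothetical value;
# it may instead be measured and vary by vial)
sludge_vol_l <- 0.05

# Flux: change in concentration (already in vfa_delta) divided by the
# time interval and the assumed volume
vfa_flux <- vfa_delta |> 
  group_by(sample_replicate, vfa) |> 
  arrange(time_day, .by_group = TRUE) |> 
  mutate(interval_day = time_day - lag(time_day),
         flux_mM = conc_mM / (interval_day * sludge_vol_l),
         flux_g_l = conc_g_l / (interval_day * sludge_vol_l)) |> 
  ungroup()

# First-order fit for one VFA in one treatment, assuming the response is
# concentration and a first-order rise to a plateau: C(t) = Cmax(1 - exp(-kt))
acetate_cn10 <- vfa_cummul |> 
  filter(vfa == "acetate", treatment == "CN10")

fit <- nls(conc_g_l ~ Cmax * (1 - exp(-k * time_day)),
           data = acetate_cn10,
           start = list(Cmax = max(acetate_cn10$conc_g_l), k = 0.2))
summary(fit)  # k is the first-order rate constant; start values may need tuning
```

Plotting conc_g_l against time_day with the fitted values overlaid (for example with geom_line(aes(y = predict(fit)))) would show how well the assumed first-order form describes the data; the model formula should be swapped once the intended response variable is confirmed.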

    Structure Data Analysis for Group Project

    Published
    -

    28 March, 2024

    +

    29 March, 2024

    @@ -258,7 +258,7 @@

    Structure Data Analysis for Group Project

    I think that’s it! You can now download the RStudio project and run each chunk in the quarto document.

    There is an example RStudio project here: structure-analysis. You can also download the project as a zip file from there but there is some code that will do that automatically for you. Since this is an RStudio Project, do not run the code from inside a project. You may want to navigate to a particular directory or edit the destdir:

    -
    usethis::use_course(url = "3mmaRand/structure-analysis", destdir = ".")
    +
    usethis::use_course(url = "3mmaRand/structure-analysis", destdir = ".")

    You can agree to deleting the zip. You should find RStudio restarts and you have a new project called structure-analysis-xxxxxx. The xxxxxx is a commit reference - you do not need to worry about that, it is just a way to tell you which version of the repo you downloaded.

    You should be able to open the antibody_mimetics_workshop_3.qmd file and run each chunk. You can also knit the document to html.