Merge pull request #7 from csc-training/IonTorrentJYU23

Update tool names
csc-training · Sep 12, 2023 · 086f7cc
2 parents b3f9973 + b5d08b8
commit 086f7cc

Showing 4 changed files with 32 additions and 12 deletions.
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1 @@
+**/.DS_Store
diff --git a/docs/IonTorrent/Exercises_day2.html b/docs/IonTorrent/Exercises_day2.html
@@ -853,7 +853,7 @@ <h3><strong>Getting the data into <code>phyloseq</code></strong></h3>
 <em>Parameters</em> that these files are in the correct locations under
 <em>Input files</em> and correct if needed.<br />
 Next, run the tool
-<code>Microbial amplicon dta preprocessing for OTU / Generate input files for phyloseq</code>
+<code>Microbial amplicon dta preprocessing for OTU / Cluster sequences to OTUs and classify them</code>
 so that you select the correct data type (<code>16S or 18S</code>) and
 set a cut-off of 0.03 (i.e. 3%, corresponding to 97% sequence
 similarity) for OTU clustering.</p>
@@ -869,7 +869,7 @@ <h3><strong>Getting the data into <code>phyloseq</code></strong></h3>
 <em>Selected files</em> choose the tab <em>Phenodata</em>.</p></li>
 <li><p>In the column <code>description</code>, write the sample names as
 you want them to appear in result plots. For example, <em>HPc1</em>,
-<em>HPc2</em> etc.</p></li>
+<em>HPc2</em> etc. The names must be unique for each sample.</p></li>
 <li><p>Create new columns called <code>site</code> and
 <code>bagging</code> by clicking <code>+ Add column</code>. You delete
 the column <code>chiptype</code> or adjust the width of the columns to
@@ -895,8 +895,8 @@ <h3><strong>Getting the data into <code>phyloseq</code></strong></h3>
 constaxonomy file <code>file.opti_mcc.0.03.cons.taxonomy</code>. Select
 the tool <code>Convert Mothur files into phyloseq object</code>. In
 <em>Parameters</em>, specify the phenodata column including unique IDs
-for each community profile (the column <code>sample</code>). Run the
-tool.</p>
+for each community profile (the column <code>description</code>). Run
+the tool.</p>
 <p>This tool produces two files:</p>
 <ul>
 <li>a <code>phyloseq</code> object (stored as <code>ps.Rda</code>)</li>
@@ -932,10 +932,10 @@ <h3><strong>Tidying and inspecting the data</strong></h3>
 get an overview of the distribution of OTUs in our data.</p>
 <ol start="3" style="list-style-type: lower-roman">
 <li>Selecting <code>ps_ind.Rda</code>, run the
-<code>Additional prevalence summaries</code> tool. This will produce
-both a prevalence plot (<code>ps_prevalence.pdf</code>) and a text
-summary (<code>ps_low.txt</code>). The plot has a prevalence threshold
-of 5% drawn as a default guess for prevalence filtering.</li>
+<code>Prevalence summaries</code> tool. This will produce both a
+prevalence plot (<code>ps_prevalence.pdf</code>) and a text summary
+(<code>ps_low.txt</code>). The plot has a prevalence threshold of 5%
+drawn as a default guess for prevalence filtering.</li>
 </ol>
 <pre><code>How many doubletons are there in the data set?
 Do you have an advance idea about what the term &quot;prevalence&quot; refers to?
@@ -996,7 +996,8 @@ <h3><strong>Taking a closer look at patterns</strong></h3>
 <li>1 in Relative abundance cut-off threshold (%) for excluding
 OTUs</li>
 <li>Class as the level of biological organisation</li>
-<li>site as the phenodata variable 1 for plot faceting</li>
+<li>site as the phenodata variable 1 for dividing the plot into
+subplots</li>
 </ul>
 <p>The result should look close to this (click on the thumbnail to
 expand the image):</p>

diff --git a/eLena_md/IonTorrent/Exercises_IonTorrent_day2.Rmd b/eLena_md/IonTorrent/Exercises_IonTorrent_day2.Rmd
@@ -27,7 +27,7 @@ opts_knit$set(width=75)
 **Step 16. Creating `phyloseq` input files**
 
 Choose `chimeras.removed.fasta.gz`, `chimeras.removed.count_table` and `sequences-taxonomy-assignment.txt`. Check in *Parameters* that these files are in the correct locations under *Input files* and correct if needed.  
-Next, run the tool `Microbial amplicon dta preprocessing for OTU / Generate input files for phyloseq` so that you select the correct data type (`16S or 18S`) and set a cut-off of 0.03 (i.e. 3%, corresponding to 97% sequence similarity) for OTU clustering.
+Next, run the tool `Microbial amplicon dta preprocessing for OTU / Cluster sequences to OTUs and classify them` so that you select the correct data type (`16S or 18S`) and set a cut-off of 0.03 (i.e. 3%, corresponding to 97% sequence similarity) for OTU clustering.
 
 ```
 Why are we using a dissimilarity threshold of 3%? 
@@ -82,7 +82,7 @@ sequences in a bacterial dataset - isn't that a little strange?
 
 There are a few more additional tools for data tidying. Let's first get an overview of the distribution of OTUs in our data. 
 
-iii) Selecting `ps_ind.Rda`, run the `Additional prevalence summaries` tool. This will produce both a prevalence plot (`ps_prevalence.pdf`) and a text summary (`ps_low.txt`). The plot has a prevalence threshold of 5% drawn as a default guess for prevalence filtering.
+iii) Selecting `ps_ind.Rda`, run the `Prevalence summaries` tool. This will produce both a prevalence plot (`ps_prevalence.pdf`) and a text summary (`ps_low.txt`). The plot has a prevalence threshold of 5% drawn as a default guess for prevalence filtering.
 
 ```
 How many doubletons are there in the data set?
@@ -135,7 +135,7 @@ This will produce a file called `ps_relabund.Rda`. Select it and run the `OTU re
 
 - 1 in Relative abundance cut-off threshold (%) for excluding OTUs
 - Class as the level of biological organisation
-- site as the phenodata variable 1 for plot faceting
+- site as the phenodata variable 1 for dividing the plot into subplots
 
 The result should look close to this (click on the thumbnail to expand the image):
 

diff --git a/eLena_md/updating_instructions.txt b/eLena_md/updating_instructions.txt
@@ -0,0 +1,18 @@
+Updating the chipster-microbial repo (CSC internal instructions):
+
+1. Access:
+
+Ask an owner for access to the csc-training repo if you are not already a member.
+
+2. Structure of the repo: 
+
+eLena_md: R markdown files for the exercises etc. 
+	exercises-d1, exercises-d2: MiSeq 16S materials 
+	IonTorrent: Ion Torrent materials
+docs: html versions of the same markdown files (can be linked to course participants) 
+README.md: determined what is shown on the front page
+
+3. Editing: 
+
+Because the markdown files need to be converted to html, the easiest way is to edit the markdown files on your own computer, convert into html and then upload both files to Github (markdown in the folder eLena_md, html in the folder docs). So start with git clone on your own terminal. Create a new branch, make the changes in the markdown file (for example in RStudio), use 'knit' to convert into html, move the html file to the docs folder. Add, commit, push, create a pull request, merge.
+