Merge pull request #4 from csc-training/IonTorrentJYU23

Minor changes to Ion Torrent exercises
csc-training · Aug 16, 2023 · e2631f0
2 parents 25f19c7 + afd3709
commit e2631f0
Showing 4 changed files with 13 additions and 7 deletions.
diff --git a/docs/IonTorrent/Exercises_day1.html b/docs/IonTorrent/Exercises_day1.html
@@ -1006,7 +1006,7 @@ <h1><strong>Day 1: Data pre-processing</strong></h1>
 1) they aligned outside the common alignment range, or 
 2) they contained too long homopolymers?</code></pre>
 <p><strong>Step 11. Remove gaps and overhangs from the alignment. If
-this creates new identical sequences, remove them</strong></p>
+this creates new identical sequences, they will be removed</strong></p>
 <p>Choose <code>screened.fasta.gz</code> and
 <code>screened.count_table</code> and run the tool
 <code>Filter sequence alignment</code>.</p>

diff --git a/docs/IonTorrent/Exercises_day2.html b/docs/IonTorrent/Exercises_day2.html
@@ -849,7 +849,10 @@ <h3><strong>Getting the data into <code>phyloseq</code></strong></h3>
 files</strong></p>
 <p>Choose <code>chimeras.removed.fasta.gz</code>,
 <code>chimeras.removed.count_table</code> and
-<code>sequences-taxonomy-assignment.txt</code>. Next, run the tool
+<code>sequences-taxonomy-assignment.txt</code>. Check in
+<em>Parameters</em> that these files are in the correct locations under
+<em>Input files</em> and correct if needed.<br />
+Next, run the tool
 <code>Microbial amplicon data preprocessing for OTU / Generate input files for phyloseq</code>
 so that you select the correct data type (<code>16S or 18S</code>) and
 set a cut-off of 0.03 (i.e. 3%, corresponding to 97% sequence
@@ -940,8 +943,11 @@ <h3><strong>Tidying and inspecting the data</strong></h3>
 <li>Proportional prevalence filtering (for removing OTUs that occur in
 less than specific % of samples)</li>
 </ul>
-<p>Selecting <code>ps_ind.Rda</code>, run the former tool, making sure
-that both singletons and doubletons are removed.</p>
+<p>Selecting <code>ps_ind.Rda</code>, run the tool
+<code>Remove OTUs with 0-2 occurrences</code>, making sure that both
+singletons and doubletons are removed. Feel free to test the prevalence
+filtering tool too if you have time, but it is not necessary for the
+following exercises.</p>
 <pre><code>Why would we want to remove singletons and doubletons from the data?
 Can you think of situations where these should be kept as part of the dataset?</code></pre>
 <p><strong>Step 20. Sequence numbers, rarefaction curve and alpha

diff --git a/eLena_md/IonTorrent/Exercises_IonTorrent_day1.Rmd b/eLena_md/IonTorrent/Exercises_IonTorrent_day1.Rmd
@@ -168,7 +168,7 @@ Were these sequences removed because:
 2) they contained too long homopolymers?
 ```
 
-**Step 11. Remove gaps and overhangs from the alignment. If this creates new identical sequences, remove them**
+**Step 11. Remove gaps and overhangs from the alignment. If this creates new identical sequences, they will be removed**
 
 Choose `screened.fasta.gz` and `screened.count_table` and run the tool `Filter sequence alignment`.
 

diff --git a/eLena_md/IonTorrent/Exercises_IonTorrent_day2.Rmd b/eLena_md/IonTorrent/Exercises_IonTorrent_day2.Rmd
@@ -26,7 +26,7 @@ opts_knit$set(width=75)
 
 **Step 16. Creating `phyloseq` input files**
 
-Choose `chimeras.removed.fasta.gz`, `chimeras.removed.count_table` and `sequences-taxonomy-assignment.txt`. 
+Choose `chimeras.removed.fasta.gz`, `chimeras.removed.count_table` and `sequences-taxonomy-assignment.txt`. Check in *Parameters* that these files are in the correct locations under *Input files* and correct if needed.  
 Next, run the tool `Microbial amplicon data preprocessing for OTU / Generate input files for phyloseq` so that you select the correct data type (`16S or 18S`) and set a cut-off of 0.03 (i.e. 3%, corresponding to 97% sequence similarity) for OTU clustering.
 
 ```
@@ -93,7 +93,7 @@ iv) There are two further tools for data tidying:
 - Remove OTUs with 0-2 occurrences
 - Proportional prevalence filtering (for removing OTUs that occur in less than specific % of samples)
 
-Selecting `ps_ind.Rda`, run the former tool, making sure that both singletons and doubletons are removed.
+Selecting `ps_ind.Rda`, run the tool `Remove OTUs with 0-2 occurrences`, making sure that both singletons and doubletons are removed. Feel free to test the prevalence filtering tool too if you have time, but it is not necessary for the following exercises.
 
 ```
 Why would we want to remove singletons and doubletons from the data?