Merge pull request #4 from csc-training/IonTorrentJYU23
Minor changes to Ion Torrent exercises
helijuottonen committed Aug 16, 2023
2 parents 25f19c7 + afd3709 commit e2631f0
Showing 4 changed files with 13 additions and 7 deletions.
2 changes: 1 addition & 1 deletion docs/IonTorrent/Exercises_day1.html
@@ -1006,7 +1006,7 @@ <h1><strong>Day 1: Data pre-processing</strong></h1>
1) they aligned outside the common alignment range, or
2) they contained too long homopolymers?</code></pre>
<p><strong>Step 11. Remove gaps and overhangs from the alignment. If
-this creates new identical sequences, remove them</strong></p>
+this creates new identical sequences, they will be removed</strong></p>
<p>Choose <code>screened.fasta.gz</code> and
<code>screened.count_table</code> and run the tool
<code>Filter sequence alignment</code>.</p>
12 changes: 9 additions & 3 deletions docs/IonTorrent/Exercises_day2.html
@@ -849,7 +849,10 @@ <h3><strong>Getting the data into <code>phyloseq</code></strong></h3>
files</strong></p>
<p>Choose <code>chimeras.removed.fasta.gz</code>,
<code>chimeras.removed.count_table</code> and
-<code>sequences-taxonomy-assignment.txt</code>. Next, run the tool
+<code>sequences-taxonomy-assignment.txt</code>. Check in
+<em>Parameters</em> that these files are in the correct locations under
+<em>Input files</em> and correct if needed.<br />
+Next, run the tool
<code>Microbial amplicon dta preprocessing for OTU / Generate input files for phyloseq</code>
so that you select the correct data type (<code>16S or 18S</code>) and
set a cut-off of 0.03 (i.e. 3%, corresponding to 97% sequence
@@ -940,8 +943,11 @@ <h3><strong>Tidying and inspecting the data</strong></h3>
<li>Proportional prevalence filtering (for removing OTUs that occur in
less than specific % of samples)</li>
</ul>
-<p>Selecting <code>ps_ind.Rda</code>, run the former tool, making sure
-that both singletons and doubletons are removed.</p>
+<p>Selecting <code>ps_ind.Rda</code>, run the tool
+<code>Remove OTUs with 0-2 occurrences</code>, making sure that both
+singletons and doubletons are removed. Feel free to test the prevalence
+filtering tool too if you have time, but it is not necessary for the
+following exercises.</p>
<pre><code>Why would we want to remove singletons and doubletons from the data?
Can you think of situations where these should be kept as part of the dataset?</code></pre>
<p><strong>Step 20. Sequence numbers, rarefaction curve and alpha
2 changes: 1 addition & 1 deletion eLena_md/IonTorrent/Exercises_IonTorrent_day1.Rmd
@@ -168,7 +168,7 @@ Were these sequences removed because:
2) they contained too long homopolymers?
```

-**Step 11. Remove gaps and overhangs from the alignment. If this creates new identical sequences, remove them**
+**Step 11. Remove gaps and overhangs from the alignment. If this creates new identical sequences, they will be removed**

Choose `screened.fasta.gz` and `screened.count_table` and run the tool `Filter sequence alignment`.

4 changes: 2 additions & 2 deletions eLena_md/IonTorrent/Exercises_IonTorrent_day2.Rmd
@@ -26,7 +26,7 @@ opts_knit$set(width=75)

**Step 16. Creating `phyloseq` input files**

-Choose `chimeras.removed.fasta.gz`, `chimeras.removed.count_table` and `sequences-taxonomy-assignment.txt`.
+Choose `chimeras.removed.fasta.gz`, `chimeras.removed.count_table` and `sequences-taxonomy-assignment.txt`. Check in *Parameters* that these files are in the correct locations under *Input files* and correct if needed.
Next, run the tool `Microbial amplicon dta preprocessing for OTU / Generate input files for phyloseq` so that you select the correct data type (`16S or 18S`) and set a cut-off of 0.03 (i.e. 3%, corresponding to 97% sequence similarity) for OTU clustering.

```
@@ -93,7 +93,7 @@ iv) There are two further tools for data tidying:
- Remove OTUs with 0-2 occurrences
- Proportional prevalence filtering (for removing OTUs that occur in less than specific % of samples)

-Selecting `ps_ind.Rda`, run the former tool, making sure that both singletons and doubletons are removed.
+Selecting `ps_ind.Rda`, run the tool `Remove OTUs with 0-2 occurrences`, making sure that both singletons and doubletons are removed. Feel free to test the prevalence filtering tool too if you have time, but it is not necessary for the following exercises.

```
Why would we want to remove singletons and doubletons from the data?
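Reviewer's illustration (not part of the commit): the two tidying tools named in the last hunk can be pictured with a toy count table. This is a minimal Python sketch under stated assumptions, not how the tools themselves are implemented; in the exercise they are run on `ps_ind.Rda`. The OTU names, the counts, the function names and the 50% prevalence threshold are all hypothetical.

```python
# Toy OTU count table: one row per OTU, one column per sample.
# All names and numbers are invented for illustration.
otu_counts = {
    "Otu1": [120, 98, 143],
    "Otu2": [1, 0, 0],  # singleton: 1 read in total
    "Otu3": [1, 1, 0],  # doubleton: 2 reads in total
    "Otu4": [0, 7, 0],  # 7 reads, but present in only one of three samples
    "Otu5": [0, 0, 0],  # no reads at all
}


def remove_rare_otus(counts, min_total=3):
    """Keep OTUs whose total read count is at least min_total.

    With min_total=3 this drops OTUs with 0, 1 or 2 occurrences.
    """
    return {otu: row for otu, row in counts.items() if sum(row) >= min_total}


def prevalence_filter(counts, min_fraction):
    """Keep OTUs detected (count > 0) in at least min_fraction of samples."""
    kept = {}
    for otu, row in counts.items():
        prevalence = sum(1 for c in row if c > 0) / len(row)
        if prevalence >= min_fraction:
            kept[otu] = row
    return kept


no_rare = remove_rare_otus(otu_counts)
prevalent = prevalence_filter(no_rare, min_fraction=0.5)
print(sorted(no_rare))    # OTUs left after removing 0-2 occurrences
print(sorted(prevalent))  # OTUs also present in at least half of the samples
```

In this toy table, `Otu4` passes the occurrence filter but fails the prevalence filter, which shows that the two filters answer different questions: total abundance versus how widely an OTU is spread across samples.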
