Commit

minor edits
jkanche committed Jul 22, 2024
1 parent bc05059 commit bd48297
Showing 4 changed files with 18 additions and 18 deletions.
2 changes: 1 addition & 1 deletion notebook/annotate_cell_types.ipynb
@@ -882,7 +882,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"Now's lets color the embedding with the cell types we identified from `celldex`. We ran the singleR algorithm on the full datasets, but scranpy filtered a few cells during the QC step. Lets identify which cells were kept."
+"Now let's color the embedding with the cell types we identified from `celldex`. We ran the singleR algorithm on the full datasets, but scranpy filtered a few cells during the QC step. Let's identify which cells were kept."
]
},
{
14 changes: 7 additions & 7 deletions notebook/genomic_ranges.ipynb
@@ -767,12 +767,12 @@
"\n",
"### 5.1 Compare exonic vs. intronic binding\n",
"\n",
-"Let's first identify intron regions. There are two ways to find introns\n",
+"Let's first identify intronic regions. There are two ways to find introns:\n",
"\n",
-"1. **Find introns for each gene**, regions within each gene body that do not overlap to that gene's exons (using `psetdiff` in R/Bioconductor).\n",
-"2. **Find introns globally**, regions that don't overlap with any exon (using `subtract`). To find these positions, we also ignore strand information.\n",
+"1. **Find introns for each gene**, i.e. regions within each gene's transcript body that do not overlap any of that gene's exons (using `psetdiff` in R/Bioconductor).\n",
+"2. **Find intronic regions globally**, i.e. regions that do not overlap with any exon (using `subtract`) for any gene. To find these positions, we ignore strand information, because there could be genes that overlap on different strands.\n",
"\n",
-"We will find introns globally (2) for our tutorial today. If you are wondering why, we currently don't have `psetdiff` implemented in BiocPy/GenomicRanges. If you are interested in contributing, check out [this issue](https://github.com/BiocPy/GenomicRanges/issues/115).\n",
+"We will find intronic regions globally (2) for our tutorial today.\n",
"\n",
"Let's first get all transcript ranges, following the steps in [Section 3.1](#find-transcription-start-sites-tss):"
]
@@ -783,15 +783,15 @@
"metadata": {},
"outputs": [],
"source": [
-"# Get the full extent of each gene\n",
+"# Get the full extent of each transcript\n",
"tx_ranges = by_tx.range().as_genomic_ranges()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
-"We now subtract any exons that overlaps within each transcript by ignoring the strand. The result is a `GenomicRangesList` containing intron regions for each transcript. We simplify this by coercing this into a `GenomicRanges` object."
+"We now subtract any exons that overlaps within each transcript by ignoring the strand. The result is a `GenomicRangesList` containing intronic regions for each transcript. We simplify this by coercing this into a `GenomicRanges` object."
]
},
{
@@ -966,7 +966,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"### 4.3 Resizing and Shifting Peaks\n",
+"### 5.3 Resize and Shift Peaks\n",
"\n",
"Resizing and shifting genomic ranges can be useful in various contexts. For example:\n",
"\n",
2 changes: 1 addition & 1 deletion tutorials/annotate_cell_types.qmd
@@ -377,7 +377,7 @@ sns.scatterplot(
```
:::

-Now's lets color the embedding with the cell types we identified from `celldex`. We ran the singleR algorithm on the full datasets, but scranpy filtered a few cells during the QC step. Lets identify which cells were kept.
+Now let's color the embedding with the cell types we identified from `celldex`. We ran the singleR algorithm on the full datasets, but scranpy filtered a few cells during the QC step. Let's identify which cells were kept.

::: {.panel-tabset}

18 changes: 9 additions & 9 deletions tutorials/genomic_ranges.qmd
@@ -133,7 +133,7 @@ Now, let's perform some basic operations like finding transcription start sites

Transcription Start Sites (TSS) are the locations where transcription of a gene begins. Identifying TSS is crucial for understanding gene regulation, as many regulatory elements are located near the TSS.

-First, we use the `range()` method to get the full extent of each transcript. This should give us exactly one range per transcript.
+First, we use the `range()` method to get the full extent of each transcript, i.e. from the start of the first exon to the end of the last exon. This should give us exactly one range per transcript.

::: {.panel-tabset}

@@ -294,7 +294,7 @@ print(peaks_by_promoters)

### 4.4 Find overlaps with exons

-Lets find overlaps with any exon. We `unlist` our `GenomicRangesList` object to get all exon positions.
+Let's find overlaps with any exon. We `unlist` our `GenomicRangesList` object to get all exon positions.

::: {.panel-tabset}

@@ -336,12 +336,12 @@ Let's explore some more complex operations that are often used in genomic analys

### 5.1 Compare exonic vs. intronic binding

-Let's first identify intron regions. There are two ways to find introns
+Let's first identify intronic regions. There are two ways to find introns:

-1. **Find introns for each gene**, regions within each gene body that do not overlap to that gene's exons (using `psetdiff` in R/Bioconductor).
-2. **Find introns globally**, regions that don't overlap with any exon (using `subtract`). To find these positions, we also ignore strand information.
+1. **Find introns for each gene**, i.e. regions within each gene's transcript body that do not overlap any of that gene's exons (using `psetdiff` in R/Bioconductor).
+2. **Find intronic regions globally**, i.e. regions that do not overlap with any exon (using `subtract`) for any gene. To find these positions, we ignore strand information, because there could be genes that overlap on different strands.

-We will find introns globally (2) for our tutorial today. If you are wondering why, we currently don't have `psetdiff` implemented in BiocPy/GenomicRanges. If you are interested in contributing, check out [this issue](https://github.com/BiocPy/GenomicRanges/issues/115).
+We will find intronic regions globally (2) for our tutorial today.

Let's first get all transcript ranges, following the steps in [Section 3.1](#find-transcription-start-sites-tss):

@@ -350,12 +350,12 @@ Let's first get all transcript ranges, following the steps in [Section 3.1](#fin
## Python

```{python}
-# Get the full extent of each gene
+# Get the full extent of each transcript
tx_ranges = by_tx.range().as_genomic_ranges()
```
:::

-We now subtract any exons that overlaps within each transcript by ignoring the strand. The result is a `GenomicRangesList` containing intron regions for each transcript. We simplify this by coercing this into a `GenomicRanges` object.
+We now subtract any exons that overlaps within each transcript by ignoring the strand. The result is a `GenomicRangesList` containing intronic regions for each transcript. We simplify this by coercing this into a `GenomicRanges` object.

::: {.panel-tabset}

@@ -429,7 +429,7 @@ peaks_with_first_exons = peaks_chr22.subset_by_overlaps(first_exons)
print(peaks_with_first_exons)
```

-### 4.3 Resizing and Shifting Peaks
+### 5.3 Resizing and Shifting Peaks

Resizing and shifting genomic ranges can be useful in various contexts. For example:

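
Note on the reworded Section 5.1: the difference between per-gene introns and globally intronic regions can be illustrated with a small, library-free sketch. This is not the BiocPy/GenomicRanges API and not the tutorial's code. The helper names (`merge_intervals`, `subtract_intervals`) and all coordinates are hypothetical, and intervals are assumed to be half-open `(start, end)` pairs on a single chromosome with strand ignored.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), strand ignored (assumption)


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Collapse overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract_intervals(region: Interval, blockers: List[Interval]) -> List[Interval]:
    """Return the parts of `region` that are not covered by any blocker."""
    start, end = region
    remaining: List[Interval] = []
    cursor = start
    for b_start, b_end in merge_intervals(blockers):
        if b_end <= cursor:
            continue  # blocker lies entirely before the uncovered part
        if b_start >= end:
            break  # blocker lies entirely after the region
        if b_start > cursor:
            remaining.append((cursor, b_start))
        cursor = max(cursor, b_end)
    if cursor < end:
        remaining.append((cursor, end))
    return remaining


# Hypothetical transcript of gene A and its exons.
tx_a = (100, 500)
exons_a = [(100, 150), (300, 350), (450, 500)]
# Hypothetical exon of an overlapping gene B on the opposite strand.
exons_b = [(200, 320)]

# (1) Per-gene introns: subtract only gene A's own exons.
introns_per_gene = subtract_intervals(tx_a, exons_a)

# (2) Globally intronic regions: subtract every exon from any gene.
intronic_global = subtract_intervals(tx_a, exons_a + exons_b)

print(introns_per_gene)
print(intronic_global)
```

With these made-up coordinates, the global result is smaller than the per-gene result because part of gene A's first intron is covered by an exon of gene B, which is the case the new wording about overlapping genes on different strands describes.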
