Commit

Merge branch 'main' into hill_diversity
grst committed Nov 1, 2024
2 parents 578d20b + 86e93ce commit 23cd44c
Showing 67 changed files with 3,011 additions and 1,215 deletions.
5 changes: 2 additions & 3 deletions .conda/meta.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -16,20 +16,19 @@ build:

requirements:
host:
- python >=3.9
- python >=3.10
- hatchling
- hatch-vcs

run:
- python >=3.9
- python >=3.10
- anndata >=0.9
- awkward >=2.1.0
- mudata >=0.2.3
- scanpy >=1.9.3
- pandas >=1.5,!=2.1.2
- numpy >=1.17.0
- scipy
- parasail-python
- scikit-learn
- python-levenshtein
- python-igraph !=0.10.0,!=0.10.1
Expand Down
13 changes: 6 additions & 7 deletions .github/workflows/conda.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -22,27 +22,26 @@ jobs:
matrix:
include:
- os: ubuntu-latest
python: "3.9"
python: "3.11"

env:
OS: ${{ matrix.os }}
PYTHON: ${{ matrix.python }}

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4

- name: Setup Miniconda
uses: conda-incubator/setup-miniconda@v2
uses: conda-incubator/setup-miniconda@v3
with:
miniforge-variant: Mambaforge
miniforge-version: latest
mamba-version: "*"
channels: conda-forge,bioconda
channel-priority: strict
python-version: ${{ matrix.python-version }}
python-version: ${{ matrix.python }}

- name: install conda build
run: |
mamba install -y boa conda-verify
mamba install -y boa conda-verify python=${{ matrix.python }}
shell: bash

- name: build and test package
Expand Down
1 change: 1 addition & 0 deletions .github/workflows/test-tutorials.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ jobs:
tutorial:
- tutorial_3k_tcr.ipynb
- tutorial_io.ipynb
- tutorial_5k_bcr.ipynb
os:
- ubuntu-latest
python:
Expand Down
4 changes: 2 additions & 2 deletions .github/workflows/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ jobs:
matrix:
include:
- os: ubuntu-latest
python: "3.9"
python: "3.10"
- os: ubuntu-latest
python: "3.12"
- os: ubuntu-latest
Expand Down Expand Up @@ -52,7 +52,7 @@ jobs:
python -m pip install --upgrade pip wheel
- name: Install dependencies
run: |
pip install ${{ matrix.pip-flags }} ".[dev,test,rpack,dandelion,diversity]"
pip install ${{ matrix.pip-flags }} ".[dev,test]"
- name: Test
env:
MPLBACKEND: agg
Expand Down
8 changes: 4 additions & 4 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,8 @@ fail_fast: false
default_language_version:
python: python3
default_stages:
- commit
- push
- pre-commit
- pre-push
minimum_pre_commit_version: 2.16.0
repos:
- repo: https://github.com/pre-commit/mirrors-prettier
Expand All @@ -12,15 +12,15 @@ repos:
- id: prettier
exclude: '^\.conda'
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.5.7
rev: v0.7.1
hooks:
- id: ruff
types_or: [python, pyi, jupyter]
args: [--fix, --exit-non-zero-on-fix]
- id: ruff-format
types_or: [python, pyi, jupyter]
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.6.0
rev: v5.0.0
hooks:
- id: detect-private-key
- id: check-ast
Expand Down
42 changes: 39 additions & 3 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,9 +10,45 @@ and this project adheres to [Semantic Versioning][].

## [Unreleased]

### Addition
### Documentation

- Add a tutorial for BCR analysis with Scirpy ([#542](https://github.com/scverse/scirpy/pull/542)).

## v0.19.0

### Additions

- Add a `mask_obs` argument to `tl.clonotype_network` that allows computing the clonotype networks on a subset of the cells ([#557](https://github.com/scverse/scirpy/pull/557)).
- Add `datasets.stephenson2021_5k`, an example dataset for the upcoming BCR tutorial ([#565](https://github.com/scverse/scirpy/pull/565)).

### Fixes

- Add all optional dependencies required for testing to the `[test]` dependency group ([#562](https://github.com/scverse/scirpy/pull/562)).
- Unpin AnnData version ([#551](https://github.com/scverse/scirpy/pull/551))

## v0.18.0

### Additions

- Isotypically included B cells are now labelled as `receptor_subtype="IGH+IGK/L"` instead of `ambiguous` in `tl.chain_qc` ([#537](https://github.com/scverse/scirpy/pull/537)).
- Added the `normalized_hamming` metric to `pp.ir_dist` that accounts for differences in CDR3 sequence length ([#512](https://github.com/scverse/scirpy/pull/512)).
- `tl.define_clonotype_clusters` now has an option to require J genes to match (`same_j_gene=True`) in addition to `same_v_gene`. ([#470](https://github.com/scverse/scirpy/pull/470)).

### Performance improvements

- The hamming distance has been reimplemented with numba, achieving a significant speedup ([#512](https://github.com/scverse/scirpy/pull/512)).
- Clonotype clustering has been accelerated leveraging sparse matrix operations ([#470](https://github.com/scverse/scirpy/pull/470)).

### Fixes

- Fix that `pl.clonotype_network` couldn't use non-standard obsm key ([#545](https://github.com/scverse/scirpy/pull/545)).

### Other changes

- Isotypically included B cells are now labelled as `receptor_subtype="IGH+IGK/L"` instead of `ambiguous` in `tl.chain_qc`. ([#537](https://github.com/scverse/scirpy/pull/537))
- Make `parasail` an optional dependency since it is hard to install it on ARM CPUs. `TCRdist` is now the
recommended default distance metric which is much faster than parasail-based pairwise sequence alignments while
providing very similar results ([#547](https://github.com/scverse/scirpy/pull/547)).
- Drop support for Python 3.9 in accordance with [SPEC0](https://scientific-python.org/specs/spec-0000/) ([#546](https://github.com/scverse/scirpy/pull/546))
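
The `normalized_hamming` entry in the v0.18.0 notes describes a Hamming distance that accounts for CDR3 sequence length. A minimal sketch of that idea follows. It assumes that normalization means dividing the mismatch count by the sequence length and reporting a percentage, and that sequences of different length are not comparable. The function name and its return convention are illustrative and are not taken from scirpy's implementation.

```python
from typing import Optional


def normalized_hamming_sketch(seq_a: str, seq_b: str) -> Optional[float]:
    """Percentage of mismatching positions between two equal-length sequences.

    Assumption: pairs of different (or zero) length are treated as not
    comparable and yield None. This is an illustration, not scirpy's code.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        return None
    # Count positions at which the two sequences differ.
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100.0 * mismatches / len(seq_a)


# One mismatch weighs more in a short sequence than in a long one.
short = normalized_hamming_sketch("CASSL", "CASSF")
long_ = normalized_hamming_sketch("CASSLGQGAYEQYFGPGTRL", "CASSLGQGAYEQYFGPGTRF")
print(short, long_)
```

With a plain Hamming distance both pairs above would score 1. Dividing by the length makes a single mismatch in a 5-residue sequence count four times as much as one in a 20-residue sequence.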

## v0.17.2

Expand All @@ -38,7 +74,7 @@ and this project adheres to [Semantic Versioning][].

### Fixes

- Fix issue with detecting the number of available CPUs on MacOD ([#518](https://github.com/scverse/scirpy/pull/502))
- Fix issue with detecting the number of available CPUs on MacOS ([#518](https://github.com/scverse/scirpy/pull/502))

## v0.16.1

Expand Down
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ Please refer to the [documentation][link-docs]. In particular, the

## Installation

You need to have Python 3.9 or newer installed on your system. If you don't have
You need to have Python 3.10 or newer installed on your system. If you don't have
Python installed, we recommend installing [Mambaforge](https://github.com/conda-forge/miniforge#mambaforge).

There are several alternative options to install scirpy:
Expand Down
1 change: 1 addition & 0 deletions docs/api.rst
Original file line number Diff line number Diff line change
Expand Up @@ -248,6 +248,7 @@ Example datasets
datasets.wu2020
datasets.wu2020_3k
datasets.maynard2020
datasets.stephenson2021_5k

Reference databases
^^^^^^^^^^^^^^^^^^^
Expand Down
34 changes: 30 additions & 4 deletions docs/glossary.rst
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,16 @@ Glossary
:term:`CDR3<CDR>` nucleotide sequences, but might recognize the same antigen
because they have the same or similar CDR3 amino acid sequence.

This is especially relevant for BCR, because clonally related cells are likely to differ due to
:term:`somatic hypermutation <SHM>`. It is important to understand that there is currently no best practice or
go-to approach for defining clonotype clusters for BCR, as it remains an active research
field (:cite:`Yaari.2015`). Many different approaches exist, such as maximum-likelihood (:cite:`Ralph.2016`),
hierarchical clustering (:cite:`Gupta.2017`), spectral clustering (:cite:`Nouri.2018`), natural language
processing (:cite:`Lindenbaum.2021`) and network-based approaches (:cite:`BashfordRogers.2013`). A recent
comparison study indicates that computationally more sophisticated clonal inference approaches do not
outperform simpler, computationally cheaper ones (:cite:`Balashova.2024`). That said, there is still a
need for more in-depth comparison studies to confirm these results.

See also: :func:`scirpy.tl.define_clonotype_clusters`.

Private clonotype
Expand Down Expand Up @@ -190,7 +200,7 @@ Glossary
Immune receptor.

BCR
B-cell receptor. A BCR consiste of two Immunoglobulin (IG) heavy chains and
B-cell receptor. A BCR consists of two Immunoglobulin (IG) heavy chains and
two IG light chains. The two light chains contain a variable region, which is
responsible for antigen recognition.

Expand All @@ -201,12 +211,24 @@ Glossary
under the `CC BY-4.0 <https://creativecommons.org/licenses/by/4.0/deed.en>`__ license,
obtained from `wikimedia commons <https://commons.wikimedia.org/w/index.php?curid=49935883>`__

SHM
Common abbreviation for "Somatic hypermutation". This process is unique to BCR and occurs as part
of affinity maturation upon antigen encounter. This process further increases the diversity of the
variable domain of the BCR and selects for cells with higher affinity. SHM introduces around one point mutation per 1000
base pairs (:cite:`Kleinstein.2003`) and can, although rarely, introduce deletions and/or insertions (:cite:`Wilson.1998`).
Furthermore, SHM is not a purely stochastic process, but is biased in multiple ways, e.g. by intrinsic hot-spot motifs (reviewed in :cite:`Schramm.2018`).

Dual IR
:term:`IRs<IR>` with more than one pair of :term:`VJ<V(D)J>` and
:term:`VDJ<V(D)J>` sequences. While this was
previously thought to be impossible due to the mechanism of allelic exclusion
(:cite:`Brady2010-gh`), there is an increasing amound of evidence for a *bona fide*
dual-IR population (:cite:`Schuldt2019`, :cite:`Ji2010-bn`, :cite:`Vettermann2010`).
(:cite:`Brady2010-gh`), there is an increasing amount of evidence for a *bona fide*
dual-IR population (:cite:`Schuldt2019`, :cite:`Shi.2019`, :cite:`RobertaPelanda.2014`,
:cite:`Ji2010-bn`, :cite:`Vettermann2010`).

Recent evidence suggests that B cells with three or more productively rearranged
H and/or L chains also exist (:cite:`Zhu.2023`), which indicates how much of B cell development
is still unclear.

For more information on how *Scirpy* handles dual IRs, see the
page about our :ref:`IR model<receptor-model>`.
Expand Down Expand Up @@ -239,8 +261,12 @@ Glossary
Alellically included B-cells
A B cell with two pairs of :term:`IG` chains. See :term:`Dual IR`.

Isotypically included B-cells
Similar to :term:`Alellically included B-cells`, but the cell expresses both IGL and
IGK, so the rearrangements are not on alleles of the same gene (= isotypic inclusion).

Clonotype modularity
The clonotype modularity measures how densly connected the transcriptomics
The clonotype modularity measures how densely connected the transcriptomics
neighborhood graph underlying the cells in a clonotype is. Clonotypes with
a high modularity consist of cells that are transcriptionally more similar
than that of a clonotype with a low modularity.
Expand Down