Logoplots #534

Open
wants to merge 12 commits into
base: main
1 change: 1 addition & 0 deletions .conda/meta.yaml
@@ -40,6 +40,7 @@ requirements:
- numba >=0.41.0
- pooch >=1.7.0
- joblib >=1.3.1
- logomaker

test:
source_files:
2 changes: 1 addition & 1 deletion docs/api.rst
@@ -204,7 +204,7 @@ when calling the plotting function or need to be precomputed and stored in
pl.clonotype_modularity
pl.clonotype_network
pl.clonotype_imbalance

pl.logoplot_cdr3_motif


Base plotting functions: `pl.base`
80 changes: 79 additions & 1 deletion docs/tutorials/tutorial_5k_bcr.ipynb

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions pyproject.toml
@@ -40,6 +40,7 @@ dependencies = [
'pooch>=1.7.0',
'pycairo>=1.20; sys_platform == "win32"',
'joblib>=1.3.1',
'logomaker'
]

[project.optional-dependencies]
1 change: 1 addition & 0 deletions src/scirpy/pl/__init__.py
@@ -5,6 +5,7 @@
from ._clonotypes import COLORMAP_EDGES, clonotype_network
from ._diversity import alpha_diversity
from ._group_abundance import group_abundance
from ._logoplots import logoplot_cdr3_motif
from ._repertoire_overlap import repertoire_overlap
from ._spectratype import spectratype
from ._vdj_usage import vdj_usage
91 changes: 91 additions & 0 deletions src/scirpy/pl/_logoplots.py
@@ -0,0 +1,91 @@
from collections.abc import Sequence
from typing import Literal

from logomaker import Logo, alignment_to_matrix

from scirpy.get import airr as get_airr
from scirpy.util import DataHandler


@DataHandler.inject_param_docs()
def logoplot_cdr3_motif(
Collaborator

please add the function to the API documentation: https://github.com/scverse/scirpy/blob/main/docs/api.rst

    adata: DataHandler.TYPE,
    chains: Literal["VJ_1", "VDJ_1", "VJ_2", "VDJ_2"] | Sequence[Literal["VJ_1", "VDJ_1", "VJ_2", "VDJ_2"]] = "VDJ_1",
    airr_mod="airr",
    airr_key="airr",
    chain_idx_key="chain_indices",
    cdr3_col: str = "junction_aa",
    to_type: Sequence[Literal["information", "counts", "probability", "weight"]] = "information",
    pseudocount: float = 0,
    background=None,
    center_weights: bool = False,
    plot_default=True,
Collaborator

I would just set defaults and allow the user to override them via kwargs. This could be done via dict.update()

Collaborator Author

So if I understand it right, you want me to exclude them as part of the function call (like they are now) and define them in the function body with possibility to overwrite them as part of kwargs?
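
A minimal sketch of the `dict.update()` suggestion above, not the PR's final implementation. The helper name `resolve_logo_kwargs` and the constant `DEFAULT_LOGO_KWARGS` are hypothetical; the default values are copied from the ones currently hard-coded in the diff. One thing this pattern avoids: with the current code, passing e.g. `font_name=...` through `**kwargs` while `plot_default=True` would raise a `TypeError`, because the same keyword would reach `Logo()` twice.

```python
from typing import Any

# Hypothetical module-level defaults, copied from the values hard-coded in the diff.
DEFAULT_LOGO_KWARGS: dict[str, Any] = {
    "font_name": "Arial Rounded MT Bold",
    "color_scheme": "chemistry",
    "vpad": 0.05,
    "width": 0.9,
}


def resolve_logo_kwargs(**kwargs: Any) -> dict[str, Any]:
    """Return the defaults with user-supplied keyword arguments applied on top."""
    merged = dict(DEFAULT_LOGO_KWARGS)  # copy, so the shared defaults are never mutated
    merged.update(kwargs)  # user-supplied values override the defaults
    return merged


# The caller overrides one default and keeps the rest.
logo_kwargs = resolve_logo_kwargs(width=0.8)
```

Inside the plotting function, the merged dictionary could then be forwarded as `Logo(motif, **logo_kwargs)`, which would make the separate `plot_default` switch unnecessary.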

    **kwargs,
):
    """
    Generates logoplots of CDR3 sequences

    This is a user-friendly wrapper function around the logomaker Python package.
    Enables the analysis of potential amino acid motifs by displaying logo plots.
    Subsetting of AnnData/MuData has to be performed manually beforehand (or while calling) and only cells with equal CDR3 sequence lengths are permitted.

    Parameters
    ----------
    {adata}
    chains
        One or up to two chains from which to use CDR3 sequences i.e. primary and/or secondary VJ/VDJ chains. Mixing VJ and VDJ chains will likely not lead to a meaningful result.
    {airr_mod}
    {airr_key}
    {chain_idx_key}
Collaborator

We typically allow the user to pass an ax object into which the plot is drawn. This allows the user to easily compose multi-panel plots, e.g.

fig, ax = plt.subplots()
ir.pl.something(..., ax=ax)

Do you think this is possible with logomaker?

Collaborator Author

I see.
I will have a look at your other plotting implementations, but as far as I can tell (from the logomaker documentation and its working examples) this should be possible. So the idea is that the user can access the ax object after calling the function, right? So it has to be returned at some point by the function?
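
A minimal sketch of the `ax` convention discussed above, using plain matplotlib so it runs without logomaker. The function `draw_bars` is a hypothetical stand-in for a plotting function, not part of scirpy. logomaker's documentation lists an `ax` argument for `Logo`, so the same keyword could presumably be forwarded there; that is an assumption to verify against the installed version.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def draw_bars(values, *, ax=None):
    """Hypothetical stand-in for a plotting function that follows the `ax` convention."""
    if ax is None:
        # No axes supplied: create a fresh figure so the function also works stand-alone.
        _, ax = plt.subplots()
    ax.bar(range(len(values)), values)
    return ax  # returning the axes lets the caller keep customizing the plot


# Compose a two-panel figure by passing each panel's axes explicitly.
fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3))
draw_bars([3, 1, 2], ax=left)
draw_bars([1, 4], ax=right)
```

The axes object goes in as an argument and comes back as the return value, so the caller does not need to retrieve it from the function afterwards; returning it is mainly a convenience for the stand-alone case.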

    cdr3_col
        Key inside awkward array to retrieve junction information (should be in aa)
    to_type
        Choose one of matrix types as defined by logomaker:
            * `"information"`
            * `"counts"`
            * `"probability"`
            * `"weight"`
    pseudocount
        Pseudocount to use when converting from counts to probabilities
    background
        Background probabilities. Both arrays with the same length as output or df with same shape as output are permitted.
    center_weights
        Whether to subtract the mean of each row, but only if to_type == `weight`
    plot_default
        If true does some basic formatting for the user i.e:
            * `"font_name"` = 'Arial Rounded MT Bold'
            * `"color_scheme"` = 'chemistry'
            * `"vpad"`= .05
            * `"width"` = .9
        And some additional styling. If false, the user needs to adapt `**kwargs` accordingly.
    **kwargs
        Additional arguments passed to logomaker.Logo() for comprehensive customization. For a full list of parameters please refer to logomaker documentation (https://logomaker.readthedocs.io/en/latest/implementation.html#logo-class)

    Returns
    -------
    Returns an object of class logomaker.Logo (see here for more information https://logomaker.readthedocs.io/en/latest/implementation.html#matrix-functions)
    """
    params = DataHandler(adata, airr_mod, airr_key, chain_idx_key)

Check warning on line 68 in src/scirpy/pl/_logoplots.py (Codecov / codecov/patch): added line #L68 was not covered by tests.

    # make sure that sequences are prealigned i.e. they need to have the same length
    airr_df = get_airr(params, [cdr3_col], chains)
    sequence_list = []
    for chain in chains:
        for sequence in airr_df[chain + "_" + cdr3_col]:
            if sequence is not None:
                sequence_list.append(sequence)

Check warning on line 75 in src/scirpy/pl/_logoplots.py (Codecov / codecov/patch): added lines #L70 - L75 were not covered by tests.
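
One thing worth checking in the loop above: `chains` defaults to the string `"VDJ_1"`, and iterating over a string yields single characters, so `for chain in chains` would look up columns such as `"V_junction_aa"`. The comment also asks for equal-length sequences, but nothing enforces it. Below is a minimal sketch of one way to handle both, assuming a plain mapping of column name to values as a stand-in for the data frame returned by `get_airr`; the helper name `collect_sequences` is hypothetical.

```python
from __future__ import annotations

from collections.abc import Mapping, Sequence


def collect_sequences(
    columns: Mapping[str, Sequence[str | None]],
    chains: str | Sequence[str],
    cdr3_col: str = "junction_aa",
) -> list[str]:
    """Gather non-missing CDR3 sequences for the requested chains and check their lengths."""
    if isinstance(chains, str):
        # A bare string would otherwise be iterated character by character.
        chains = [chains]
    sequences = [
        seq
        for chain in chains
        for seq in columns[f"{chain}_{cdr3_col}"]
        if seq is not None
    ]
    lengths = {len(seq) for seq in sequences}
    if len(lengths) > 1:
        raise ValueError(f"Sequences must all have the same length, found lengths {sorted(lengths)}")
    return sequences


# Stand-in data: one missing value, all remaining sequences of length 5.
example = {"VDJ_1_junction_aa": ["CASSL", None, "CASSF"]}
print(collect_sequences(example, "VDJ_1"))
```

Raising early with the observed lengths gives the user a clearer message than whatever error the downstream matrix construction would produce.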

    motif = alignment_to_matrix(
        sequence_list, to_type=to_type, pseudocount=pseudocount, background=background, center_weights=center_weights
    )

Check warning on line 77 in src/scirpy/pl/_logoplots.py (Codecov / codecov/patch): added line #L77 was not covered by tests.

    if plot_default:
        cdr3_logo = Logo(
            motif, font_name="Arial Rounded MT Bold", color_scheme="chemistry", vpad=0.05, width=0.9, **kwargs
        )

Check warning on line 81 in src/scirpy/pl/_logoplots.py (Codecov / codecov/patch): added lines #L80 - L81 were not covered by tests.

        cdr3_logo.style_xticks(anchor=0, spacing=1, rotation=45)
        cdr3_logo.ax.set_ylabel(f"{to_type}")
        cdr3_logo.ax.set_xlim([-1, len(motif)])
        return cdr3_logo

Check warning on line 88 in src/scirpy/pl/_logoplots.py (Codecov / codecov/patch): added lines #L85 - L88 were not covered by tests.

    else:
        cdr3_logo = Logo(motif, **kwargs)
        return cdr3_logo

Check warning on line 91 in src/scirpy/pl/_logoplots.py (Codecov / codecov/patch): added lines #L90 - L91 were not covered by tests.
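
To illustrate what the `to_type` and `pseudocount` options in the docstring control, here is a rough pure-Python sketch of a per-position counts matrix and its conversion to probabilities. It is not logomaker's implementation: the helper names `count_matrix` and `probability_matrix` are hypothetical, and the pseudocount formula (add the pseudocount to every count, then normalize each position) is an assumption about how such a conversion is commonly done.

```python
from collections import Counter


def count_matrix(sequences):
    """Count residues at each position of equal-length sequences (one Counter per position)."""
    lengths = {len(seq) for seq in sequences}
    if len(lengths) != 1:
        raise ValueError("sequences must be non-empty and share a single length")
    (length,) = lengths
    return [Counter(seq[pos] for seq in sequences) for pos in range(length)]


def probability_matrix(counts, alphabet, pseudocount=0.0):
    """Convert per-position counts to probabilities over `alphabet`, with an optional pseudocount."""
    rows = []
    for counter in counts:
        total = sum(counter[char] + pseudocount for char in alphabet)
        if total == 0:
            raise ValueError("a position has no counts for any character in the alphabet")
        rows.append({char: (counter[char] + pseudocount) / total for char in alphabet})
    return rows


# Three made-up, equal-length sequences used purely for illustration.
sequences = ["CASS", "CAST", "CATS"]
counts = count_matrix(sequences)
probabilities = probability_matrix(counts, alphabet="ACST", pseudocount=1)
```

With a pseudocount of 0, a residue never seen at a position gets probability 0; a positive pseudocount keeps every probability strictly positive, which matters for matrix types that take logarithms.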