diff --git a/docs/tutorial.rst b/docs/tutorial.rst
index cf6dc0551..4db660626 100644
--- a/docs/tutorial.rst
+++ b/docs/tutorial.rst
@@ -273,10 +273,14 @@ depending on presence of a single allele of a missense or truncating variant
 >>> from gpsea.analysis.clf import monoallelic_classifier
 >>> is_missense = variant_effect(VariantEffect.MISSENSE_VARIANT, tx_id)
 >>> truncating_effects = (
+... VariantEffect.TRANSCRIPT_ABLATION,
+... VariantEffect.TRANSCRIPT_TRANSLOCATION,
 ... VariantEffect.FRAMESHIFT_VARIANT,
+... VariantEffect.START_LOST,
 ... VariantEffect.STOP_GAINED,
 ... VariantEffect.SPLICE_DONOR_VARIANT,
 ... VariantEffect.SPLICE_ACCEPTOR_VARIANT,
+... # more effects could be listed here ...
 ... )
 >>> is_truncating = anyof(variant_effect(e, tx_id) for e in truncating_effects)
 >>> gt_clf = monoallelic_classifier(
diff --git a/docs/user-guide/analyses/partitioning/genotype/variant_predicates.rst b/docs/user-guide/analyses/partitioning/genotype/variant_predicates.rst
index 588d3da27..4fbca7dfe 100644
--- a/docs/user-guide/analyses/partitioning/genotype/variant_predicates.rst
+++ b/docs/user-guide/analyses/partitioning/genotype/variant_predicates.rst
@@ -26,13 +26,11 @@ The predicates operate on several lines of information:
 +------------------------+-------------------------------------------------------------------------------------------------+
 | Protein data           | variant is located in a region encoding a protein domain, protein feature type                  |
 +------------------------+-------------------------------------------------------------------------------------------------+
-| Genome                 | overlap with a genomic region of interest                                                       |
-+------------------------+-------------------------------------------------------------------------------------------------+
 
 
 The scope of the builtin predicates is fairly narrow
 and likely insufficient for real-life analyses.
-However, the predicates can be chained into a compound predicate
+However, several predicates can be "chained" into a compound predicate
 using a boolean logic, to achive more expressivity for testing complex conditions,
 such as "variant is a missense or synonymous variant located in exon 6 of `NM_013275.6`".
 
@@ -41,8 +39,9 @@ such as "variant is a missense or synonymous variant located in exon 6 of `NM_01
 Examples
 ********
 
-Here we show examples of several simple variant predicates and
-how to chain them for testing complex conditions.
+Here we show how to use the builtin predicates for simple tests
+and how to build a compound predicate from the builtin predicates,
+for testing complex conditions.
 
 
 Load cohort
@@ -112,10 +111,10 @@ See the :mod:`gpsea.analysis.predicate` module
 for a complete list of the builtin predicates.
 
 
-Predicate chain
-===============
+Compound predicates
+===================
 
-Using the builtin predicates, we can build a logical chain to test complex conditions.
+A compound predicate for testing complex conditions can be built from two or more predicates.
 For instance, we can test if the variant meets any of several conditions:
 
 >>> import gpsea.analysis.predicate as vp
@@ -130,7 +129,13 @@ or *all* conditions:
 >>> missense_and_exon20.test(variant)
 True
 
-All variant predicates overload Python ``&`` (AND) and ``|`` (OR) operators, to allow chaining.
+All variant predicates overload Python ``&`` (AND) and ``|`` (OR) operators,
+to combine a predicate pair into a compound predicate.
+
+.. note::
+
+  Combining three or or more predicates can be achieved with :func:`~gpsea.analysis.allof`
+  and :func:`~gpsea.analysis.anyof` functions.
 
 Therefore, there is nothing that prevents us to combine the predicates into multi-level tests, e.g.
 to test if the variant is a *"chromosomal deletion" or a deletion which removes at least 50 bp*:
diff --git a/docs/user-guide/analyses/phenotype-scores.rst b/docs/user-guide/analyses/phenotype-scores.rst
index 85347ed96..5390e3ff0 100644
--- a/docs/user-guide/analyses/phenotype-scores.rst
+++ b/docs/user-guide/analyses/phenotype-scores.rst
@@ -121,9 +121,11 @@ In this example, the point mutation is a mutation that meets the following condi
 '((change length == 0 AND reference allele length == 1) AND MISSENSE_VARIANT on NM_001042681.2)'
 
 
-For the loss of function predicate, the following variant effects are considered loss of function:
+For the loss-of-function predicate, the following is a non-exhausting list
+of variant effects considered as a loss-of-function:
 
 >>> lof_effects = (
+... VariantEffect.TRANSCRIPT_TRANSLOCATION,
 ... VariantEffect.TRANSCRIPT_ABLATION,
 ... VariantEffect.FRAMESHIFT_VARIANT,
 ... VariantEffect.START_LOST,
@@ -131,7 +133,7 @@ For the loss of function predicate, the following variant effects are considered
 ... )
 >>> lof_mutation = anyof(variant_effect(eff, tx_id) for eff in lof_effects)
 >>> lof_mutation.description
-'(TRANSCRIPT_ABLATION on NM_001042681.2 OR FRAMESHIFT_VARIANT on NM_001042681.2 OR START_LOST on NM_001042681.2 OR STOP_GAINED on NM_001042681.2)'
+'(TRANSCRIPT_TRANSLOCATION on NM_001042681.2 OR TRANSCRIPT_ABLATION on NM_001042681.2 OR FRAMESHIFT_VARIANT on NM_001042681.2 OR START_LOST on NM_001042681.2 OR STOP_GAINED on NM_001042681.2)'
 
 The genotype predicate will bin the patient into two classes: a point mutation or the loss of function: