Breaking Changes
- The definitions of TreeSequence.genetic_relatedness and
  TreeSequence.genetic_relatedness_weighted have changed to average over
  sample sets, rather than summing over them. For computation with diploid
  sample sets, this will change the result by a factor of four; for larger
  sample sets it will now produce sensible values that are comparable between
  sample sets of different sizes. The default for these methods is also
  changed to polarised=True, but the output is unchanged for centre=True
  (the default). See the documentation for these methods for more discussion.
  (@petrelharp, @mmosmond, #1623)
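The factor of four for diploid sample sets follows from simple counting: two sets of two samples give 2 x 2 = 4 cross-set pairs, so a sum over pairs is four times the average over pairs. The sketch below illustrates only that arithmetic. It does not call tskit, and it assumes the statistic decomposes into a sum of per-pair values; the names PAIR_VALUES, summed_over_sets and averaged_over_sets, and the numbers themselves, are hypothetical.

```python
from itertools import product

# Hypothetical per-pair values between the samples of two diploid individuals.
# Keys are (sample in set A, sample in set B); the numbers are made up.
PAIR_VALUES = {(0, 2): 0.5, (0, 3): 0.25, (1, 2): 1.0, (1, 3): 0.75}


def summed_over_sets(set_a, set_b, pair_values):
    """Summing convention: add the per-pair value over all cross-set pairs."""
    return sum(pair_values[(i, j)] for i, j in product(set_a, set_b))


def averaged_over_sets(set_a, set_b, pair_values):
    """Averaging convention: the same sum divided by the number of pairs."""
    n_pairs = len(set_a) * len(set_b)
    return summed_over_sets(set_a, set_b, pair_values) / n_pairs


diploid_a = [0, 1]  # two sample nodes of individual A
diploid_b = [2, 3]  # two sample nodes of individual B

total = summed_over_sets(diploid_a, diploid_b, PAIR_VALUES)
mean = averaged_over_sets(diploid_a, diploid_b, PAIR_VALUES)

# With two samples per set there are four pairs, hence the factor of four.
print(total, mean, total / mean)
```

For sample sets of other sizes the ratio is the product of the two set sizes, which is why summed values were not comparable between sets of different sizes.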
Bugfixes
- Fix to TreeSequence.genetic_relatedness with indexes=None and
  proportion=True. (@petrelharp, #2984, #1623)
- Fix to TreeSequence.general_stat when using non-strict summary functions
  in the presence of non-ancestral material (very rare).
  (@petrelharp, #2983, #1623)
- Printing tskit.MetadataSchema(schema=None) now shows "Null_schema" rather
  than None, to avoid confusion. (@hyanwong, #2720)
- Limit output HTML when a tree sequence is displayed that has a large amount
  of metadata. (@benjeffery, #2999)
- Fix warning in draw_svg to use correct warnings module.
  (@duncanMR, #2870, #2871)
Features
- Add the centre option to TreeSequence.genetic_relatedness and
  TreeSequence.genetic_relatedness_weighted.
  (@petrelharp, @mmosmond, #1623)
- Edges now have an .interval attribute returning a tskit.Interval object.
  (@hyanwong, #2531)
- Variants now have a states() method that returns the genotypes as an
  (inefficient) array of strings, rather than integer indexes, to
  aid comparison of genetic variation. (@hyanwong, #2617)
- Added distance_between that calculates the total distance between two nodes
  in a tree. (@Billyzhang1229, #2771)
- Added genetic_relatedness_matrix method to compute pairwise genetic
  relatedness between sample sets. (@jeromekelleher, @petrelharp, #2823)
- Add TreeSequence.extend_haplotypes method that extends ancestral haplotypes
  using recombination information, leading to unary nodes in many trees and
  fewer edges. (@petrelharp, @hfr1tz3, @nspope, @avabamf, #2651, #2938)
- Add Table.drop_metadata to make clearing metadata from tables easy.
  (@jeromekelleher, #2944)
- Add Interval.mid and Tree.mid properties to return the midpoint of the
  interval. (@currocam, #2960)
- Added genetic_relatedness_vector method to compute product of genetic
  relatedness matrix and weight vector. (@petrelharp, #2980)
- Added pair_coalescence_counts method to calculate coalescence events per
  node or time interval, pair_coalescence_quantiles method to estimate
  quantiles of pair coalescence times using empirical CDF inversion, and
  pair_coalescence_rates method to estimate instantaneous rates of pair
  coalescence within time intervals from the empirical CDF.
  (@nspope, #2915, #2976, #2985)
- Add provenance information to the HTML notebook representation of a tree
  sequence. (@benjeffery, #3001)
- The .draw_svg() methods can add annotated genomic regions (e.g. genes) to
  the x-axis. (@hyanwong, #3002)
- Added a node_titles and a mutation_titles parameter to .draw_svg() methods
  which assigns a string to node and mutation symbols, commonly shown on
  mouseover. This can reduce label clutter while retaining useful info.
  (@hyanwong, #3007)
- Added (currently undocumented) use of the order parameter in
  Tree.draw_svg() to pass a subset of nodes, so subtrees can be visually
  collapsed. Additionally, an option pack_untracked_polytomies allows large
  polytomies involving untracked samples to be summarised as a dotted line.
  (@hyanwong, #3011, #3010, #3012)
- Added a title parameter to .draw_svg() methods. (@hyanwong, #3015)
- Add comma separation to all display numbers. (@benjeffery, #3017, #3018)
- Add resources section to provenance schema. (@benjeffery, #3016)
- Add Tree.rf_distance method to calculate the unweighted Robinson-Foulds
  distance between two trees. (@Billyzhang1229, #995, #2643, #3032)
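Two of the tree-comparison features listed above, distance_between and Tree.rf_distance, compute quantities that are easy to illustrate on a toy example. The following is a minimal pure-Python sketch, not tskit's implementation and not its API: trees are represented as hypothetical child-to-parent dictionaries, the helper names are made up for this example, and the Robinson-Foulds distance is taken to be the number of clades (sets of samples below a node) found in exactly one of the two rooted trees.

```python
def path_to_root(parent, u):
    """Return the list of nodes from u up to the root, inclusive."""
    path = [u]
    while u in parent:
        u = parent[u]
        path.append(u)
    return path


def mrca(parent, u, v):
    """Most recent common ancestor of u and v in a child -> parent map."""
    ancestors_of_u = set(path_to_root(parent, u))
    for node in path_to_root(parent, v):
        if node in ancestors_of_u:
            return node
    raise ValueError("nodes do not share an ancestor")


def distance_between(parent, time, u, v):
    """Sum of branch lengths on the path from u to v through their MRCA."""
    a = mrca(parent, u, v)
    return (time[a] - time[u]) + (time[a] - time[v])


def clades(parent, samples):
    """Set of clades, each clade being the frozenset of samples below a node."""
    below = {}
    for s in samples:
        for node in path_to_root(parent, s):
            below.setdefault(node, set()).add(s)
    return {frozenset(c) for c in below.values()}


def rf_distance(parent_a, parent_b, samples):
    """Unweighted Robinson-Foulds distance: clades present in only one tree."""
    return len(clades(parent_a, samples) ^ clades(parent_b, samples))


samples = [0, 1, 2, 3]
# Tree A groups (0, 1) and (2, 3); tree B groups (0, 2) and (1, 3).
tree_a = {0: 4, 1: 4, 2: 5, 3: 5, 4: 6, 5: 6}
tree_b = {0: 4, 2: 4, 1: 5, 3: 5, 4: 6, 5: 6}
# Node times: samples at time 0, internal nodes older.
time = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 1.0, 5: 2.0, 6: 3.0}

print(distance_between(tree_a, time, 0, 1))  # siblings under node 4
print(distance_between(tree_a, time, 0, 3))  # path passes through the root
print(rf_distance(tree_a, tree_b, samples))
```

In this toy case the two trees share the four single-sample clades and the root clade, and differ in their two cherries each, so four clades are unshared. See the tskit documentation for the exact conventions used by the library methods.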