Adds a py prototype for two-locus branch stats
Currently, this algorithm creates a matrix of LD, performing a pairwise
comparison of all trees in the tree sequence.

This implementation does not yet support windows/positions, sample sets
or polarisation. The output is in units of branch length, so it needs to
be multiplied by mu^2 or divided by the product of the total branch
lengths of the two trees.
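
The two rescalings mentioned above could be applied roughly as in the
sketch below. The function names and the nested-list matrix layout are
hypothetical, chosen for illustration only; they are not taken from the
prototype in this commit.

```python
def scale_by_mutation_rate(stat, mu):
    """Multiply every entry of a branch-unit matrix by mu^2."""
    return [[value * mu**2 for value in row] for row in stat]


def scale_by_total_branch_length(stat, total_branch_length):
    """Divide entry (i, j) by the product of the total branch lengths
    of tree i and tree j."""
    return [
        [
            value / (total_branch_length[i] * total_branch_length[j])
            for j, value in enumerate(row)
        ]
        for i, row in enumerate(stat)
    ]
```

Here `stat[i][j]` is assumed to hold the raw value for the pair of trees
(i, j), and `total_branch_length[i]` the total branch length of tree i.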

This algorithm works by keeping a running sum of the statistic between
two trees, updating each time we encounter a branch addition or removal.
The tricky part is that we have to remove or add LD contributed by
samples that already existed or that will remain under a given node
after the addition/removal of branches.
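
The bookkeeping involved can be illustrated with a much simpler quantity
than LD: the number of samples under each node. The sketch below is not
the algorithm in this commit. It only shows, under an assumed
parent-array representation (`-1` meaning "no parent"), how a per-node
quantity is kept up to date as branches are added and removed. The names
`insert_edge` and `remove_edge` are hypothetical.

```python
def insert_edge(parent, num_samples, child, new_parent):
    """Attach child below new_parent and add the samples under child
    to new_parent and all of its ancestors."""
    parent[child] = new_parent
    u = new_parent
    while u != -1:
        num_samples[u] += num_samples[child]
        u = parent[u]


def remove_edge(parent, num_samples, child):
    """Detach child from its parent and subtract the samples under child
    from the former parent and all of its ancestors."""
    u = parent[child]
    parent[child] = -1
    while u != -1:
        num_samples[u] -= num_samples[child]
        u = parent[u]


# Nodes 0, 1, 2 are samples; nodes 3 and 4 are internal.
parent = [-1] * 5
num_samples = [1, 1, 1, 0, 0]
for child, new_parent in [(0, 3), (1, 3), (3, 4), (2, 4)]:
    insert_edge(parent, num_samples, child, new_parent)

# Move sample 1 from node 3 to node 4, as a tree transition might.
remove_edge(parent, num_samples, 1)
insert_edge(parent, num_samples, 1, 4)
```

Only the ancestors of the moved branch are touched, so samples that stay
under a node across the transition keep contributing to its count. A
two-locus statistic needs the analogous care for every pair of nodes
affected by the change.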

We include a validation against the original formulation of this
problem, in the form of an implementation of the method described in
McVean 2002. The original formulation computes the covariance of the
tMRCAs of 2, 3, and 4 samples on the trees in question. This
implementation has two limitations: 1) it is very slow and 2) it does
not work for decapitated trees, as certain samples do not have MRCAs.
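
The second limitation can be seen in a small sketch. It assumes a tree
stored as a parent array (`-1` for a root) plus an array of node times,
and the helper `tmrca` is hypothetical rather than part of the
validation code in this commit.

```python
def tmrca(parent, time, a, b):
    """Return the time of the most recent common ancestor of a and b,
    or None if the two nodes share no ancestor."""
    ancestors = set()
    u = a
    while u != -1:
        ancestors.add(u)
        u = parent[u]
    u = b
    while u != -1:
        if u in ancestors:
            return time[u]
        u = parent[u]
    return None


time = [0.0, 0.0, 0.0, 1.0, 2.0]

# Complete tree: samples 0 and 1 join at node 3, then node 3 and
# sample 2 join at the root, node 4.
full_tree = [3, 3, 4, 4, -1]

# Decapitated tree: the branches into node 4 have been cut, leaving
# node 3 and sample 2 as separate roots.
decapitated_tree = [3, 3, -1, -1, -1]
```

In the complete tree every pair of samples has a tMRCA, while in the
decapitated tree `tmrca(decapitated_tree, time, 0, 2)` returns `None`,
so a covariance over pairwise tMRCAs is undefined for that pair.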
lkirk committed Apr 23, 2024
1 parent e6483fc commit 1761e05
Showing 1 changed file with 509 additions and 20 deletions.