Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MRCA calculation based on Schieber-Vishkin algorithm #17

Open
jeromekelleher opened this issue Jul 5, 2022 · 1 comment
Open

Add MRCA calculation based on Schieber-Vishkin algorithm #17

jeromekelleher opened this issue Jul 5, 2022 · 1 comment

Comments

@jeromekelleher
Copy link
Owner

The Schieber-Vishkin algorithm allows us to compute the MRCA of two nodes u and v in constant time, given a O(n) preprocessing step and O(n) extra storage space. If someone is computing a lot of MRCAs, then it can be quite a bit speedup over just computing each one directly.

It would be nice to have an implementation of the mrca function that does something like this:

def mrca(ds, u, v):
    if "sv_tau" in ds:
          return _sv_mrca(ds, u, v)
    else:
          return _mrca(ds, u, v)

There's an implementation of the SV algorithm the tskit test suite here. This is a direct implementation of Knuth's version, which uses 1-based arrays instead of 0-based. I never got around to properly updating it to use 0-based, and just did a quick hack to get it converted. It would be nice to do this properly.

@benjeffery
Copy link
Collaborator

This is nice - and a great example of the extensibility of the dataset paradigm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants