-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial tree perf #5
Initial tree perf #5
Conversation
I have a start over here. Minor hiccup is needing a non-recursive postorder method for node iteration. Once this matures a bit, I'll get it over here. |
It would be nice to get all the code in here if we could - I was imagining a simple Makefile with a few .cc files what we could run directly in here? Is it possible to do something similar with Rust? |
Great stuff - I had a play around but couldn't get the numba code to go any faster, seems numba is already doing some of the tricks I tried and |
I think this is a good start. I think we may have a hard time matching this setup exactly in other languages, but we can see how it goes, I guess. |
Sounds good @molpopgen - what I'll do is write a C version of what the other languages should do, which should provide a reasonable template, dealing with the annoying stuff like decoding the genotypes etc. |
Sounds good. I'll try to concoct a C++ version of the Usher-like trees that isn't too hackish. |
One thing to add is that I think numba would perform closer if it had a non-recursive implementation as in tskit. |
Perhaps - we could have "numba-recursive" and "numba-loops"? Not too hard, if a bit fiddly to do. I'd be surprised if it makes much difference though, my earlier experience was that recursion was surprisingly good in numba. Go nuts if you want to try it out! |
Making a start on #2.
Puts in some infrastructure comparing the tskit version of hartigan parsimony with the numba one. The idea would be to add (maybe) a naive C++ version using pointers, a more sophisticated C++ version, and a Pythran version. There's probably a better way to display the data, but here's the initial version of the plot anyway:
We could probably do a bit more work on the numba version, I haven't made any real effort to profile it.
@molpopgen, how does this look? I guess what we'd want for the C++ versions is to take an executable that takes a tree sequence file as input (and a max_sites), and prints the mean and variance of the time taken to run the parsimony algorithm for the first max_sites to stdout. We probably don't need to bother with the variance tbh, but I thought I'd stick it in in case we ever needed it. We can then wrap them nicely in this script.