Add parsimony_score function #27

jeromekelleher · 2022-08-02T10:34:16Z

We would like a function to compute the parsimony score for each site in a dataset.

The implementation can be based on the work in the tskit-paper repository, where there are a number of different versions implemented using numba.

We would return a new dataset which includes the variable parsimony_score, which has a value for each site in the variant data.

Initially we can use a single site version, but I think we would want to use the vectorised version for better efficiency:

https://github.com/tskit-dev/tskit-paper/blob/main/tree_performance/benchmark.py#L147
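For reference, a single-site version could look roughly like the sketch below. This is not the code from the linked benchmark; it is a plain-Python Fitch-style pass under some assumptions that are mine: the tree is binary, it is given as a hypothetical `parent` list with `-1` for the root, samples are nodes `0..n-1`, and every child has a smaller id than its parent. The function name is illustrative only.

```python
def fitch_parsimony_score(parent, alleles):
    """Fitch parsimony score for one site on a binary tree.

    parent[u] is the parent of node u (-1 for the root). Assumed layout:
    samples are nodes 0..len(alleles)-1 and each child id is smaller than
    its parent's id, so visiting nodes in increasing id order is a valid
    bottom-up traversal.
    """
    num_nodes = len(parent)
    # state[u] is the set of candidate alleles for u; None means unvisited.
    state = [None] * num_nodes
    for u, allele in enumerate(alleles):
        state[u] = {allele}
    score = 0
    for u in range(num_nodes):
        p = parent[u]
        if p == -1:
            continue  # the root has no parent to update
        if state[p] is None:
            # First child seen: the parent starts with a copy of its set.
            state[p] = set(state[u])
        else:
            common = state[p] & state[u]
            if common:
                state[p] = common
            else:
                # Disjoint child sets imply one state change below p.
                state[p] = state[p] | state[u]
                score += 1
    return score


# Tree ((0,1)4,(2,3)5)6 with four samples.
example_parent = [4, 4, 5, 5, 6, 6, -1]
example_score = fitch_parsimony_score(example_parent, [0, 1, 0, 1])
```

Calling this once per site gives the per-site values that would go into the parsimony_score variable.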

Ultimately, we'd like to use Dask to run this vectorised version on chunks of sites in parallel (although I'm not entirely clear how this would be structured).
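One possible shape for the chunked structure, as a rough sketch only: since sites are independent given a fixed tree, a function that scores a block of sites can be applied to each chunk and the results concatenated. The example below uses NumPy and a plain loop over chunks; Dask is not used here, and the per-chunk function is just the kind of thing one might hand to a blockwise Dask operation. The assumptions are mine: a single binary tree given as a hypothetical `parent` array (`-1` for the root, samples are nodes `0..n-1`, child ids smaller than parent ids), fewer than 8 alleles so allele sets fit in a `uint8` bitmask, and at least one site. All names are illustrative.

```python
import numpy as np


def fitch_scores_chunk(parent, genotypes):
    """Fitch parsimony scores for a block of sites, vectorised over sites.

    genotypes has shape (num_sites, num_samples) with allele indexes < 8.
    """
    num_sites, num_samples = genotypes.shape
    num_nodes = len(parent)
    # state[u, j] is a bitmask of candidate alleles for node u at site j;
    # 0 means the node has not received any child yet.
    state = np.zeros((num_nodes, num_sites), dtype=np.uint8)
    state[:num_samples] = np.left_shift(1, genotypes.T).astype(np.uint8)
    score = np.zeros(num_sites, dtype=np.int64)
    for u in range(num_nodes):
        p = parent[u]
        if p < 0:
            continue  # root
        first = state[p] == 0
        inter = state[p] & state[u]
        # Sites where the parent already has a set and it shares no allele
        # with this child: take the union and count one state change.
        disjoint = ~first & (inter == 0)
        score += disjoint
        state[p] = np.where(
            first, state[u], np.where(disjoint, state[p] | state[u], inter)
        )
    return score


def parsimony_scores_chunked(parent, genotypes, chunk_size):
    """Score sites chunk by chunk; each chunk is independent of the others."""
    chunks = [
        fitch_scores_chunk(parent, genotypes[start:start + chunk_size])
        for start in range(0, genotypes.shape[0], chunk_size)
    ]
    return np.concatenate(chunks)


# Tree ((0,1)4,(2,3)5)6 with four samples and four sites.
example_parent = np.array([4, 4, 5, 5, 6, 6, -1])
example_genotypes = np.array(
    [[0, 0, 0, 0], [0, 0, 1, 1], [0, 1, 0, 1], [1, 0, 0, 0]]
)
example_scores = parsimony_scores_chunked(example_parent, example_genotypes, 3)
```

Because each chunk only needs the tree and its own slice of the genotype array, the loop in `parsimony_scores_chunked` is the part that could be replaced by parallel execution over chunks.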

jeromekelleher changed the title ~~Add parsimony functions~~ Add parsimony_score function Aug 2, 2022

Billyzhang1229 mentioned this issue Aug 23, 2022

Parallel parsimony score calculation #35

Merged

1 task
