add tree fixation using newick string and TreeSequence object #2276
Codecov Report
All modified and coverable lines are covered by tests ✅

Coverage Diff (main vs. #2276)

| Metric | main | #2276 |
|---|---|---|
| Coverage | 91.30% | 91.30% |
| Files | 20 | 20 |
| Lines | 11873 | 11873 |
| Branches | 2421 | 2421 |
| Hits | 10841 | 10841 |
| Misses | 563 | 563 |
| Partials | 469 | 469 |
This looks very cool @not-a-feature, thanks! I'll need a bit of time to digest, and I think it would probably help to have a call about it. @fbaumdicker - I assume you're involved here somewhere?
Yes, I am. @not-a-feature and I are incorporating a) mutation models representing gene presence-absence evolution, b) fixed clonal backbone phylogenies (this PR), and c) simulating gene transfer instead of conversion.
As discussed loading the common ancestor events from a Either use the It raises an error if it is used in combination with
I was thinking more about providing this information as part of the initial state, rather than as something extrinsic. So, we provide the "backbone" of the simulation pre-done as some nodes and edges, and (importantly) these edges are not subsequently altered. Is this possible, or does that break the model?
This PR will add an option (`--ce-from-nwk`) to provide a tree in Newick format that will be respected during simulation (Hudson model).

The aim is to make sure that the given tree is present and that regular simulation (gene conversion and recombination) is still possible. The main idea is that each segment has an origin set (initially containing only the id of its leaf), which is passed along and merged with the origin set of the other segment at each coalescence event. When a regular event occurs, the algorithm checks whether the selected segment is needed later; if so, the event is ignored.
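The origin-set idea can be illustrated with a minimal sketch. Everything here is hypothetical and not taken from the PR code: the `Lineage` class, `merge`, and `is_needed_later` are stand-ins, and the rule that a lineage counts as "needed later" when its origin set equals one side of a pending fixed event is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lineage:
    """Hypothetical stand-in for a segment chain; it only tracks its origin set."""
    origin: frozenset


def merge(a, b):
    """Coalesce two lineages: the ancestor carries the union of both origin sets."""
    return Lineage(a.origin | b.origin)


def is_needed_later(lineage, fixed_events):
    """Assumed rule: a lineage is needed if its origin set is one side of a pending event."""
    return any(lineage.origin in (left, right) for _, left, right in fixed_events)


# Fixed events as (time, left origin set, right origin set).
fixed_events = [
    (1.0, frozenset({0}), frozenset({1})),
    (2.0, frozenset({0, 1}), frozenset({2})),
]

# Initially, each lineage's origin set holds only its own leaf id.
leaves = [Lineage(frozenset({i})) for i in range(3)]

# After the first fixed event, leaves 0 and 1 are replaced by their ancestor.
ab = merge(leaves[0], leaves[1])
remaining = fixed_events[1:]
```

Under these assumptions, a regular event that selects `ab` or `leaves[2]` would be skipped while the second fixed event is still pending, because both origin sets appear in it.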
Changes:
The following structural changes will be made:
General
- […] (`Tuple[float, set, set]`). Each tuple represents the time and two lineages to coalesce.

Segment class

Population class
- […] `get_emerged_from_lineage` that returns the indices of the lineages that have emerged from a given one.

Simulator class
- […] `alloc_segment` and `copy_segment` to use the `origin` attribute.
- […] `hudson_simulate`. […] (`self.coalescent_events[0][0] < self.t`)
- […] `is_blocked_ancestor` method to check if a lineage id is used in a later fixed coalescence event.
- […] `common_ancestor_event`
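To make the `Tuple[float, set, set]` representation concrete, here is a rough sketch of how a binary Newick string could be turned into a time-sorted list of such tuples. The function name `newick_to_events` is hypothetical and this is not the parser from the PR. It assumes a strictly binary tree with leaves at time 0, and it takes the larger of the two child heights if the tree is not exactly ultrametric.

```python
from typing import List, Set, Tuple

Event = Tuple[float, Set[str], Set[str]]


def newick_to_events(newick: str) -> List[Event]:
    """Return (time, left leaves, right leaves) for each internal node, oldest last."""
    text = "".join(newick.split()).rstrip(";")
    events: List[Event] = []
    pos = 0

    def read_length() -> float:
        # Parse an optional ":<length>" suffix; a missing length counts as 0.
        nonlocal pos
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            return float(text[start:pos])
        return 0.0

    def read_child() -> Tuple[float, Set[str]]:
        # Height of the parent implied by this child: child height + branch length.
        height, leaves = read_node()
        return height + read_length(), leaves

    def read_node() -> Tuple[float, Set[str]]:
        # Returns (height of this node above the leaves, leaf names below it).
        nonlocal pos
        if text[pos] == "(":
            pos += 1
            left_height, left_leaves = read_child()
            if text[pos] != ",":
                raise ValueError("expected ',' between two children")
            pos += 1
            right_height, right_leaves = read_child()
            if text[pos] != ")":
                raise ValueError("only binary trees are supported in this sketch")
            pos += 1
            # Skip an optional internal node label.
            while pos < len(text) and text[pos] not in ":,()":
                pos += 1
            height = max(left_height, right_height)
            events.append((height, left_leaves, right_leaves))
            return height, left_leaves | right_leaves
        start = pos
        while pos < len(text) and text[pos] not in ":,()":
            pos += 1
        return 0.0, {text[start:pos]}

    read_node()
    events.sort(key=lambda event: event[0])
    return events


events = newick_to_events("((A:1,B:1):2,C:3);")
```

For the small tree above, the first event joins `A` and `B` at time 1.0 and the second joins `{A, B}` with `C` at time 3.0. A simulator could then compare the time of the first pending event against its current time, in the spirit of the `self.coalescent_events[0][0] < self.t` check.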
Example:
The command
py algorithms.py --recombination-rate 0 --gene-conversion-rate 0 1 --sequence-length 10000 10 out --discrete --ce-events "((((A:0.005, B:0.005):0.01, (C:0.005, D:0.005):0.01):0.1,((E:0.005, F:0.005):0.01, (G:0.005, H:0.005):0.01):0.01):0.01, (I:0.005, J:0.005):0.01)"
will produce the following tree:
Even though the first coalescence events (at nodes 10-14) should happen at the same time (0.005), the events are slightly postponed, as no two events can happen at the same time.
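How the postponement is implemented is not spelled out here. As a purely illustrative sketch (an assumption, not the PR's mechanism), one way to enforce strictly increasing event times is to nudge each tied time to the next representable float. The helper `separate_ties` is hypothetical, and `math.nextafter` requires Python 3.9 or later.

```python
import math


def separate_ties(times):
    """Return the times in ascending order, moving each tie to the next float up."""
    result = []
    for t in sorted(times):
        if result and t <= result[-1]:
            # Postpone by the smallest possible amount so the order stays strict.
            t = math.nextafter(result[-1], math.inf)
        result.append(t)
    return result


# Five events nominally at 0.005, followed by one at 0.015.
nominal = [0.005] * 5 + [0.015]
adjusted = separate_ties(nominal)
```

With this approach the five tied events remain within floating-point rounding of 0.005 but become strictly ordered, and the later event at 0.015 is left unchanged.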
Todo / Open Tasks