Understanding nodes with multiple parents #2942
Hello everyone, I wanted to ask a question to understand conceptually what it means when a node in the ARG has multiple parents. I understand that a node with two parents can represent a recombination event, but what does it mean for a node to have more than two parents? I have been using tsinfer+tsdate with a dataset of ~370 birds, and as we've been evaluating the tree sequences we've observed that many nodes (almost 1/3) have >2 parents. Is there a biological reason for this, or does it indicate an issue with the inference? Thank you!
@madeline-chase this is because an ARG encodes relationships between the haplotypes of individuals and the haplotypes of their ancestors. These ancestors can be immediate (as in a sample(=child)-parent relationship, where you would expect only 2 parental nodes in the case of a recombination), or they can be much, much older (as in a sample-grandgrand…parent relationship, where you would expect many parental nodes). The terminology is indeed confusing until you get used to it. Maybe this pre-print can help you understand these aspects of ARGs: https://www.biorxiv.org/content/10.1101/2023.11.03.565466v2
Just to follow up, @madeline-chase, as Gregor said, a "parent" in a tree sequence could represent an ancestor many generations ago. In addition, there could be several recombinant nodes stacked on top of each other, with no way of knowing in which order the recombinations happened. In this case we can represent the genealogy by the child node having 3 or even more parents, as in the lower picture below. Of course, in any one local tree, a node can have only one parent, though.
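
To make the distinction between "parents across the whole genome" and "parent in one local tree" concrete, here is a minimal sketch in plain Python. It assumes edges are given as records with `left`, `right`, `parent` and `child` fields (child inherits from parent over the interval `[left, right)`). The `Edge` namedtuple, the helper names `parents_per_node` and `parent_at`, and the toy node IDs are all made up for illustration and are not part of any library.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for one row of an edge table: `child` inherits from
# `parent` over the half-open genomic interval [left, right).
Edge = namedtuple("Edge", ["left", "right", "parent", "child"])


def parents_per_node(edges):
    """Map each child node ID to the set of distinct parents it has over all edges."""
    parents = defaultdict(set)
    for e in edges:
        parents[e.child].add(e.parent)
    return dict(parents)


def parent_at(edges, child, position):
    """List the parents of `child` in the local tree that covers `position`."""
    return [
        e.parent
        for e in edges
        if e.child == child and e.left <= position < e.right
    ]


# Toy data: node 0 inherits from three different ancestors along the genome,
# while node 1 has the same parent everywhere.
edges = [
    Edge(0, 10, 5, 0),
    Edge(10, 20, 6, 0),
    Edge(20, 30, 7, 0),
    Edge(0, 30, 5, 1),
]

parents = parents_per_node(edges)

# Nodes with more than two distinct parents over the whole genome.
multi = {child: sorted(p) for child, p in parents.items() if len(p) > 2}
print(multi)  # {0: [5, 6, 7]}

# At any single position, node 0 still has exactly one parent.
print(parent_at(edges, 0, 15))  # [6]
```

The helpers only read the four attributes above, so they should also accept the edge objects yielded by tskit's `ts.edges()`, which would let you count the nodes with >2 parents in an inferred tree sequence. I have only run this on the toy data here, not on a real tree sequence.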