count of ibd segments between all sampled pairs #1241
-
Hi all, Quick and simple question that me, a python layman, am struggling with: I'd like to count the number of genomic segments in which individual i and individual j in range(number of individuals) have the same MRCA. Goal here is to compare number of ibd segments between modern-modern individuals sampled across space and modern-ancient individuals across space and time. Thanks so much for your time and all the hard work you all have put into this package! Best, |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment
-
Hi Zach! 👋 This is actually not a simple question at all - computing IBD segments efficiently is not easy, and we don't have any public APIs for doing this at the moment. Here's a simple brute-force approach though, that you might be able to adapt for your purposes: import itertools
import dataclasses
import msprime
@dataclasses.dataclass
class IbdSegment:
pair: (int, int)
mrca: int
left: float = 0
right: float = -1
ts = msprime.simulate(5, recombination_rate=0.01, length=40, random_seed=42)
print(ts.draw_text())
pairs = list(itertools.combinations(ts.samples(), 2))
tree = ts.first()
mrca_map = {pair: IbdSegment(pair, tree.mrca(*pair)) for pair in pairs}
ibd_segs = []
while tree.next():
for pair in pairs:
ibd = mrca_map[pair]
mrca = tree.mrca(*pair)
if mrca != ibd.mrca:
ibd.right = tree.interval[0]
ibd_segs.append(ibd)
# Start a new record
mrca_map[pair] = IbdSegment(pair, mrca, tree.interval[0])
# Flush the remaining IBD segments
for ibd in mrca_map.values():
ibd.right = ts.sequence_length
ibd_segs.append(ibd)
for seg in ibd_segs:
print(seg) This gives:
Is this what you were looking for? Note that this is quite a strong definition of IBD, in that the MRCAs must be identical, and this may not be what you want. |
Beta Was this translation helpful? Give feedback.
Hi Zach! 👋
This is actually not a simple question at all - computing IBD segments efficiently is not easy, and we don't have any public APIs for doing this at the moment. Here's a simple brute-force approach though, that you might be able to adapt for your purposes: