ts.pair_coalescence_rates can give large or infinite rates #3048

Open
hyanwong opened this issue Nov 6, 2024 · 2 comments

hyanwong (Member) commented Nov 6, 2024

Here's an example giving a maximum value of infinity in the returned matrix:

import msprime
import numpy as np

L = 10e6  # 10 Mb
params = {"sequence_length": L, "population_size": 1e4, "recombination_rate": 1e-8, "random_seed": 6}
sweep_model = msprime.SweepGenicSelection(
    position=L / 2, start_frequency=0.0001, end_frequency=0.9999, s=0.25, dt=1e-6)
ts = msprime.sim_ancestry(20, model=[sweep_model, msprime.StandardCoalescent()], **params)
time_intervals = np.logspace(0, np.log10(ts.max_time), 20)
# time intervals need to go from 0 -> infinity
time_intervals = np.concatenate(([0], time_intervals, [np.inf]))
genome_intervals = np.linspace(0, L, 21)

rates = ts.pair_coalescence_rates(time_windows=time_intervals, windows=genome_intervals)
print(np.nanmax(rates))  # gives np.inf

If you change random_seed to 3, you get a maximum rate of 134217728.0, which also seems wrong. These large rates are all in the last time slice. Presumably we are hitting a division-by-zero or division-by-epsilon error.
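A quick way to confirm that the infinite values sit in the last time slice (this assumes the returned array has shape (num_genome_windows, num_time_windows)):

rows, cols = np.nonzero(np.isinf(rates))
print(set(cols))  # only the last time-window index appears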

hyanwong (Member, Author) commented Nov 6, 2024

As discussed previously, @nspope, I've assigned you to this; I hope that's OK.

nspope (Contributor) commented Nov 6, 2024

The issue is that the last time bin starts at the oldest node in the tree sequence. In this case the last bin contains a single node, and the corresponding rate estimator is effectively 1/(E[pair time | events > epoch start] - epoch start), which blows up because only floating-point error keeps the denominator away from 0. This should be easy enough to catch in practice.
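To make the blow-up concrete, here is a minimal sketch of the failure mode as described above (not tskit's actual implementation), reusing ts from the reproducer:

# When the last time bin starts at the oldest node, the conditional mean pair
# coalescence time within that bin equals the bin's start, so the denominator
# of the rate estimator is zero up to floating-point error.
epoch_start = ts.max_time     # last bin begins at the oldest node
mean_pair_time = ts.max_time  # all remaining pairs coalesce exactly there
denom = mean_pair_time - epoch_start  # exactly 0.0 here; tiny FP error in general
rate = np.divide(1.0, denom)          # -> inf (or a huge value when denom is epsilon)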

Changing time_intervals = np.logspace(0, np.log10(ts.max_time), 20) to time_intervals = np.logspace(0, np.log10(ts.max_time + 1), 20) gives something much more reasonable. Starting the last epoch at a fixed time just before the oldest node gives rate estimates that are bogus (i.e. entirely dependent on the choice of fixed timepoint), but at least numerically stable.
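For concreteness, here is that suggested tweak applied to the reproducer above (same variable names):

# Extend the top of the log-spaced grid just past the oldest node, so the
# last finite bin boundary exceeds ts.max_time.
time_intervals = np.logspace(0, np.log10(ts.max_time + 1), 20)
time_intervals = np.concatenate(([0], time_intervals, [np.inf]))
rates = ts.pair_coalescence_rates(time_windows=time_intervals, windows=genome_intervals)
print(np.nanmax(rates))  # now finite and of a plausible magnitude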

nspope added the bug label on Nov 6, 2024