As mentioned here by @petrelharp during the review of #2805, we'd like a better treatment of missing data. As implemented, we compute $w_{AB}$, $w_{Ab}$, and $w_{aB}$, but use the total number of samples in the tree sequence as $n$. If there's missing data, this $n$ will not be correct. We should instead compute $n$ as $n=w_{AB}+w_{Ab}+w_{aB}+w_{ab}$ so that missing data is properly accounted for. With this definition, $n$ counts only the samples in the sample set that have data at both the left locus and the right locus.
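To illustrate the proposed definition of $n$, here is a minimal sketch that works on plain genotype arrays rather than on tree sequences. It is not the tskit implementation: the function name `haplotype_weights`, the missing-data code `-1`, and the convention that allele `1` corresponds to `A`/`B` and allele `0` to `a`/`b` are all assumptions made for the example, and sites are assumed biallelic.

```python
import numpy as np

MISSING = -1  # assumed code for missing data in this sketch


def haplotype_weights(left, right):
    """Count two-locus haplotypes among samples with data at both sites.

    `left` and `right` are per-sample allele arrays (0, 1, or MISSING).
    Returns (w_AB, w_Ab, w_aB, w_ab, n), where n is the sum of the weights.
    """
    left = np.asarray(left)
    right = np.asarray(right)
    # Keep only samples that are non-missing at BOTH loci.
    valid = (left != MISSING) & (right != MISSING)
    a = left[valid] == 1   # carries allele "A" at the left site (assumed coding)
    b = right[valid] == 1  # carries allele "B" at the right site (assumed coding)
    w_AB = int(np.sum(a & b))
    w_Ab = int(np.sum(a & ~b))
    w_aB = int(np.sum(~a & b))
    w_ab = int(np.sum(~a & ~b))
    n = w_AB + w_Ab + w_aB + w_ab
    return w_AB, w_Ab, w_aB, w_ab, n


left = [1, 1, 1, 0, 0, MISSING, 1, 0]
right = [1, 1, 0, 1, 0, 1, MISSING, 0]
weights = haplotype_weights(left, right)
```

In this example there are 8 samples in total, but two of them are missing at one of the loci, so the sum of the weights is smaller than the total sample count.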
This will require a bit of restructuring: we will either need to intersect all samples with the samples of the current valid tree on the left and on the right, or we'll want to seed the algorithm that propagates sample bit arrays across alleles.
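The first option (intersecting with the samples present on each side) can be sketched with Python integers standing in for sample bit arrays. This is only an illustration of the idea under stated assumptions, not the tskit code: `mask_from_samples`, `restricted_counts`, and the argument names are hypothetical, and each mask has bit `i` set when sample `i` belongs to the set.

```python
def mask_from_samples(sample_ids):
    """Build an integer bit mask with bit i set for each sample id i."""
    mask = 0
    for s in sample_ids:
        mask |= 1 << s
    return mask


def popcount(mask):
    """Number of set bits in a non-negative integer mask."""
    return bin(mask).count("1")


def restricted_counts(allele_A, allele_B, present_left, present_right):
    """Haplotype counts restricted to samples present at both loci.

    allele_A / allele_B: masks of samples carrying A (left) and B (right).
    present_left / present_right: masks of samples with data at each locus.
    """
    valid = present_left & present_right  # samples usable at both loci
    A = allele_A & valid
    B = allele_B & valid
    w_AB = popcount(A & B)
    w_Ab = popcount(A & ~B)
    w_aB = popcount(B & ~A)
    w_ab = popcount(valid & ~A & ~B)
    n = popcount(valid)  # equals w_AB + w_Ab + w_aB + w_ab
    return w_AB, w_Ab, w_aB, w_ab, n


counts = restricted_counts(
    allele_A=mask_from_samples([0, 1, 4]),
    allele_B=mask_from_samples([0, 2, 5]),
    present_left=mask_from_samples([0, 1, 2, 3, 4]),
    present_right=mask_from_samples([0, 1, 2, 3, 5]),
)
```

Samples 4 and 5 each have data at only one locus, so they drop out of every count, and $n$ is the size of the intersection rather than the total number of samples.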