K optimization: r2 vs score

Hello,

I am running STITCH on a relatively small sample (40 individuals) with ~0.5X per-sample haplotagging (linked-read) sequencing. I recognize that this sample size is small for imputation, but I'm hoping STITCH will still be at least somewhat effective. I've run the program varying K and the number of generations as suggested, and am now evaluating the output. Both the mean score and the number of sites with scores > 0.4 increase as K increases from 2 to 35, where the values seem to plateau. However, r2 peaks around K=14 (r2=0.875) and drops off on either side (for example, r2=0.73 at K=35). The number of generations has minimal effect on r2, but runs with fewer generations (10-100) consistently yield more sites with high scores than runs with more generations (300-1000). Does this make sense, and would you recommend maximizing score values or r2 values when selecting K?
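To illustrate the comparison described above, here is a minimal Python sketch (not STITCH code) that tabulates the two criteria per K and shows how they can disagree. The function names `summarize_run` and `best_k` are hypothetical, the per-site scores are made-up toy values, and the r2 for K=2 is a placeholder; only the r2 values for K=14 and K=35 come from the runs above.

```python
from statistics import mean

def summarize_run(scores, threshold=0.4):
    """Return (mean score, number of sites with score > threshold) for one run."""
    return mean(scores), sum(s > threshold for s in scores)

def best_k(summary, metric):
    """Return the K whose summary has the largest value of `metric`."""
    return max(summary, key=lambda k: summary[k][metric])

# Toy inputs: per-site scores are invented; r2 for K=2 is a placeholder.
runs = {
    2:  {"scores": [0.2, 0.3, 0.5], "r2": 0.60},
    14: {"scores": [0.5, 0.6, 0.7], "r2": 0.875},
    35: {"scores": [0.6, 0.7, 0.8], "r2": 0.73},
}

summary = {}
for k, run in runs.items():
    mean_score, n_high = summarize_run(run["scores"])
    summary[k] = {"mean_score": mean_score, "n_high": n_high, "r2": run["r2"]}

print("K maximizing mean score:", best_k(summary, "mean_score"))
print("K maximizing r2:", best_k(summary, "r2"))
```

With these toy inputs the two criteria pick different values of K, which is the same pattern as in my runs.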

Thanks!
Nate

K optimization: r2 vs score #59
