Document how ploidy works #95

Closed
hyanwong opened this issue Sep 19, 2023 · 3 comments

Comments

@hyanwong
Member

hyanwong commented Sep 19, 2023

I assume that tstrait relies on (a) individuals being present in the tree sequence (in particular, each sample genome must belong to an individual) and (b) individuals being diploid?

It would be useful to clarify this somewhere, and state what happens e.g. if individuals are haploid, triploid, or whatever.

#27 is probably relevant here too.

@daikitag
Collaborator

daikitag commented Sep 19, 2023

Hi Yan, thank you very much for your feedback.
(a) I'm sorry, but I'm not entirely sure what you mean here. tstrait can simulate traits for individuals that contain internal nodes, and it identifies individuals through the nodes_individual property of the tree sequence (https://github.com/tskit-dev/tstrait/blob/main/tstrait/genetic_value.py#L142). Could you give me more detail on this point?
(b) tstrait does not assume that individuals are diploid; unit tests confirm that it works for haploid, triploid, and other ploidies.

@hyanwong
Member Author

hyanwong commented Sep 19, 2023

Thanks @daikitag - re (a), I mean that if there are no individuals in the tree sequence (e.g. if you try it on a tree sequence generated using msprime.simulate(100, length=10000, mutation_rate=1e-2)), it won't work. I assume you report on all the individuals in the tree sequence that have at least one sample node associated with them (what do you do with individuals which either don't have any nodes, or which have only non-sample nodes?)
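To make the three cases in that question concrete, here is a minimal sketch in plain Python. It is not tstrait code: `classify_individuals` and its arguments are hypothetical names, and the lists stand in for what a tree sequence would supply (an individual ID per node, with -1 meaning the node belongs to no individual, as in tskit's `nodes_individual`).

```python
def classify_individuals(num_individuals, nodes_individual, node_is_sample):
    """Group individual IDs by the kind of nodes attached to them.

    nodes_individual[j] is the individual ID of node j, or -1 if node j
    is not attached to any individual (an assumption mirroring tskit).
    node_is_sample[j] is True if node j is a sample node.
    """
    has_node = [False] * num_individuals
    has_sample = [False] * num_individuals
    for individual, is_sample in zip(nodes_individual, node_is_sample):
        if individual == -1:
            continue  # node without an individual
        has_node[individual] = True
        if is_sample:
            has_sample[individual] = True
    ids = range(num_individuals)
    return {
        "with_samples": [i for i in ids if has_sample[i]],
        "non_sample_only": [i for i in ids if has_node[i] and not has_sample[i]],
        "no_nodes": [i for i in ids if not has_node[i]],
    }
```

Which of these groups a trait simulation should report on is exactly the open question here; the sketch only separates them.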

re (b) - that's great, thanks. It's probably worth saying in the docs somewhere that the trait values are simply added up (are they?) over all the genomes in each individual, regardless of ploidy. This will presumably change if you implement dominance? If you do implement dominance, I assume you might have to check that all individuals are diploid (or will have to have a more sophisticated model of dominance than I'm used to).
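As an illustration of what "simply added up over all the genomes" would mean, here is a small sketch in plain Python. It assumes a purely additive model and is not taken from tstrait; `individual_genetic_values` and its arguments are hypothetical names, with -1 marking a node that belongs to no individual.

```python
def individual_genetic_values(num_individuals, nodes_individual, node_values):
    """Sum per-genome (per-node) genetic values within each individual.

    The same loop handles haploid, diploid, triploid, or mixed-ploidy
    data, because it never looks at how many nodes an individual has.
    """
    totals = [0.0] * num_individuals
    for individual, value in zip(nodes_individual, node_values):
        if individual != -1:  # skip nodes with no individual
            totals[individual] += value
    return totals


# Individual 0 is haploid, 1 is diploid, 2 is triploid.
nodes_individual = [0, 1, 1, 2, 2, 2]
node_values = [0.5, 1.0, 0.25, 0.5, 0.5, 0.5]
totals = individual_genetic_values(3, nodes_individual, node_values)
```

Under this assumption, ploidy only affects how many terms enter each sum, which is why a dominance model would need extra checks.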

@daikitag
Collaborator

Thank you for your feedback. For (a), I will add code to raise an informative error when the tree sequence contains no individuals, since the error raised at the moment is uninformative. tstrait does work on individuals that have non-sample nodes, and this is covered by unit tests.
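A check of this kind might look roughly like the sketch below. The function name `require_individuals` and the error message are hypothetical, not what tstrait ships; the only thing assumed about the input is that it has a `num_individuals` attribute, as a tskit tree sequence does. A `SimpleNamespace` stands in for the tree sequence so the example runs on its own.

```python
from types import SimpleNamespace


def require_individuals(ts):
    """Raise a clear error if the tree sequence has no individuals."""
    if ts.num_individuals == 0:
        raise ValueError(
            "No individuals in the provided tree sequence; each sample "
            "node must be associated with an individual."
        )


# Stand-in for a tree sequence with no individuals table entries.
empty = SimpleNamespace(num_individuals=0)
try:
    require_individuals(empty)
except ValueError as err:
    message = str(err)
```

Failing early with a message like this points the user at the input rather than at an internal indexing error.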

For (b), I will add that to the documentation. I haven't implemented dominance in tstrait yet, because it would require checking that all individuals are diploid; I plan to do that in future work.

Thank you for your suggestions.
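For reference, the diploid check mentioned above could be sketched as follows. This is an assumption about how such a check might be written, not existing tstrait code; `ploidy_per_individual` and `all_diploid` are hypothetical names, and -1 marks a node that belongs to no individual.

```python
from collections import Counter


def ploidy_per_individual(nodes_individual):
    """Count how many nodes (genomes) are attached to each individual."""
    return Counter(i for i in nodes_individual if i != -1)


def all_diploid(num_individuals, nodes_individual):
    """Return True only if every individual has exactly two nodes."""
    ploidy = ploidy_per_individual(nodes_individual)
    # Counter returns 0 for individuals that have no nodes at all.
    return all(ploidy[i] == 2 for i in range(num_individuals))
```

A dominance model could call something like `all_diploid` up front and refuse mixed-ploidy input, while the additive model would not need it.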
