Add cross-trait LDSC #352

Open
wants to merge 7 commits into
base: master
@jean997 jean997 commented Jul 30, 2022

Hi Florian,
Thanks for making this nice package. I added a function to compute genetic correlation using cross-trait LDSC. Here are the changes I made:

  • Added function snp_ldsc_rg for cross-trait LD-score regression
  • Modified snp_ldsc so that it can be called from snp_ldsc_rg. It should be backward compatible: it can still be called with the same syntax for the regular heritability calculation.
  • Added weight functions, one for cross-trait and one for single-trait regression: WEIGHTS_rg and WEIGHTS_h2. WEIGHTS_h2 is the same as the original WEIGHTS but includes a factor of 2 that is erroneously missing from the paper (the variance of $\chi^2_j$ should be $2(1 + N h^2_g l_j / M)^2$). This factor of 2 is in the original ldsc Python code, see line 532 here. I don't think it makes much difference.
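
To make the factor of 2 concrete, here is a minimal Python sketch of the two kinds of weights. It is not the code in this PR: the names `weights_h2` and `weights_rg`, their arguments, and the clamping of LD scores at 1 are assumptions for illustration. The cross-trait variance term follows my reading of the "Regression Weights" section of the paper's supplement.

```python
import math

def weights_h2(ld, w_ld, n, m, h2, intercept=1.0):
    """Single-trait weights: 1 / (Var[chi2_j] * overcounting LD score)."""
    out = []
    for l_j, wl_j in zip(ld, w_ld):
        l_j = max(l_j, 1.0)    # assumption: clamp LD scores at 1
        wl_j = max(wl_j, 1.0)
        # Variance of chi2_j under the model, including the factor of 2.
        var_chi2 = 2.0 * (intercept + n * h2 * l_j / m) ** 2
        out.append(1.0 / (var_chi2 * wl_j))
    return out

def weights_rg(ld, w_ld, n1, n2, m, h2_1, h2_2, rho_g,
               intercept_1=1.0, intercept_2=1.0, intercept_gencov=0.0):
    """Cross-trait weights: 1 / (Var[z1_j * z2_j] * overcounting LD score)."""
    out = []
    for l_j, wl_j in zip(ld, w_ld):
        l_j = max(l_j, 1.0)    # assumption: clamp LD scores at 1
        wl_j = max(wl_j, 1.0)
        a = intercept_1 + n1 * h2_1 * l_j / m
        b = intercept_2 + n2 * h2_2 * l_j / m
        c = intercept_gencov + math.sqrt(n1 * n2) * rho_g * l_j / m
        out.append(1.0 / ((a * b + c * c) * wl_j))
    return out
```

Because the factor of 2 rescales every single-trait weight by the same constant, it leaves a weighted least-squares point estimate unchanged, which is consistent with the remark above that it makes little difference. When the two traits are identical (same N, same h2, rho_g equal to h2, covariance intercept 1), the cross-trait weights reduce to the single-trait ones.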

Of course merge or not as you see fit.
Jean

@privefl
Owner

privefl commented Jul 30, 2022

Thanks for this!
I'll look at it next week.
Could you point me to some resources you used to base your implementation on?

@jean997
Author

jean997 commented Aug 1, 2022

Yes. I used the supplementary material of

https://pubmed.ncbi.nlm.nih.gov/26414676/
specifically the sections "Cross-trait LD score regression" and "Regression Weights"

I also referenced the ldsc Python code. The most relevant part is lines 538-711 of regressions.py.

I realized that I didn't add a check for negative heritability, which means that it will produce a messy "NaNs produced" warning in that case. I also didn't add options to specify the heritability intercepts. The procedure is:

  • Compute h2 for trait 1 (here is where you could use an additional intercept parameter if desired)
  • Compute h2 for trait 2
  • Compute genetic covariance using the same procedure as used for the h2 calculation, except that the input data is z1*z2 instead of z1^2 or z2^2 and the regression weights are different: they are a function of h2_1 and h2_2.
  • Convert genetic covariance to genetic correlation by dividing by sqrt(h2_1 * h2_2) (here is where it would be good to check for negative values).
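
The four steps above can be sketched in Python as follows. This is an illustration only, not the PR's implementation: `wls_fit` and `ldsc_rg_sketch` are hypothetical names, and the sketch uses uniform regression weights and free intercepts as a simplification, whereas the procedure described here uses dedicated weights for each regression. It includes the check for non-positive heritability mentioned in the last step.

```python
import math

def wls_fit(x, y, w):
    """Weighted least squares of y on x with an intercept: (slope, intercept)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def ldsc_rg_sketch(z1, z2, ld, n1, n2, m):
    """Genetic correlation from two z-score vectors and LD scores (simplified)."""
    w = [1.0] * len(ld)  # simplification: uniform weights, not LDSC weights

    # Steps 1 and 2: E[z^2] = intercept + N * h2 * l / M, so h2 = slope * M / N.
    slope_1, _ = wls_fit(ld, [z * z for z in z1], w)
    slope_2, _ = wls_fit(ld, [z * z for z in z2], w)
    h2_1 = slope_1 * m / n1
    h2_2 = slope_2 * m / n2

    # Step 3: same regression with z1 * z2 as the response;
    # the slope is sqrt(N1 * N2) * gencov / M.
    slope_12, _ = wls_fit(ld, [a * b for a, b in zip(z1, z2)], w)
    gencov = slope_12 * m / math.sqrt(n1 * n2)

    # Step 4: rg is undefined unless both heritability estimates are positive.
    if h2_1 > 0 and h2_2 > 0:
        rg = gencov / math.sqrt(h2_1 * h2_2)
    else:
        rg = float("nan")
    return {"h2_1": h2_1, "h2_2": h2_2, "gencov": gencov, "rg": rg}
```

Returning NaN explicitly in step 4 is one possible design; raising an error or returning a flagged result would work equally well and avoids the uninformative warning either way.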

I also didn't add the standard error of the genetic correlation, but the function does provide the SEs of the directly estimated parameters.

@privefl
Owner

privefl commented Aug 1, 2022

Thanks for the refs.
Yes, there is still quite some work to be done.

@alek0991

alek0991 commented Aug 23, 2022

Hi Florian,
Are you planning to merge this great feature proposed by Jean anytime soon? It will be extremely useful.

@privefl
Owner

privefl commented Aug 23, 2022

I agree it would be great to have this in the package.

However, there is still quite some work to be done to finish implementing this.
And I have absolutely no time to work on this for at least a month.

@jean997
Author

jean997 commented Aug 24, 2022

@privefl If you let me know what needs to be done I can see if we will have time to do it in the course of our work.

@privefl
Owner

privefl commented Aug 24, 2022

I had made a list, but I can't find it.

It would be great to have something similar to what is in {GenomicSEM}, i.e. allow any number of traits.
We need to have standard errors with the estimates.
And we absolutely need unit tests to make sure we're getting results right.

@jean997
Author

jean997 commented Aug 24, 2022

That all makes sense. It currently produces SEs for the intercept and the genetic covariance but not for the genetic correlation, which requires implementing the ratio jackknife. I'll add some tests when I get bits of time. I think providing functionality similar to GenomicSEM can be done pretty easily with a wrapper where you can say whether you want the environmental correlation matrix, the genetic correlation matrix, or both.
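
For reference, a ratio jackknife can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not code from this PR or from bigsnpr: `ratio_jackknife_se` is a hypothetical name, and the inputs are assumed to be the full-data ratio plus the leave-one-block-out ("delete") values of the numerator and denominator. For the genetic correlation, the numerator would be the genetic covariance and the denominator sqrt(h2_1 * h2_2), which is how I understand the ldsc Python code to do it.

```python
import math
import statistics

def ratio_jackknife_se(est, numer_delete, denom_delete):
    """Block-jackknife estimate and SE of a ratio.

    est          -- ratio computed on all blocks
    numer_delete -- numerator recomputed with each block left out
    denom_delete -- denominator recomputed with each block left out
    """
    n_blocks = len(numer_delete)
    if n_blocks != len(denom_delete) or n_blocks < 2:
        raise ValueError("need matching delete values for at least 2 blocks")

    # Pseudovalue for block i: n * est - (n - 1) * (ratio without block i).
    pseudo = [
        n_blocks * est - (n_blocks - 1) * num / den
        for num, den in zip(numer_delete, denom_delete)
    ]
    jk_est = statistics.fmean(pseudo)
    # Sample variance of the pseudovalues, divided by the number of blocks.
    jk_se = math.sqrt(statistics.variance(pseudo) / n_blocks)
    return jk_est, jk_se
```

The point of jackknifing the ratio directly, rather than combining the separate SEs of the covariance and the heritabilities, is that the numerator and denominator are estimated from the same SNPs and are therefore correlated.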

@privefl
Owner

privefl commented Aug 25, 2022

Also, it would be great if snp_ldsc_rg() were a completely separate function, so that I can be sure that no change has been made to snp_ldsc().

If you have any questions, feel free to ask me here.

Thanks for working on this.

3 participants