LDpred2: LD window, imputation,etc #130

Open
deepchocolate opened this issue Feb 10, 2023 · 2 comments
deepchocolate commented Feb 10, 2023

Is your feature request related to a problem? Please describe.
These are some modifications to the LDpred2 R scripts discussed in #117 and elsewhere.

Imputation: The current version of LDpred2.R imputes genotypes every time it is run. Because imputation takes a significant amount of time, it would be better to move this step to createBackingFile.R. Solved by usecases/LDpred2/imputeGenotypes.R.

SNP window for LD: calculateLD.R only accepts a window defined in centimorgans. As suggested by @ofrei, it would be good to let the user specify this window in centimorgans, base pairs, or SNP index.

Tutorial: Add more detail on arguments such as --col-snp-id, and improve the error output so that, where possible, it tells the user how to resolve the error.

Matching genotype and sumstats: snp_match can match on CHR, A0, A1, and either BP or RSID. ldpred2.R currently throws an error if any of these columns is unavailable, even though only one of BP or RSID is needed. Fixed.
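The column check described above could look roughly like the following. This is a minimal sketch, not the code in ldpred2.R: the helper name chooseJoinByPos is hypothetical, and the lower-case column names (chr, a0, a1, pos, rsid) are assumed to follow the naming that bigsnpr's snp_match expects.

```r
# Hypothetical helper: decide how sumstats and the genotype map can be joined.
# Returns TRUE to join on position, FALSE to join on rsid, and stops with an
# actionable message when neither is possible.
chooseJoinByPos <- function(sumstats, map) {
  shared <- intersect(names(sumstats), names(map))
  # chr, a0 and a1 are always needed for matching
  missingCols <- setdiff(c('chr', 'a0', 'a1'), shared)
  if (length(missingCols) > 0) {
    stop('Missing required column(s) in sumstats or genotype map: ',
         paste(missingCols, collapse = ', '))
  }
  # Only one of pos or rsid is required; prefer position when both exist
  if ('pos' %in% shared) return(TRUE)
  if ('rsid' %in% shared) return(FALSE)
  stop('Need either pos or rsid in both sumstats and genotype map')
}

# Intended use (assumes bigsnpr is installed and both data frames are loaded):
# joinByPos <- chooseJoinByPos(sumstats, map)
# matched <- bigsnpr::snp_match(sumstats, map, join_by_pos = joinByPos)
```

Preferring position over rsid when both are present is a design choice made for this sketch only; the script could equally expose it as a flag.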

Minor discussion point: createBackingFile.R could probably be renamed to something more informative. It creates two files, and it only works with bed-files as input. Maybe convertBedToBigSNPR.R or convertBedToRDS.R. The latter is more consistent with the arguments to calculateLD.R and ldpred2.R, since the input genotype file is passed with the flag --geno-file-rds. (The .bk file holds the actual data and the .rds file holds metadata, unless I missed something.)

Various
The condition below currently seems to serve no purpose. Since --chr2use has a default value, the filtering should probably always run.

if (TRUE) {
  cat('Filtering SNPs based on --chr2use\n')
  nSNPsBefore <- nrow(sumstats)
  sumstats <- sumstats[sumstats$chr %in% chr2use,]
  nSNPsAfter <- nrow(sumstats)
  cat('Retained', nSNPsAfter, 'out of', nSNPsBefore, '\n')
}

Fixed

Describe the solution you'd like

  • Move genotype imputation from LDpred2.R to createBackingFile.R
  • Add options for the type of window to use in calculateLD.R: --ld-window-snps (raw SNP index in the .bim file), --ld-window-kb (distance in kilobase pairs, BP column in the .bim file), and --ld-window-cm (the current behaviour of --window, GP column in the .bim file). The user should be able to specify only one of these options.
  • Potentially change the name of createBackingFile.R
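
The mutually exclusive window flags proposed above could be resolved along these lines. This is a sketch under assumptions, not the implementation in calculateLD.R: the helper resolveLdWindow is hypothetical, the map column names physical.pos and genetic.dist are assumed to be those of a bigSNP object, and the cM scaling (size = cM / 1000 when positions are given in cM) reflects my understanding of how bigsnpr's snp_cor interprets size together with infos.pos, which should be checked against its documentation.

```r
# Hypothetical helper: accept exactly one window definition and translate it
# into a window size plus the name of the map column to use as position.
resolveLdWindow <- function(snps = NULL, kb = NULL, cm = NULL) {
  given <- c(snps = !is.null(snps), kb = !is.null(kb), cm = !is.null(cm))
  if (sum(given) != 1) {
    stop('Specify exactly one of --ld-window-snps, --ld-window-kb, --ld-window-cm')
  }
  type <- names(given)[given]
  switch(type,
    # Window counted in SNPs: no position vector is needed
    snps = list(type = type, size = snps, position = NULL),
    # Window in kb: positions are base pairs
    kb = list(type = type, size = kb, position = 'physical.pos'),
    # Window in cM: positions are genetic distances, size is rescaled (assumption)
    cm = list(type = type, size = cm / 1000, position = 'genetic.dist'))
}

# Intended use (assumes a loaded bigSNP object `obj` and bigsnpr installed):
# w <- resolveLdWindow(cm = 3)
# pos <- if (is.null(w$position)) NULL else obj$map[[w$position]]
# ld <- bigsnpr::snp_cor(obj$genotypes, size = w$size, infos.pos = pos)
```

Failing early when zero or several flags are supplied keeps the error message close to the cause, in line with the request for more helpful error output.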

Describe alternatives you've considered
None

Additional context
None

deepchocolate self-assigned this Feb 10, 2023
deepchocolate changed the title from "LDpred2: LD window, imputation" to "LDpred2: LD window, imputation,etc" Feb 10, 2023
@deepchocolate

This comment was marked as resolved.

deepchocolate mentioned this issue Feb 16, 2023
@github-actions
This issue appears to be stale due to non-activity
