WGS data #104
Comments
Are there any updates on this issue? I believe I am having a similar problem in our implementation of the pipeline for GATK WGS.
Hi, see here: rgcgithub/regenie#114 (comment)
I got it to work by converting the VCF to BED using plink2. I also saw in several other issues, such as rgcgithub/regenie#209, that an Oxford .sample file may help with the missing-values error, which kept occurring for me, so I generated one as well, again using plink2. The only tricky part was keeping the IID and FID consistent with the internal workings of the pipeline, but now it seems to run fine.

EDIT: Here are the PLINK2 commands for reference.

VCF-to-BED:

    plink2 --vcf ${input_vcf_file} \
      --fam ${path}/samples-sex.nf_gwas.psam \
      --double-id \
      --split-par 'hg38' \
      --output-chr chrM \
      --set-all-var-ids @:#:ref\$r-alt\$a --new-id-max-allele-len 527 \
      --make-bed \
      --out ${output_path}

Making the Oxford .sample file:

    plink2 --vcf ${input_vcf_file} \
      --fam ${path}/samples-sex.nf_gwas.psam \
      --split-par 'hg38' \
      --output-chr chrM \
      --set-all-var-ids @:#:ref\$r-alt\$a --new-id-max-allele-len 527 \
      --recode oxford \
      --out ${output_path}

As mentioned, I added the Oxford .sample file because of several missing-values/invalid-sample-name errors, as described in the issue linked above.
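
The FID/IID consistency mentioned above can be checked before launching the pipeline. The following is a minimal Python sketch, not part of the pipeline: it assumes the standard layouts (a headerless `.fam` with FID and IID in the first two columns, and an Oxford `.sample` with two header lines followed by ID_1 and ID_2). The function names are hypothetical.

```python
def fam_ids(lines):
    """Return (FID, IID) pairs from PLINK .fam lines (no header line)."""
    return [tuple(line.split()[:2]) for line in lines if line.strip()]


def sample_ids(lines):
    """Return (ID_1, ID_2) pairs from Oxford .sample lines.

    The first two non-empty lines are the column-name and column-type
    headers, so they are skipped.
    """
    rows = [line.split() for line in lines if line.strip()]
    return [tuple(row[:2]) for row in rows[2:]]


def id_mismatches(fam_lines, sample_lines):
    """List (row_index, fam_pair, sample_pair) for rows whose IDs differ."""
    fam = fam_ids(fam_lines)
    sample = sample_ids(sample_lines)
    if len(fam) != len(sample):
        raise ValueError(
            f"sample count differs: {len(fam)} in .fam, {len(sample)} in .sample"
        )
    return [(i, f, s) for i, (f, s) in enumerate(zip(fam, sample)) if f != s]
```

To use it on real files, pass open file handles, for example `id_mismatches(open("out.fam"), open("out.sample"))`, where both file names are placeholders. An empty list means the two files agree row by row.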
Great to hear. Can you also share the commands, in case someone else is running into the same issue?
The pipeline looks like it is optimized for processing imputed VCF data from the UMICH or TOPMed imputation servers, which generate a DS field. Is it possible to run the pipeline on GATK WGS sequencing data without the DS field? Or does the dosage need to be calculated from the PL field and written out to PLINK format before running the pipeline?
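
For reference, this is roughly what deriving a dosage from PL would involve. It is a hedged Python sketch, not something the pipeline is known to do: it assumes a biallelic site with PL ordered as (hom-ref, het, hom-alt), and it assumes a flat genotype prior so that normalized likelihoods can stand in for genotype probabilities. The function name `pl_to_dosage` is hypothetical.

```python
def pl_to_dosage(pl):
    """Expected ALT allele count from a biallelic PL triple.

    pl: phred-scaled genotype likelihoods (hom-ref, het, hom-alt).
    Assumption: flat prior over the three genotypes.
    """
    if len(pl) != 3:
        raise ValueError("expected three PL values for a biallelic site")
    # Undo the phred scaling: likelihood_i is proportional to 10^(-PL_i / 10)
    likelihoods = [10 ** (-value / 10) for value in pl]
    total = sum(likelihoods)
    probs = [lk / total for lk in likelihoods]
    # Dosage = 0 * P(hom-ref) + 1 * P(het) + 2 * P(hom-alt)
    return probs[1] + 2 * probs[2]
```

A confident heterozygous call such as `pl_to_dosage([255, 0, 255])` gives a value very close to 1, while a completely uninformative `[0, 0, 0]` gives 1.0 only because the three genotypes are weighted equally, which is a reason to filter low-quality genotypes rather than trust such dosages.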