This reposirtory contains code and resources used for the quality control of genotyping data from the Norwegian Mother, Father and Child Cohort Study (MoBa).
The current release is available for alpha testing for expert users. We expect a new release to be publicly available for beta testing in the first semester of 2025. For more information on the release process, please contact the release coordinator Dana Kristjansson.
👉 Release notes and documentation: Documentation on the current and previous releases.
Samples in MoBa have been genoptyped in different batches on different genotyping platforms. Variants are not excluded based on batch effects, we recommend to account for batch effects in your own analyses. A table listing the batch for each sample is available with the data release. Information on batches is available at the mobagen repository.
We conducted only minimal filtering to allow you to tailor the release to your project: the release includes genotypes with imputation quality > 0.3 and all samples including duplicates and outliers in the PCA. Accompanying tables are available with variant and sample information to help you filter the data for your specific application. Genotypes are available for autosomal and sex chromosomes, please note that variant coverage varies between batches, especially for the sex chromosomes. The genotypes are provided as phased dosages in the plink2 format, a single file containing genotypes of parents and children. For details on the format please refer to the plink documentation.
👉 For more information on the data processing, please go to the QC pipeline documentation.
Currently genotypes are only available on build 37 imputed against the Haplotype Reference Consortium Release 1.1 (11,227 samples).
The current pipeline and its associated releases have been built on previous efforts:
- MoBa PsychGen: A quality-controlled set of genotypes created by the PsychGen Centre for Genetic Epidemiology and Mental Health.
- MoBaGenetics v.1.0: A quality-controlled set of genotypes obtained after a first round of genotyping created by the Genetics and Bioinformatics department of the Norwegian Institute of Public Health.
When using code or data produced using this pipeline, please refer to this repository and cite the reference work by the PsychGen team doi: 10.1101/2022.06.23.496289.
- GWAS: this folder contains code and resources used to run a test GWAS.
- Phenotypes: this folder contains code and resources for the handling of phenotypes used in the pipeline.
- QC Pipeline: this folder contains code, resources, and documentation for the QC pipeline
- Quarantine: this folder contains deprecated executables and resources.
If despite our efforts at enforcing good practices in our work, you encounter issues or have suggestions of improvement, please use our issue tracker. For technical questions, feedback, and assistance regarding the release please contact the Genetics and Bioinformatics department of the Norwegian Institute of Public Health. To reach out to the community working on genetics in MoBa, please join our Slack.
This repository is maintained by the Genetics and Bioinformatics department of the Norwegian Institute of Public Health, and notably Øystein Kapperud, Dana Kristjansson, Even Birkeland, and Marc Vaudel. The department is deeply grateful for the contribution of its former members Gutorm Thomas Høgåsen and Ronny Myhre.
The pipeline is built on work for the MoBaGenetics v.1.0 release, led by Øyvind Helgeland, Julius Juodakis, Jonas Bacelis, and Ronny Myhre.
It was extended thanks to help from the PsychGen team, especially Elizabeth Claire Corfield.
The pre-phasing and imputation modules have been built following the implementation of the NORMENT team and the PsychGen team, especially Oleksandr Frei and Tahir Tekin Filiz.
We are grateful for the contribution of the community who helped, shared code, and provided support and feedback. We are especially grateful to the PsychGen team, the NORMENT team, the Centre for Fertility and Health, the Department of Obstetrics and Gynecology at the University of Gothenburg, and the Mohn Center for Diabetes Precision Medicine.
The data are processed in the secure digital laboratories of the HUNT Cloud at the Norwegian University of Science and Technology. We are grateful for outstanding support from the HUNT Cloud community.
The code and resources generated for the QC pipeline is freely available under the permissive GNU General Public License v3.0. Other licenses might apply for external components of the pipeline. No copyright infringement intended.
As part of our efforts toward delivering open and inclusive science, we follow the Contributor Convenant Code of Conduct for Open Source Projects.