Genome Wide Discoveries in 13,000 Whole Genome Sequenced Rare Disease Cases and Controls

This abstract has open access
Abstract Summary

To study the genetic sequence variants underlying unresolved Mendelian disorders and to improve the interpretation of already identified high-penetrance variants, 13,000 individuals with a rare disease and their relatives have been whole-genome sequenced at an average coverage of 30x. Participants were recruited at 57 National Health Service (NHS) hospitals in the UK and 26 non-UK hospitals using approved eligibility criteria for 15 different rare disease domains.

We describe the population structure, including ethnicity and relatedness estimation; the high-level phenotypes collected using Human Phenotype Ontology (HPO) terms; and quality control and summary metrics for samples and variants. The resource contains over 165 million unique variants in the 10,258 genetically independent samples (of which 90%, 3% and 6% are SNVs, small insertions and deletions, respectively), with 47% of variants previously unobserved in other large-scale publicly available genome datasets (e.g. gnomAD, HGMD, UK10K).

We summarise the curation of gene lists and pertinent findings in 2,000 unique diagnostic-grade genes for the 15 domains. Over 1,200 reports assigning pathogenic or likely pathogenic causal variants have been issued following review by multidisciplinary teams. The diagnostic yield varied across the domains from 0.5 to 55%, while the proportion of novel (compared to HGMD) causal variants ranged from 25 to 73%. Causal variants reported in 10 genes involve cross-domain findings, in which the same gene is linked to different clinical phenotypes.

We show the power of a recently developed rapid Bayesian association test, BeviMed, to identify novel genes (n>30) and causal variants in the non-coding space of the genome and to provide independent validation of recent rare disease gene discoveries by others.

The rare disease pilot of the 100,000 Genomes Project has shown the feasibility of using whole genome sequencing across a national health system to deliver a molecular diagnosis for patients with inherited rare diseases. It has also shown how a national resource of genotypes accompanied by HPO-coded phenotypes provides a powerful platform for gene discovery, with 46 novel diagnostic-grade genes identified so far.
