Population frequency
Nuclear genome
A population frequency filter is applied to prevent common variants from being prioritised. For a variant to pass this filter, its allele frequency must not exceed the threshold in any population for the mode of inheritance being considered. Variants with no allele frequency recorded for a given dataset and population combination are treated as having a frequency of zero.
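The datasets used for allele frequency filtering are:
- gnomAD exomes v2.1.1
- gnomAD genomes v3.1.2
- Genomics England custom frequencies (GEL_aggCOVID_DRAGENv4.0-20230921)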
The current GRCh38 allele frequency thresholds for each population are as follows:
Dataset tag | Population | Dataset size (individuals) | Dominant inherited disease | Recessive inherited disease |
---|---|---|---|---|
GNOMAD_EXOMES | AMR | 17,296 | 0.001 | 0.01 |
GNOMAD_EXOMES | ASJ | 5,040 | 0.001 | 0.01 |
GNOMAD_EXOMES | EAS | 9,157 | 0.001 | 0.01 |
GNOMAD_EXOMES | FIN | 10,824 | 0.001 | 0.01 |
GNOMAD_EXOMES | NFE | 56,855 | 0.001 | 0.01 |
GNOMAD_EXOMES | SAS | 15,308 | 0.001 | 0.01 |
GNOMAD_GENOMES | AFR | 20,744 | 0.001 | 0.01 |
GNOMAD_GENOMES | AMI | 456 | 0.100 | 0.10 |
GNOMAD_GENOMES | AMR | 7,647 | 0.001 | 0.01 |
GNOMAD_GENOMES | ASJ | 1,736 | 0.003 | 0.01 |
GNOMAD_GENOMES | EAS | 2,604 | 0.002 | 0.01 |
GNOMAD_GENOMES | FIN | 5,316 | 0.001 | 0.01 |
GNOMAD_GENOMES | MID | 158 | 0.100 | 0.10 |
GNOMAD_GENOMES | NFE | 34,029 | 0.001 | 0.01 |
GNOMAD_GENOMES | SAS | 2,419 | 0.002 | 0.01 |
GEL_aggCOVID_DRAGENv4.0-20230921 (internal ref: 20230921-aggDRAGENv4.0_COVID_v1.1-AFgt0) | Genomics England custom frequencies | 5,415 | 0.001 | 0.01 |
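The filter logic described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the pipeline's implementation: the names `THRESHOLDS` and `passes_frequency_filter` are hypothetical, only three rows of the table are encoded, and "cannot exceed" is read as meaning that a frequency equal to the threshold still passes.

```python
# Hypothetical lookup: (dataset tag, population) -> max allele frequency per
# mode of inheritance. Values are copied from three rows of the table above.
THRESHOLDS = {
    ("GNOMAD_EXOMES", "NFE"): {"dominant": 0.001, "recessive": 0.01},
    ("GNOMAD_GENOMES", "AMI"): {"dominant": 0.1, "recessive": 0.1},
    ("GNOMAD_GENOMES", "ASJ"): {"dominant": 0.003, "recessive": 0.01},
}


def passes_frequency_filter(observed, mode):
    """Return True if no population frequency exceeds its threshold."""
    for key, limits in THRESHOLDS.items():
        # A missing dataset/population entry is treated as a frequency of zero.
        frequency = observed.get(key, 0.0)
        if frequency > limits[mode]:
            return False
    return True


# Example variant with frequencies in two of the three populations.
variant = {
    ("GNOMAD_EXOMES", "NFE"): 0.004,
    ("GNOMAD_GENOMES", "ASJ"): 0.002,
}
print(passes_frequency_filter(variant, "dominant"))   # False: NFE 0.004 > 0.001
print(passes_frequency_filter(variant, "recessive"))  # True: every value is within its limit
```

Because a single population over its threshold is enough to fail the variant, the same variant can be filtered out under a dominant mode of inheritance and retained under a recessive one.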
Note
gnomAD frequencies are extracted from gnomAD genomes v3.1.2 and gnomAD exomes v2.1.1. Variants present in gnomAD can carry flags that affect the variant's annotation and/or confidence; further information on possible flags is available through the gnomAD website. The Rare Disease tiering pipeline considers gnomAD variant frequencies regardless of any flags present.
Note
Several datasets used in earlier versions of the variant tiering pipeline are no longer considered, including UK10K, 1000 Genomes Phase 3, and DiscovEHR.
Mitochondrial genome
From the Orion NGIS release onwards (see release dates), mitochondrial allele frequencies in gnomAD are considered during variant tiering. This covers homoplasmic or near-homoplasmic variants (95-100% allele fraction) in gnomAD, and excludes some genomic sequencing datasets that were included in the nuclear genome datasets for the same release. More details of the cohort composition and allele frequency data generation are available through the gnomAD webpage.
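As a small illustration of the 95-100% allele fraction range mentioned above, the following Python sketch classifies a single allele fraction. The function name `is_near_homoplasmic` and the inclusive lower bound are assumptions made for illustration; how gnomAD or the tiering pipeline applies this range internally is not described here.

```python
def is_near_homoplasmic(allele_fraction, lower=0.95):
    """Return True for an allele fraction in the 95-100% range.

    `allele_fraction` is expressed as a proportion between 0 and 1.
    The inclusive lower bound is an assumption of this sketch.
    """
    if not 0.0 <= allele_fraction <= 1.0:
        raise ValueError("allele fraction must be between 0 and 1")
    return allele_fraction >= lower


print(is_near_homoplasmic(0.98))  # True
print(is_near_homoplasmic(0.40))  # False
```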
The datasets used for allele frequency filtering are:
- gnomAD genomes v3.1.2
The current GRCh38 allele frequency thresholds for each population are as follows:
Dataset tag | Population | Dataset size (individuals) | Mitochondrial genome inherited disease |
---|---|---|---|
GNOMAD_MT | AFR | 14,347 | 0.001 |
GNOMAD_MT | AMI | 392 | 0.1 |
GNOMAD_MT | AMR | 5,718 | 0.001 |
GNOMAD_MT | ASJ | 1,415 | 0.003 |
GNOMAD_MT | EAS | 1,482 | 0.002 |
GNOMAD_MT | FIN | 4,892 | 0.001 |
GNOMAD_MT | NFE | 25,849 | 0.001 |
GNOMAD_MT | SAS | 1,493 | 0.002 |
GEL_aggCOVID_DRAGENv4.0-20230921 (internal ref: 20230921-aggDRAGENv4.0_COVID_v1.1-AFgt0) | Genomics England custom frequencies | 5,415 | 0.001 |