Population frequency

Nuclear genome

A population frequency filter is applied to prevent common variants from being prioritised. To pass this filter, a variant's allele frequency must not exceed the threshold for the mode of inheritance being considered in any of the dataset and population combinations listed below. Where no allele frequency is provided for a variant in a particular dataset and population combination, the frequency is treated as zero.
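The filter logic described above can be sketched as follows. This is an illustrative sketch only, not the pipeline's implementation: the function name `passes_frequency_filter`, the `THRESHOLDS` mapping and its key layout are assumptions made for this example, and only three rows of the threshold table on this page are included to keep it short.

```python
from typing import Mapping, Tuple

# Hypothetical layout: (dataset tag, population) -> (dominant, recessive) threshold.
# The three entries are copied from the threshold table on this page; a real
# configuration would list every dataset and population combination.
THRESHOLDS: Mapping[Tuple[str, str], Tuple[float, float]] = {
    ("GNOMAD_EXOMES", "NFE"): (0.001, 0.01),
    ("GNOMAD_GENOMES", "ASJ"): (0.003, 0.01),
    ("GNOMAD_GENOMES", "AMI"): (0.100, 0.10),
}


def passes_frequency_filter(
    frequencies: Mapping[Tuple[str, str], float], mode: str
) -> bool:
    """Return True if no allele frequency exceeds its threshold for the given mode."""
    column = {"dominant": 0, "recessive": 1}[mode]
    for key, limits in THRESHOLDS.items():
        # A missing frequency is treated as zero, as described in the text.
        allele_frequency = frequencies.get(key, 0.0)
        # "Cannot exceed" is read here as: a frequency equal to the threshold passes.
        if allele_frequency > limits[column]:
            return False
    return True


# A variant seen at 0.5% in gnomAD exomes NFE and absent everywhere else.
variant = {("GNOMAD_EXOMES", "NFE"): 0.005}
print(passes_frequency_filter(variant, "recessive"))
print(passes_frequency_filter(variant, "dominant"))
```

In this sketch the example variant passes under the recessive threshold (0.005 does not exceed 0.01) but fails under the dominant threshold (0.005 exceeds 0.001). Treating a frequency equal to the threshold as passing is an interpretation of "cannot exceed" and has not been confirmed against the pipeline itself.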

The datasets used for allele frequency filtering are:

  • gnomAD genomes v3.1.2
  • gnomAD exomes v2.1.1
  • Genomics England allele frequencies

The current GRCh38 allele frequency thresholds for each population are as follows:

| Dataset tag | Population | Dataset size (individuals) | Dominant inherited disease | Recessive inherited disease |
|---|---|---|---|---|
| GNOMAD_EXOMES | AMR | 17,296 | 0.001 | 0.01 |
| GNOMAD_EXOMES | ASJ | 5,040 | 0.001 | 0.01 |
| GNOMAD_EXOMES | EAS | 9,157 | 0.001 | 0.01 |
| GNOMAD_EXOMES | FIN | 10,824 | 0.001 | 0.01 |
| GNOMAD_EXOMES | NFE | 56,855 | 0.001 | 0.01 |
| GNOMAD_EXOMES | SAS | 15,308 | 0.001 | 0.01 |
| GNOMAD_GENOMES | AFR | 20,744 | 0.001 | 0.01 |
| GNOMAD_GENOMES | AMI | 456 | 0.100 | 0.10 |
| GNOMAD_GENOMES | AMR | 7,647 | 0.001 | 0.01 |
| GNOMAD_GENOMES | ASJ | 1,736 | 0.003 | 0.01 |
| GNOMAD_GENOMES | EAS | 2,604 | 0.002 | 0.01 |
| GNOMAD_GENOMES | FIN | 5,316 | 0.001 | 0.01 |
| GNOMAD_GENOMES | MID | 158 | 0.100 | 0.10 |
| GNOMAD_GENOMES | NFE | 34,029 | 0.001 | 0.01 |
| GNOMAD_GENOMES | SAS | 2,419 | 0.002 | 0.01 |
| GEL_aggCOVID_DRAGENv4.0-20230921 (internal ref: 20230921-aggDRAGENv4.0_COVID_v1.1-AFgt0) | Genomics England custom frequencies | 5,415 | 0.001 | 0.01 |

Note

gnomAD frequencies are extracted from gnomAD genomes v3.1.2 and gnomAD exomes v2.1.1. Variants present in gnomAD can receive flags that impact the variant's annotation and/or confidence - further information on possible flags is available through the gnomAD website. Variant frequencies in gnomAD are considered in the Rare Disease tiering pipeline regardless of the flags present.

Note

Several datasets considered during variant tiering in earlier versions of the variant tiering pipeline are no longer considered, including UK10K, 1000 Genomes Phase 3 and DiscovEHR.

Mitochondrial genome

From the Orion NGIS release onwards (see release dates), mitochondrial allele frequencies in gnomAD are considered during variant tiering. This includes consideration of homoplasmic or near-homoplasmic variants (95-100% allele fraction) in gnomAD, and the exclusion of some genomic sequencing datasets that were included in the nuclear genome datasets for the same release. More details of the cohort composition and allele frequency data generation are available through the gnomAD webpage.

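The datasets used for allele frequency filtering are:

  • gnomAD mitochondrial genome allele frequencies (dataset tag GNOMAD_MT)
  • Genomics England allele frequencies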

The current GRCh38 allele frequency thresholds for each population are as follows:

| Dataset tag | Population | Dataset size (individuals) | Mitochondrial genome inherited disease |
|---|---|---|---|
| GNOMAD_MT | AFR | 14,347 | 0.001 |
| GNOMAD_MT | AMI | 392 | 0.1 |
| GNOMAD_MT | AMR | 5,718 | 0.001 |
| GNOMAD_MT | ASJ | 1,415 | 0.003 |
| GNOMAD_MT | EAS | 1,482 | 0.002 |
| GNOMAD_MT | FIN | 4,892 | 0.001 |
| GNOMAD_MT | NFE | 25,849 | 0.001 |
| GNOMAD_MT | SAS | 1,493 | 0.002 |
| GEL_aggCOVID_DRAGENv4.0-20230921 (internal ref: 20230921-aggDRAGENv4.0_COVID_v1.1-AFgt0) | Genomics England custom frequencies | 5,415 | 0.001 |