Whole genome sequencing
Whole genome sequencing (WGS) refers to DNA sequencing of the entire genome, including both coding and non-coding regions.
Clinical applications
WGS is a massively parallel sequencing technique in which an individual’s entire genome is sequenced, including both protein-coding and non-protein-coding (including regulatory) regions. It is the most comprehensive form of genomic testing currently in clinical use, and it allows a wide range of variant types in a large number of genes to be tested for simultaneously.
Be aware that virtual panels of genes may be used in clinical applications of WGS. This means that, even though the whole genome is sequenced, usually only those genes known to be associated with the patient’s features are analysed. A test described as a WGS panel test therefore does not usually mean that all of your patient’s genome has been checked, only those genes that are included on the panel.
WGS is most commonly associated with rare disease; however, it is also increasingly used in patients with cancer, where WGS of the tumour’s (somatic) genome can be undertaken. If any pathogenic variants are identified that might also be present in the patient’s constitutional (germline) DNA, this can also be investigated. Testing in cancer patients can identify:
- somatic driver mutations in the tumour genome, which may be clinically actionable and may affect eligibility for targeted treatment or clinical trials;
- constitutional (germline) mutations predisposing to cancer, with possible implications for management and surveillance of the patient and their families; and
- mutational signatures that may give information about mechanisms of disease or environmental mutagens.
WGS is also widely used in a research setting, for example to identify novel genetic causes of rare disease, or to characterise mutational signatures associated with different types of cancer.
WGS is not only used on patient samples: it is also carried out to detect and classify infectious organisms, including Mycobacterium tuberculosis (the cause of tuberculosis) and SARS-CoV-2.
The 100,000 Genomes Project
The first offer of WGS to patients in the NHS was through the 100,000 Genomes Project, launched in 2012. In this research study, 100,000 genomes from patients with cancer and rare disease were sequenced in order to make diagnoses, improve management, promote scientific discovery and drive the integration of genomics into healthcare.
How does it work?
Whole genome sequencing in the NHS is done using short-read next generation sequencing (NGS) technology. Briefly, patient DNA is fragmented and sequencing data are generated for the entire genome. Data analysis may then be restricted to a subset of genes relevant to the patient’s features using a virtual panel. All data are stored.
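The virtual panel step described above can be sketched in a few lines of Python. This is a minimal illustration, not how any NHS laboratory pipeline is implemented: the `Variant` class, the `apply_virtual_panel` function, the example panel content and all coordinates are hypothetical and chosen only to show that the full set of variants is kept while analysis is restricted to genes on the panel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A simplified variant record (hypothetical structure for illustration)."""
    gene: str   # gene symbol the variant has been annotated to
    chrom: str
    pos: int
    ref: str
    alt: str


def apply_virtual_panel(variants, panel_genes):
    """Return only the variants that fall in genes on the virtual panel."""
    panel = {g.upper() for g in panel_genes}
    return [v for v in variants if v.gene.upper() in panel]


# Illustrative data only: positions are placeholders, not real findings.
all_variants = [
    Variant("BRCA1", "17", 1001, "A", "G"),
    Variant("CFTR", "7", 2002, "C", "T"),
    Variant("TTN", "2", 3003, "G", "A"),
]
cardiomyopathy_panel = ["TTN", "MYH7"]  # hypothetical panel content

analysed = apply_virtual_panel(all_variants, cardiomyopathy_panel)
print([v.gene for v in analysed])   # genes that would be analysed
print(len(all_variants))            # every variant is still stored
```

In this sketch the unfiltered list is left untouched, which mirrors the point that all data are stored even when only a subset is analysed.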
Advantages and limitations of WGS
Advantages
The main advantage of WGS is that it is the most comprehensive genomic test available. It can be used to test a wide range of genes simultaneously, detecting a range of variant types.
- Single nucleotide variants and small insertions and deletions are detected with high accuracy.
- Uniform coverage of the whole genome allows better identification of copy number variants (CNVs) than in whole exome sequencing (WES).
- Some structural rearrangements (for example, balanced translocations) may be detected (such balanced rearrangements would be missed by arrays).
- In WGS, all of an individual’s genome is sequenced. This means that where an initial analysis does not yield a diagnosis, it may be possible to go back to the original data in the future to consider newly discovered causal genes. Development of policy on whether and when NHS labs will provide reanalysis of WGS and WES data is still in progress.
- WGS has the potential to detect variants in both protein-coding regions and non-coding, potentially regulatory regions. In practice, such regions may not always be analysed as the interpretation of non-coding variants can be difficult.
- In research applications, WGS can be used to identify novel causes of genetic disease.
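To illustrate why uniform coverage helps with CNV identification, the sketch below compares read depth in consecutive genomic bins against the median depth. It is a deliberately simplified assumption of how depth-based CNV calling works: the bin depths are made up, the thresholds of 0.75 and 1.25 are arbitrary illustrative values, and real CNV callers additionally correct for factors such as GC content and mappability.

```python
from statistics import median


def copy_ratios(bin_depths):
    """Normalise per-bin read depth by the median depth across all bins."""
    baseline = median(bin_depths)
    return [depth / baseline for depth in bin_depths]


def flag_bins(ratios, loss_below=0.75, gain_above=1.25):
    """Label each bin; the thresholds are illustrative assumptions."""
    labels = []
    for ratio in ratios:
        if ratio < loss_below:
            labels.append("loss")
        elif ratio > gain_above:
            labels.append("gain")
        else:
            labels.append("normal")
    return labels


# Made-up mean depths for eight consecutive bins with a baseline near 30x.
depths = [30, 31, 29, 15, 16, 30, 45, 30]
labels = flag_bins(copy_ratios(depths))
print(labels)
```

With evenly covered data, a heterozygous deletion shows up as bins at roughly half the baseline depth and a duplication as bins at roughly one and a half times the baseline. When coverage is uneven, as it can be after exome capture, the same ratios are harder to separate from noise.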
Limitations
- When WGS is used for the diagnosis of rare disease patients, virtual panels may be used. This means that, although sequencing data are generated for the whole genome, only genes known to be associated with the patient’s features are analysed.
- Clinical interpretation of the large number of identified variants is a significant challenge.
- Many more variants of uncertain significance are generated compared to more targeted testing.
- There is an increased risk of incidental findings compared to more targeted testing.
- There are regions of the genome that pose a technical challenge for short-read sequencing, the type of sequencing currently used for most diagnostic WGS. These regions, which can include those with pseudogenes or those that contain repetitive elements, may therefore not be analysed.
- CNV detection and structural variant detection may not be as accurate as when gold standard techniques are used (arrays and karyotyping, respectively).
- Not all conditions caused by variation in the length of short tandem repeats (STRs) are detected; polymerase chain reaction (PCR)-based STR detection methods are often used separately.
- Methylation status of DNA is not currently detected. Other methods are better suited to this, for example bisulphite sequencing.
- Mosaicism may not be detected, as the read depth used is often limited.
- Currently, WGS test results can take longer to come back than results of other genomic tests.
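The link between read depth and the detection of mosaicism, noted in the list above, can be made concrete with a simple binomial model. The sketch below is an assumption-laden illustration rather than a description of any diagnostic pipeline: it ignores sequencing error, the requirement of at least three supporting reads is an arbitrary threshold, and the depths of 30, 100 and 500 are chosen purely for comparison.

```python
from math import comb


def detection_probability(depth, allele_fraction, min_alt_reads=3):
    """Probability that at least min_alt_reads reads carry the variant.

    Assumes each read independently carries the variant with probability
    allele_fraction (a binomial model with no sequencing error).
    """
    p_too_few = sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1 - p_too_few


# Compare a low-level mosaic variant (5% of reads) at different depths.
for depth in (30, 100, 500):
    print(depth, round(detection_probability(depth, 0.05), 3))
```

Under these assumptions, a variant present in 5% of reads is likely to be missed at the lowest depth but is detected far more reliably as depth increases, whereas a heterozygous constitutional variant (around 50% of reads) is detected almost every time at all three depths.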
Resources
For clinicians
- NHS England Genomics Education Programme: Requesting whole genome sequencing: information for clinicians
References
- Taylor A, Alloub Z and Tayoun AA. ‘A simple practical guide to genomic diagnostics in a pediatric setting’. Genes (Basel) 2021: volume 12, issue 6, page 818. DOI: 10.3390/genes12060818
- Berner AM, Morrissey GJ and Murugaesu N. ‘Clinical analysis of whole genome sequencing in cancer patients’. Current Genetic Medicine Reports 2019: volume 7, pages 136–143. DOI: 10.1007/s40142-019-00169-4