A new computational method developed at the Broad Institute and MIT casts a wide net to help us identify any known virus strain in a sample.
When fighting viruses, it pays to do your DNA homework. However, these pathogenic bundles of genetic material are elusive, and fishing them out of a patient’s blood can become a Sisyphean task. During the Zika outbreaks of 2015-2016, for example, genetic sequencing efforts against the virus were hampered by a shortage of usable specimens, because the virus appears in very low concentrations in samples.
But there’s a CATCH
The new computational method (called “CATCH”, short for “Compact Aggregation of Targets for Comprehensive Hybridization”) can be used to design molecular “baits” for any known human-infecting virus and its strains. CATCH is especially exciting as the team showed it can capture even viruses that appear in low abundance in clinical samples, such as Zika. The method is meant to help small sequencing centers around the world keep an eye on viruses more efficiently and cost-effectively, nipping outbreaks in the bud.
“As genomic sequencing becomes a critical part of disease surveillance, tools like CATCH will help us and others detect outbreaks earlier and generate more data on pathogens that can be shared with the wider scientific and medical research communities,” said Christian Matranga, a co-senior author of the study.
Scientists typically look for viruses (even low-abundance ones) in a sample through “metagenomic” sequencing. This basically means they extract genetic material from the whole sample and then sift through it in search of a particular strand. However, viral material often gets lost among all the genes from other microbes or the patient’s own DNA, like a whisper at a rock concert.
One workaround for this issue is to ‘enrich’ samples for a particular virus using genetic baits. These are short strands of RNA or DNA that bind to a target’s genetic material, immobilizing it so the rest of the sample can be washed away. They are, however, limited in use. Such baits (also referred to as ‘probes’) are tailor-made for a particular virus or microbe; in other words, you have to know exactly what you’re looking for.
The team set out to devise a method that applies this bait mechanism to a wider range of targets in a sample, while also enriching for low-abundance viruses like Zika.
“We wanted to rethink how we were actually designing the probes to do capture,” said Hayden Metsky, lead author. “We realized that we could capture viruses, including their known diversity, with fewer probes than we’d used before.”
“To make this an effective tool for surveillance, we then decided to try targeting about 20 viruses at a time, and we eventually scaled up to the 356 viral species known to infect humans.”
CATCH allows users to design custom sets of probes and fish for any combination of microbial species, including viruses, and even all forms of all viruses known to infect humans. Users can draw on the genomes of any known human virus uploaded to the National Center for Biotechnology Information’s GenBank sequence database. CATCH then determines the best set of probes for what each user is looking for, and the probes are then synthesized by a third party.
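To give a feel for how a compact probe set can cover many related genomes, here is a minimal Python sketch that treats probe selection as a greedy set-cover problem. It is an illustration only, not the CATCH software or its algorithm: the function names, the toy sequences, the Hamming-distance model of binding, and the parameters `probe_len`, `stride`, and `max_mismatches` are all assumptions made for this example.

```python
def tile_probes(genome: str, probe_len: int, stride: int) -> list[str]:
    """Cut candidate probes from a genome at a fixed stride (hypothetical helper)."""
    last_start = max(len(genome) - probe_len, 0)
    starts = list(range(0, last_start + 1, stride))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the end of the genome is tiled
    return [genome[s:s + probe_len] for s in starts]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def covered_bases(probe: str, genomes: list[str], max_mismatches: int) -> set:
    """Bases (genome index, position) the probe is assumed to capture.

    Assumption: a probe 'binds' a window if it differs from it in at most
    max_mismatches positions. Real hybridization is more complicated.
    """
    hits = set()
    n = len(probe)
    for gi, genome in enumerate(genomes):
        for pos in range(len(genome) - n + 1):
            if hamming(probe, genome[pos:pos + n]) <= max_mismatches:
                hits.update((gi, pos + k) for k in range(n))
    return hits


def design_probes(genomes, probe_len=8, stride=4, max_mismatches=1):
    """Greedily pick probes until every base of every genome is covered.

    Assumes stride <= probe_len so the candidate tiles leave no gaps.
    """
    candidates = sorted({p for g in genomes for p in tile_probes(g, probe_len, stride)})
    coverage = {p: covered_bases(p, genomes, max_mismatches) for p in candidates}
    uncovered = {(gi, i) for gi, g in enumerate(genomes) for i in range(len(g))}
    chosen = []
    while uncovered:
        # Pick the candidate that captures the most still-uncovered bases.
        best = max(candidates, key=lambda p: len(coverage[p] & uncovered))
        chosen.append(best)
        uncovered -= coverage[best]
    return chosen


# Two toy "strains" that differ by a single base (made-up sequences).
strains = ["ACGTACGGTTCAGGCTAACG", "ACTTACGGTTCAGGCTAACG"]
probes = design_probes(strains)
naive = sum(len(tile_probes(g, 8, 4)) for g in strains)
print(f"{len(probes)} probes chosen vs {naive} from tiling each strain separately")
```

Because probes that tolerate a mismatch can capture both toy strains at once, the greedy pass ends up with fewer probes than tiling each strain on its own, which is the intuition behind Metsky's remark about capturing known diversity with fewer probes.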
Tests of CATCH-picked probes showed they can enrich viral content in a sample 18-fold (compared to pre-enrichment levels). The team worked with 30 samples containing 8 viruses in known quantities. They also looked at samples from the 2018 Lassa outbreak in Nigeria, which had been difficult or impossible to sequence without enrichment, and successfully sequenced them using CATCH-designed probes for all known human viruses. Finally, they were able to improve viral detection in samples with unknown content from patients and mosquitos.
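For readers wondering what an "18-fold" figure means in practice, the short Python sketch below shows one common way to express enrichment: the share of viral reads after capture divided by the share before. The read counts are invented for illustration, and the study's exact metric may differ from this simple ratio.

```python
def viral_fraction(viral_reads: int, total_reads: int) -> float:
    """Share of sequencing reads that come from the virus."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return viral_reads / total_reads


def fold_enrichment(pre_viral: int, pre_total: int,
                    post_viral: int, post_total: int) -> float:
    """Ratio of the viral read fraction after capture to the fraction before."""
    pre = viral_fraction(pre_viral, pre_total)
    post = viral_fraction(post_viral, post_total)
    if pre == 0:
        raise ValueError("no viral reads before enrichment; fold change is undefined")
    return post / pre


# Hypothetical numbers, not data from the study:
# 500 viral reads per million before capture, 9,000 per million after.
fold = fold_enrichment(500, 1_000_000, 9_000, 1_000_000)
print(f"Enrichment: about {fold:.0f}-fold")
```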
Metsky and his team also used CATCH to generate probes for Zika and Chikungunya, two mosquito-borne viruses that largely share the same geographic range. The genetic data they retrieved this way showed that the Zika virus had been introduced in several regions months before scientists were able to detect it, a finding that can inform efforts to control future outbreaks.
The team hopes their method will help laboratories in the field — especially those in West Africa — deal with viral outbreaks. It provides an especially useful tool against undiagnosed fevers with suspected viral causes.
“We’d like our partners in Nigeria to be able to efficiently perform metagenomic sequencing from diverse samples, and CATCH helps them boost the sensitivity for these pathogens,” said Katherine Siddle, co-first author of the study.
“We’re excited about the potential to use metagenomic sequencing to shed light on those cases and, in particular, the possibility of doing so locally in affected countries.”
CATCH is also very versatile. As new strains are identified and their sequences added to GenBank, users can quickly redesign a set of probes with up-to-date information. In addition, Metsky and Siddle have made all the probes they designed using the tool publicly available (probe designs are usually proprietary). Users have access to the actual probe sequences in CATCH, allowing researchers to explore and customize the probe designs before they are synthesized.
The CATCH software is publicly accessible on GitHub.
The paper “Capturing sequence diversity in metagenomes with comprehensive and scalable probe design” has been published online in the journal Nature Biotechnology.