For the first time, scientists have determined the complete sequence of a human chromosome, namely the X chromosome, from ‘telomere to telomere’. This is truly a complete sequencing of a human chromosome, with no gaps in the base pair read and at an unprecedented level of accuracy.
A step closer towards the complete blueprint of a human being
The Human Genome Project was a 13-year-long, publicly funded project initiated in 1990 with the objective of determining the DNA sequence of the entire human genome.
Although the project was met with initial skepticism by scientists and non-scientists alike, the overwhelming success of the Human Genome Project is readily apparent. Not only did it usher in a new era in medicine, but it also led to significant advances in DNA sequencing technology.
When the Human Genome Project was finished, its running costs tallied $2.7 billion of taxpayers’ money. Today, a human genome can be sequenced for less than $200 — that’s a 13.5-million-fold reduction in cost. And, it’s still going down.
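The fold reduction quoted above can be checked with a few lines of arithmetic. This is only a sketch using the two dollar figures given in the article; the variable names are illustrative.

```python
# Figures as quoted in the article (US dollars).
hgp_cost_usd = 2_700_000_000   # total cost of the Human Genome Project
current_cost_usd = 200         # upper bound quoted for sequencing one genome today

# Fold reduction = old cost divided by new cost.
fold_reduction = hgp_cost_usd / current_cost_usd
print(f"{fold_reduction:,.0f}-fold reduction")
```

Because today's cost is given as "less than $200", the real reduction is at least this large.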
However, despite its resounding success, the sequencing of the human genome remained incomplete: some regions of the genome could not be finished for technical reasons.
These gaps have been gradually filled as technology improved after the Human Genome Project officially ended in 2003.
But until last year, there were still 100 or so regions that remained unknown. Now, some of these regions have been brought to light, helping to complete the sequencing of the human X chromosome.
The X chromosome is one of two sex-determining chromosomes passed down from parent to child. A zygote that receives two X chromosomes – one from each parent – will grow into a female, while an X and a Y chromosome result in a male.
According to Karen Miga, a research scientist at the UC Santa Cruz Genomics Institute, this was made possible by new sequencing technologies that enable “ultra-long reads,” such as nanopore sequencing.
In the initial stages of the Human Genome Project, scientists could read 500 bases at a time, or 500 letters per sequence. In the mid-2000s, the amount of DNA that could be read at a time dropped to 100-200 bases, but the accuracy of the technology increased. Then, around 2010, new technology came on the market that could read 1,000-10,000 bases at a time, and more recently nanopore technology has pushed that to 100,000 bases or more.
Nanopore technology involves funneling single molecules of DNA through a tiny hole. Changes in the electrical current flowing through the hole as the molecule passes are used to determine the genetic sequence.
“These repeat-rich sequences were once deemed intractable, but now we’ve made leaps and bounds in sequencing technology,” Miga said. “With nanopore sequencing, we get ultra-long reads of hundreds of thousands of base pairs that can span an entire repeat region, so that bypasses some of the challenges.”
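To see why read length matters here, consider a deliberately simplified rule of thumb: a read can bridge a repeat only if it covers the whole repeat plus some unique sequence on each side to anchor it. The sketch below applies that rule to the read lengths mentioned in the article. The 50,000-base repeat and the 1,000-base anchor are hypothetical values chosen for illustration, and `can_span` is not part of any real sequencing tool.

```python
def can_span(read_length: int, repeat_length: int, anchor: int = 1_000) -> bool:
    """Return True if a read is long enough to cover the repeat plus
    `anchor` bases of unique sequence on both sides (a simplifying assumption)."""
    return read_length >= repeat_length + 2 * anchor

# Hypothetical repeat region, for illustration only.
repeat_length = 50_000

# Read lengths taken from the eras described in the article.
for read_length in (200, 500, 10_000, 100_000):
    verdict = "spans" if can_span(read_length, repeat_length) else "cannot span"
    print(f"{read_length:>7,}-base read {verdict} a {repeat_length:,}-base repeat")
```

Under these assumptions, only the 100,000-base read bridges the repeat. Shorter reads fall entirely inside it and look alike, which is why such regions were long considered intractable.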
The approach itself was simple in principle: collect as many of these long reads as possible from a single cell line of interest.
“We chose a unique cell line that has two copies of every chromosome, just like any normal cell, but each of those copies is identical to one another. Rather than having to resolve the genome of two genomes, we only had a single version to worry about. Then you can grow these cell lines clonally, so you don’t have variation in them, and then sequence them on these instruments,” Dr. Adam Phillippy of the National Human Genome Research Institute said in a statement.
Scientists collected data over the course of six months, and then used algorithms to stitch the puzzle pieces back together again.
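The idea of stitching reads back together can be illustrated with a toy example. The sketch below repeatedly merges the two fragments with the longest overlap between the end of one and the start of the other. It is a minimal greedy illustration on a made-up 18-letter sequence, not the algorithm the researchers used, and real assemblers must also cope with sequencing errors, repeats and vastly more data. The function names and the sequence are invented for this example.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`,
    or 0 if no overlap of at least `min_len` letters exists."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair of fragments until no overlaps remain."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Made-up sequence and three overlapping reads, given out of order.
genome = "ATGGCGTACCTTAGACTG"
reads = [genome[10:18], genome[0:8], genome[5:13]]
print(greedy_assemble(reads))
```

Here the three shuffled reads share three-letter overlaps, so the greedy merge recovers the original sequence as a single piece. In a repetitive region, many reads would overlap equally well with many others, which is where longer reads help.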
This is how they sequenced the centromere, a large repetitive stretch of sequence that, as its name might suggest, sits in the middle of the X chromosome, along with a number of other genome arrays on the X chromosome.
This work opens up a range of new possibilities in research, including the prospect of identifying new associations between genetic sequence variation and disease, as well as new clues into human biology and evolution.
“We’re starting to find that some of these regions where there were gaps in the reference sequence are actually among the richest for variation in human populations, so we’ve been missing a lot of information that could be important to understanding human biology and disease,” Miga said in a statement.
The complete sequencing of the X chromosome signifies yet another massive victory for science. However, there are still 23 other chromosomes to go — all of them might be completely mapped out by the end of this year, the researchers said.
The findings appeared in the journal Nature.