Scientists unveil the first human 'pangenome': a new frontier in genomics

The human "pangenome" is a full genetic sequence that incorporates genomes from not just one individual, but 47.

Tibi Puiu
May 12, 2023 @ 3:23 pm

Computer generated image of a human with genetics.
Credit: Pixabay.

The human genome was sequenced for the first time in 2003, tremendously influencing genetics and biological research ever since. But despite huge leaps in our understanding of the human blueprint, these early efforts were not complete. It wasn’t until 2022 that scientists filled the gaps and sequenced the complete human genome — and we’re still not done yet.

Now, a consortium of scientists has raised the bar once more by publishing the first human “pangenome”, which incorporates the genomes of 47 individuals from across the world.

By combining the genetic material of 47 individuals from diverse ancestral backgrounds, this revolutionary reference provides an unparalleled understanding of the intricacies of human genetic diversity.

Described in a series of papers published today in prestigious scientific journals, this achievement promises to reshape the landscape of genomics, delivering profound insights into our shared human heritage.

Unlocking the Secrets of Genetic Diversity: The Human Pangenome Revolution

Schematic of pangenome.
The new draft pangenome reference contains 47 genomes instead of just one, which makes it a much more powerful benchmark for genetic research. Credit: National Human Genome Research Institute.

Although 99.9% of the genome is the same from person to person, there is a lot of diversity found in that final 0.1%. Even with the complete human genome that scientists published last year, 70% of the sequence scientists use to benchmark genetic variation still comes from a single person.

This is a problem for genetic research. Relying on a single reference introduces a phenomenon known as reference bias: analyses tend to miss or misread the parts of a person's genome that differ from the reference, which hinders our ability to comprehensively analyze genomes.

Enter the pangenome. Unlike its predecessor, the pangenome embraces diversity and inclusivity, blending the genomes of 47 individuals from diverse ancestral backgrounds.

Visualize it as a tapestry of human variation, where common genetic sequences form a seamless path while diverging areas unveil the unique genetic footprints of different populations.

With the addition of 119 million DNA bases—those fundamental “letters” that form our genetic code—the pangenome transcends the limitations of a single reference genome.

In the process, the pangenome offers unparalleled accuracy, completeness, and, most importantly, an extraordinary ability to uncover genetic variants that have previously eluded our grasp.

A New Genomic Frontier Unveiled

Genomic variation comes in various forms—ranging from subtle differences in individual DNA bases to larger structural variants that span 50 base pairs or more. These structural variants can profoundly impact our health.

However, until now, our ability to identify them has been severely limited. A mere 30% of these structural variants have been detectable with existing technologies and the constraints of a single reference genome.

Of the 119 million new bases added to the pangenome, approximately 90 million (roughly three quarters) derive from structural variations. These include inversions, insertions, deletions, and tandem repeats (segments of genetic material repeated multiple times).

These additional bases open doors to uncharted territories of the genome, shedding light on regions that previously lacked reference, and potentially unraveling associations between structural variants and diseases such as autism, schizophrenia, immune disorders, and coronary heart disease.
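The size cutoff described above can be made concrete with a small sketch. The function below is purely illustrative and is not taken from the consortium's software: it labels a variant by comparing a reference allele with an alternate allele, using the 50-base-pair threshold quoted in the article. The name `classify_variant` and the label strings are assumptions made for this example, and inversions and tandem repeats are deliberately left out because they cannot be recognized from allele lengths alone.

```python
SV_MIN_BP = 50  # size threshold for structural variants quoted in the article


def classify_variant(ref: str, alt: str) -> str:
    """Label a variant by size and direction (simplified, hypothetical helper)."""
    # One base swapped for another: the smallest kind of variation.
    if len(ref) == 1 and len(alt) == 1:
        return "single-base change"

    # Use the longer allele as the span of the variant.
    size = max(len(ref), len(alt))
    scale = "structural variant" if size >= SV_MIN_BP else "small variant"

    if len(alt) > len(ref):
        kind = "insertion"
    elif len(alt) < len(ref):
        kind = "deletion"
    else:
        kind = "substitution"
    return f"{scale} ({kind})"


print(classify_variant("A", "G"))
print(classify_variant("ACG", "A"))
print(classify_variant("A", "A" + "T" * 60))
```

Real variant callers use far richer evidence than allele length, so this should be read only as a way to picture where the 50-base-pair line falls.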

The implications of this breakthrough extend far beyond structural variants. When it comes to identifying smaller genetic variations, such as single-base changes, the pangenome outshines its predecessor with a 34% increase in accuracy.

By harnessing the vast amount of data present in the pangenome, scientists can now uncover these minute variations with unparalleled precision.

Mapping the Road to Inheritance

Within each of us lies a paired set of chromosomes—one inherited from our mother, the other from our father. The pangenome is now helping scientists unravel this complex web of inheritance.

With the power of haplotype resolution, the pangenome confidently distinguishes between the two sets of parental chromosomes, something that used to be extremely challenging. This newfound knowledge empowers scientists to delve deeper into the mysteries of gene inheritance and the role it plays in various diseases.

Because each of the 47 individuals contributes two sets of chromosomes, the pangenome encompasses not 47 but 94 distinct genome sequences. And the journey doesn’t end there. By 2024, the researchers plan to expand this collection to include 700 reference genomes, encompassing an even broader spectrum of human genetic diversity.

A Tapestry Woven with Precision

Behind the scenes, a symphony of computational techniques and cutting-edge algorithms has brought the pangenome to life through the Human Pangenome Reference Consortium (HPRC). Dozens of researchers from many institutions across the US, UK, and Germany participated in this landmark achievement.

For instance, the UC Santa Cruz Computational Genomics lab, led by Benedict Paten, has spearheaded the development of advanced methods to align multiple genome sequences into a unified structure—an intricate pangenome graph. Within this graph, shared paths represent regions of similarity, while diverging paths highlight areas of genetic variation.

These paths are meticulously crafted, ensuring that each genome within the pangenome reference attains exceptional quality and accuracy.
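To picture what a pangenome graph looks like, here is a toy sketch. It is not the consortium's method or data format: the segment sequences, the haplotype names (`hapA`, `hapB`, `hapC`), and the helper functions are all invented for illustration. Each genome is stored as a path through numbered DNA segments; segments that every path visits are shared, and the rest mark places where genomes diverge.

```python
from collections import defaultdict

# Hypothetical DNA segments (graph nodes), keyed by an arbitrary id.
segments = {1: "ACGT", 2: "G", 3: "T", 4: "CCA", 5: "TTTT", 6: "GA"}

# Hypothetical haplotypes, each stored as a walk through segment ids.
paths = {
    "hapA": [1, 2, 4, 6],
    "hapB": [1, 3, 4, 6],      # differs from hapA by a single base (2 vs 3)
    "hapC": [1, 2, 4, 5, 6],   # carries an extra inserted segment (5)
}


def build_edges(paths):
    """Collect the links between consecutive segments across all paths."""
    edges = defaultdict(set)
    for walk in paths.values():
        for a, b in zip(walk, walk[1:]):
            edges[a].add(b)
    return edges


def shared_and_variable(paths):
    """Split segments into those on every path and those on only some."""
    all_nodes = set().union(*paths.values())
    shared = set.intersection(*(set(walk) for walk in paths.values()))
    return shared, all_nodes - shared


def spell(name):
    """Reconstruct one haplotype's sequence by following its path."""
    return "".join(segments[node] for node in paths[name])


edges = build_edges(paths)
shared, variable = shared_and_variable(paths)
print("shared segments:", sorted(shared))
print("variable segments:", sorted(variable))
print("hapC sequence:", spell("hapC"))
```

In this toy graph, segment 1 branches to either segment 2 or segment 3, which is the kind of diverging path the article describes. The real draft pangenome applies the same idea at a vastly larger scale.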

“The draft pangenome is an important proof of principle that we hope is going to influence a lot of people and get them thinking about the pangenome and how it might affect their work,” Paten said.

“Looking ahead, we see a lot of engagement with other groups—it takes a lot of different people to build something that is going to become a big community resource.”

All of the 47 diploid genomes were sourced from volunteers who participated in the 1000 Genomes Project (1000G) and agreed to share their anonymized genetic sequences in publicly available databases. These openly consented samples, sourced from diverse backgrounds, pave the way for unrestricted access to this invaluable resource without the privacy barriers that typically accompany genome research.

Surely, this is just the beginning.

Overall, by incorporating the genomes of dozens of individuals from around the world, the pangenome provides a far richer source of data than the original reference genome.

“We are introducing more diversity and equity into the reference by sampling diverse human beings and including them in this structure that everyone can use,” said Paten, who is the senior author on the main marker paper. “One genome isn’t enough to represent everybody—the pangenome will ultimately be something that is inclusive and representative.”

The new findings were described in four separate papers published today in the journals Genome Research, Nature, Nature Biotechnology, and Nature Methods.
