

More than 24,000 AI-readable coronavirus scientific articles go online

The sum of human knowledge on the new coronavirus is now online, in a format readable by artificial intelligence.

Tibi Puiu
March 19, 2020 @ 2:05 am



Scientists all over the world are working around the clock on candidate vaccines, antiviral treatments, and just about anything else they can throw at the novel coronavirus. To aid and accelerate their efforts, a database that pools more than 24,000 research papers related to SARS-CoV-2 (the scientific name for the virus that causes COVID-19) and other coronaviruses is now online in a single place.

The most comprehensive coronavirus scientific database

The COVID-19 Open Research Dataset (CORD-19) is the work of several philanthropic and research organizations, including the National Library of Medicine (NLM) at the National Institutes of Health, the Allen Institute for AI, Georgetown University, the Chan Zuckerberg Initiative, Kaggle, Microsoft, and the White House Office of Science and Technology Policy (OSTP).

Each organization contributed resources and know-how. For instance, the NLM provided access to scientific literature, while Microsoft used its engineering capabilities to index and map the thousands of articles scattered across the web. The Allen Institute for Artificial Intelligence (AI2), a non-profit, converted the articles into a common structured format that can be parsed by algorithms.

Additionally, the entire dataset is machine-readable, allowing artificial intelligence (AI) systems to access and interpret the huge body of knowledge. This way, scientists might find existing safe drugs and therapies designed to treat other conditions that could prove useful in the current war on the coronavirus. Or perhaps they might find a chink in the coronavirus’ armor that has so far escaped scientists.
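To give a rough sense of what "machine-readable" means in practice, here is a minimal Python sketch. The record, its field names (`paper_id`, `title`, `body_text`, `section`, `text`), and the `find_mentions` helper are all hypothetical and made up for illustration; they are not taken from the CORD-19 release itself.

```python
import json

# Hypothetical structured record. The field names are assumptions chosen
# for illustration, not the dataset's documented schema.
record_json = """
{
  "paper_id": "example-0001",
  "title": "A hypothetical coronavirus study",
  "body_text": [
    {"section": "Introduction",
     "text": "Coronaviruses are enveloped RNA viruses."},
    {"section": "Results",
     "text": "The candidate compound reduced viral replication in cell culture."}
  ]
}
"""


def find_mentions(record, keyword):
    """Return (section, text) pairs whose text contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [
        (block["section"], block["text"])
        for block in record["body_text"]
        if keyword in block["text"].lower()
    ]


record = json.loads(record_json)
hits = find_mentions(record, "replication")
for section, text in hits:
    print(f"{record['paper_id']} [{section}]: {text}")
```

Because every paper shares the same structure, the same few lines could in principle be run over thousands of records, which is much harder to do with PDFs scattered across publisher websites.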

Previously, Microsoft researchers had employed machine learning and natural language analysis to interpret the content of thousands of biomedical papers. This initiative led to a representation of cellular regulatory networks that was exploited to make recommendations for cancer therapies.

According to MIT Technology Review, the dataset is part of AI2’s Semantic Scholar service, which employs natural language models like ELMo and BERT to plot relationships between papers.

For a long time, there has been a fierce debate among scholars regarding access to scientific papers, many of which are behind paywalls controlled by a handful of publishers.

Proponents of open access (free, unrestricted access to scientific papers) will at least be happy to learn that, in this case, great efforts have been made to give the global research community unhindered access to coronavirus-related papers.

“It’s my hope that the machine-readable content will stimulate advances in computing methods that can help investigators to develop deeper understandings and approaches to addressing the COVID-19 pandemic. Developing tools to help scientists to do research and synthesize new understandings has been a long-term aspiration in AI. Work has been underway over years on methods that can answer questions, analyze and summarize the content of numerous scientific papers, assess the credibility of clinical trials, generate and test hypotheses, and guide experimentation,” Eric Horvitz, Technical Fellow and Chief Scientific Officer at Microsoft, wrote in a recent blog post.

The dataset also includes pre-publication research posted on servers like medRxiv and bioRxiv, which are open access archives for pre-print health sciences and biology research.

“Sharing vital information across scientific and medical communities is key to accelerating our ability to respond to the coronavirus pandemic,” Chan Zuckerberg Initiative Head of Science Cori Bargmann said, referring to the CORD-19 project.
