I remember years ago, when I got my first computer – it had a storage capacity of 40 MB. A few years after that, I got a 1 GB hard drive, and nowadays 1 TB is quite the standard – that’s growth by a factor of about 25,000. The tumultuous development of data storage capacity has slowed down in the last couple of years, but researchers are still trying to find the next big thing; as a matter of fact, the next big thing could actually be biological (DNA, to be more precise). Researchers have shown that a single cup of DNA can store 100 million hours of HD video – and these are just the first results.
Biological systems have been using DNA as an information storage molecule for billions of years – after all, it holds the information that makes you human, as opposed to, say, a badger. Vast amounts of data can be stored even in microscopic spaces, so it’s only natural to start looking here. So could this actually be the ultimate solution?
However, it’s very hard to “make” DNA carry the information you want, as researchers from the EMBL-European Bioinformatics Institute (EMBL-EBI) found out. In this week’s edition of Nature, they describe a new technique that stores, reads and writes data using DNA. The research was led by Nick Goldman and Ewan Birney.
“We already know that DNA is a robust way to store information because we can extract it from wooly mammoth bones, which date back tens of thousands of years, and make sense of it. It’s also incredibly small, dense and does not need any power for storage, so shipping and keeping it is easy,” Goldman said in a statement.
The method is complex, and to accomplish their goals, they enlisted bio-analytics instrument maker Agilent Technologies, a Hewlett-Packard spin-off, to help synthesize DNA from encoded digital information, in this case an MP3 of Martin Luther King’s “I Have a Dream” speech – quite a suitable choice.
“We knew we needed to make a code using only short strings of DNA, and to do it in such a way that creating a run of the same letter would be impossible,” Goldman explained. “So we figured, let’s break up the code into lots of overlapping fragments going in both directions, with indexing information showing where each fragment belongs in the overall code, and make a coding scheme that doesn’t allow repeats. That way, you would have to have the same error on four different fragments for it to fail—and that would be very rare.”
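To make Goldman’s description a bit more concrete, here is a minimal sketch of the general idea in Python. It is not the team’s actual code: the helper names, the fixed six base-3 digits per byte, the starting base “A” and the fragment length and step are all assumptions chosen for illustration. Each base-3 digit picks one of the three letters that differ from the previous letter, so a run of the same letter cannot occur, and the resulting strand is cut into overlapping, indexed fragments with every other fragment stored as its reverse complement.

```python
BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def bytes_to_trits(data):
    """Write each byte as 6 base-3 digits (3**6 = 729 >= 256), most significant first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(6):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))
    return trits


def trits_to_dna(trits, prev="A"):
    """Each digit selects one of the three bases that differ from the previous base."""
    out = []
    for t in trits:
        choices = [b for b in BASES if b != prev]
        prev = choices[t]
        out.append(prev)
    return "".join(out)


def dna_to_trits(dna, prev="A"):
    """Invert trits_to_dna by replaying the same choice table."""
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits


def trits_to_bytes(trits):
    """Invert bytes_to_trits: every group of 6 digits is one byte."""
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for t in trits[i:i + 6]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)


def reverse_complement(fragment):
    """Read the opposite strand: complement every base and reverse the order."""
    return fragment.translate(COMPLEMENT)[::-1]


def fragments(dna, length=100, step=25):
    """Cut the strand into overlapping (index, start, fragment) pieces.

    Odd-numbered fragments are reverse-complemented, so fragments alternate
    direction. The length and step values are illustrative assumptions.
    """
    last_start = max(len(dna) - length, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the tail of the strand is covered
    pieces = []
    for index, start in enumerate(starts):
        piece = dna[start:start + length]
        if index % 2 == 1:
            piece = reverse_complement(piece)
        pieces.append((index, start, piece))
    return pieces


if __name__ == "__main__":
    strand = trits_to_dna(bytes_to_trits(b"I have a dream"))
    print(strand)
    for index, start, piece in fragments(strand, length=20, step=5):
        print(index, start, piece)
```

With the step set to a quarter of the fragment length, every base away from the ends of the strand lands in four different fragments, which is where the redundancy in Goldman’s “same error on four different fragments” comes from.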
Another good sign was the sturdiness of the DNA storage system. According to Agilent’s Emily Leproust, who helped synthesize the data into DNA, the finished sample looked “like a tiny piece of dust” and can last for at least 10,000 years.
“We’ve created a code that’s error tolerant using a molecular form we know will last in the right conditions for 10,000 years, or possibly longer. As long as someone knows what the code is, you will be able to read it back if you have a machine that can read DNA,” Goldman said.
Though the study involved less than a megabyte of data in total, this is already a scalable result, a few orders of magnitude better than previous studies. The advantages of DNA over both printed text and traditional hard drives are numerous: it is stable for very long periods of time, it requires no power, which makes it easy to transport and maintain, and most of all, it can carry larger amounts of data than the alternatives.
Via The Conversation