Humanity has been grappling with big data for years, and there is little doubt that we will soon run out of space to store it. Data has never been more important than in today's digitally dependent age, which is why the data capacity gap is a concern that must be addressed.

DNA is considered an ideal storage medium because it is ultra-compact and can last hundreds of thousands of years if kept under the right conditions. But how can we unlock its potential?

Researchers Yaniv Erlich and Dina Zielinski of Columbia University and the New York Genome Center say they have successfully stored, retrieved and replicated digital data on DNA using an algorithm called "DNA fountain."

According to Science Times, the researchers compressed six files into a single master file and split the data into strings of ones and zeroes. They then translated the ones and zeroes into A, G, C and T, the four nucleotide bases in DNA. The resulting DNA sequences were sent to San Francisco-based Twist Bioscience, which synthesized them into actual DNA. Weeks later, the researchers received a vial of DNA molecules that contained the files they had stored, including a movie, an Amazon gift card and a computer virus, among others.

The researchers were also able to read the files by translating the DNA back into binary code, and they replicated the files without errors.
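To make the translation between binary and DNA concrete, here is a minimal Python sketch of the round trip. It assumes a simple fixed mapping of two bits per base (00 to A, 01 to C, 10 to G, 11 to T); the article does not spell out the mapping, and the function names `bytes_to_dna` and `dna_to_bytes` are made up for illustration. It leaves out the compression, the fountain coding and the error handling that the researchers' actual method relies on.

```python
# Assumed mapping: two bits per nucleotide (illustrative, not taken from the article).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Turn raw bytes into a string of A, C, G and T."""
    # Write every byte as eight binary digits, then read them two at a time.
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(sequence: str) -> bytes:
    """Reverse the mapping: bases back to bits, bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"Hi"
strand = bytes_to_dna(message)
print(strand)  # CAGACGGC
print(dna_to_bytes(strand) == message)  # True
```

In this toy version each base carries exactly two bits, so a two-byte message becomes eight bases, and decoding simply runs the same table in reverse.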

The Christian Science Monitor notes that the key to making the coding efficient lies in what the researchers call the "DNA fountain," which allowed storage, replication and retrieval without losing key pieces of the code. Their paper, published in the journal Science, reports that they were able to store 215 petabytes per gram of DNA.

Science Mag said that storing data on DNA was pioneered in 2012 by geneticist George Church. He and his colleagues were able to encode a 52,000-word book in thousands of snippets of DNA.

For now, DNA storage is not available on a large scale because synthesizing the DNA is quite expensive.
"Currently it's like $7,000 for two megabytes of data, but here's the thing to keep in mind: the $7,000 is for DNA molecules of very good quality, because the supply chain is geared toward synthetic biology applications," Erlich told The Scientist in an interview.
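For a rough sense of scale, the figures in Erlich's quote can be turned into a per-megabyte price. The short Python sketch below does only that arithmetic. The one-gigabyte extrapolation is an assumption that the price scales linearly and that a gigabyte is 1,000 megabytes, neither of which comes from the interview.

```python
# Figures quoted by Erlich: about $7,000 for two megabytes of data.
quoted_cost_usd = 7_000
quoted_size_mb = 2

cost_per_mb = quoted_cost_usd / quoted_size_mb

# Assumption: linear scaling and 1 GB = 1,000 MB (illustrative only).
cost_per_gb = cost_per_mb * 1_000

print(f"${cost_per_mb:,.0f} per megabyte")  # $3,500 per megabyte
print(f"${cost_per_gb:,.0f} per gigabyte")  # $3,500,000 per gigabyte
```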