From Genes to Gigabytes: Data Storage in DNA

March 2025

By Elizabeth S

Edited by Manal Aditi


Imagine all the information held in the Library of Congress. How much space would be needed to store all that data? Now, imagine trying to stuff all of that into an object the size of a sugar cube. That’s the scale, if not greater, at which nature has been encoding information since the beginning of life. Forget clunky hard drives and massive data centers; scientists are turning to DNA, the very building block of life, as a storage medium that fits large-scale data into a space smaller than a grain of sand, lasts for millennia, and doesn’t crash when you spill coffee on it. Welcome to the world of DNA data storage, where the science of life meets big data.

Why use a biological molecule to store our most valuable data? For one, DNA allows for a very high coding density: one gram of DNA could potentially store 215 petabytes (Imburgia & Nivala, 2024). To put that into context, a 1-terabyte computer hard drive holds 0.001 petabytes, so switching to DNA as a storage medium could increase data capacity by several orders of magnitude, revolutionizing storage technology. Moreover, DNA can last for thousands of years, and given its fundamental role in life, it is likely to remain relevant and in use for the foreseeable future. This makes DNA an ideal candidate for long-term archival storage, as long as it is properly preserved.
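
To make the comparison concrete, here is a minimal arithmetic sketch in Python. It uses only the two figures quoted above; both are approximate, and the function name is purely illustrative.

```python
# Figures quoted in the text; both are approximate.
DNA_PB_PER_GRAM = 215   # potential capacity of one gram of DNA, in petabytes
DRIVE_PB = 0.001        # hard-drive capacity quoted above, in petabytes


def drives_per_gram(dna_pb=DNA_PB_PER_GRAM, drive_pb=DRIVE_PB):
    """Return how many such drives it takes to match one gram of DNA."""
    return round(dna_pb / drive_pb)


print(drives_per_gram())  # 215000
```

On these numbers, a single gram of DNA would hold as much as roughly 215,000 such drives.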

In the realm of scientific discovery, DNA-based data storage is still a relatively new concept. The idea was proposed in the 1950s by physicist Richard Feynman as a route to high storage density and long-term stability, and scientists have been exploring its practical applications for several decades. Early attempts at DNA data storage were painstakingly slow and complex. The process involved encoding information by stringing nucleotide building blocks, adenine (A), thymine (T), cytosine (C), and guanine (G), together like beads on a thread, creating a pattern similar to the zeroes and ones used in binary code (Notman, 2024). These DNA sequences were then replicated multiple times and decoded using sequencing technology to read the stored information.
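
The link between binary code and the four bases can be illustrated with a toy example in Python. The mapping below (two bits per base, with 00 as A, 01 as C, 10 as G, and 11 as T) is an assumption chosen for simplicity, not the scheme used in any of the studies cited here, and practical systems typically add error correction on top of the raw encoding.

```python
# Hypothetical mapping chosen for illustration: each base carries two bits.
BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BITS_FOR_BASE[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Here the two-letter message "Hi" becomes an eight-base strand, and reading the bases back in order recovers the original text.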

Fortunately, as the science advanced over the decades, researchers developed various techniques to move DNA data storage closer to commercialization. One notable breakthrough came in 2020, when a Harvard-based team led by George Church successfully encoded music from a Super Mario Brothers game into DNA using a light-controlled process, then retrieved and re-stored it (Nadis, 2025). This same group also pioneered a method for synthesizing multiple data-coding DNA strands simultaneously, significantly increasing efficiency. More recently, a research group from Peking University in Beijing developed an innovative approach that encodes DNA data using a coding machine equipped with pre-designed DNA templates and movable DNA ‘building blocks’ (Zhang, 2024). With this system, even individuals with no background in biology can program DNA with data of their choice! While challenges remain, these advancements signal a promising future for DNA data storage.

All in all, as our digital universe expands and traditional data storage methods approach their economic and environmental limits, DNA presents an exciting alternative. As nature’s efficient, compact storage system, its potential applications are staggering. Yet challenges remain, from high costs to encoding errors and limited decoding speeds, meaning further refinement is necessary before widespread adoption. Nevertheless, breakthroughs in synthetic biology and sequencing are bringing us closer to a reality in which data is written into the very fabric of life itself.

References

  1. Imburgia, C., & Nivala, J. (2024). ‘Do-it-yourself’ data storage on DNA. Nature, 634.

  2. Nadis, S. (2025, January 25). Using DNA for data storage. Harvard Magazine. https://www.harvardmagazine.com/science-technology/harvard-using-dna-to-store-data-technology

  3. Notman, N. (2024, September 20). Is DNA the future of digital data storage? Chemistry World. https://www.chemistryworld.com/features/is-dna-the-future-of-digital-data-storage/4019749.article

  4. Zhang, C. (2024). Parallel molecular data storage by printing epigenetic bits on DNA. Nature, 634. https://doi.org/10.1038/s41586-024-08040-5
