
Beyond the Da Vinci Code: Storing Digital Data in DNA
And why it may be the answer to the looming digital crisis
Published 5 March 2025
Prepare to have your mind blown.
Internet clickbait aside, is storing digital data inside DNA seriously a realistic possibility?
Go and take a quick look at yourself in a mirror or the camera on your phone or computer. Take a good, honest look at what you see because regardless of your opinion of yourself or that new haircut, you’re about to be deeply impressed and astonished … by you. Here’s why.
The Rapidly Encroaching Problem with Digital Storage That Will Affect You (Unless You’re Over the Age of 75)
I have to start with an obvious fact: Our computers, smart devices, TVs, thermostats, home security systems, cars, robots, and wearables such as watches, hearing aids and various medical monitors are all constantly generating and using data in amounts that increase exponentially, year after year.
The next fact is less known by the average human but is obvious to you (those in the data technology or IT profession): Currently, we gather, process and store data using microchips. These chips are primarily made of silicon, a substance derived from sand.

With the availability of computer-grade silicon under threat, what’s the alternative?
Now, here comes the part that even most of us in the IT and computing profession don’t yet know (or at least don’t know the full extent of): Despite being the second most abundant element in the Earth’s crust, pure silicon — the form necessary for computer chip manufacturing — is scarce, constituting less than 10% of the total global silicon supply.
This limited resource is being rapidly consumed. Studies suggest that the current data explosion could exhaust the world’s computer-grade silicon reserves by 2040. That gives us just 15 years, from the date of this article, to arrive at an alternative solution — or else.
To mitigate this potential crisis, researchers are exploring two primary approaches.
The first involves refining silicon extraction and processing techniques. One obvious problem with this solution is that it still involves relying on a limited Earth resource. I’m just being factual here. Earth’s resources are finite. So, we cannot continue to exponentially increase the depletion of certain of these resources for much longer before they start to actually drop to near zero levels. Pure silicon is one of these.
The second focuses on identifying and developing alternative materials for data processing and storage. And that’s where you come in.
The Stuff of Which You’re Made
Let’s review some things about DNA. Deoxyribonucleic acid is a molecule that carries all the genetic information for the development and functioning of an organism. That includes everything from you and all the wildlife on Earth down to a single-celled amoeba. DNA literally contains the blueprint for life.
Everything from your height and eye color down to the shape of your baby toe is determined by your DNA. Present in almost every cell of your body and shaped like a twisted ladder, your DNA is like an instruction manual for biological life.

Your body contains roughly 37 trillion cells, and your DNA, almost unbelievably, is coiled up in six-foot lengths inside almost every single one of those cells. If all the DNA coils in your body were stretched out end to end, they would total around 42 billion miles (“billion” with a “b” — roughly the distance of 84,000 round trips to the moon).
The human genome in one cell of one human body (yours, for example) is the complete set of genetic instructions it would take to build an exact replica of you, and it comes to roughly 750 megabytes of data. Multiply that by your 37 trillion cells and the result is that you are carrying almost 28 quadrillion megabytes of data around with you, effortlessly. That is the same as 28 million petabytes, or 28 zettabytes.
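If you want to check that arithmetic yourself, here is a minimal Python sketch. It assumes decimal units (1 megabyte is 1 million bytes), and the constant names are my own, made up for illustration.

```python
# Decimal (SI) unit sizes in bytes. This is an assumption; binary units would
# give slightly different numbers.
MEGABYTE = 10**6
PETABYTE = 10**15
ZETTABYTE = 10**21

GENOME_BYTES = 750 * MEGABYTE   # the article's figure for one genome
CELLS_IN_BODY = 37 * 10**12     # the article's figure: 37 trillion cells

# One genome per cell, multiplied across the whole body.
total_bytes = GENOME_BYTES * CELLS_IN_BODY

print(f"{total_bytes / MEGABYTE:.3e} megabytes")
print(f"{total_bytes / PETABYTE:,.0f} petabytes")
print(f"{total_bytes / ZETTABYTE:.2f} zettabytes")
```

The exact product is a little under 28 in each of those units, which is why the figures above say "almost."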
What Does It Mean?
If you’re having difficulty visualizing what that means, you’re not alone. Let’s add some context that may help.
If you add up the entire written works of mankind since the beginning of recorded history — in all languages — it amounts to roughly 50 petabytes.
Google®, for example, handles around 20 petabytes of data every day. Effectively, the 28 million petabytes you carry around in your cells is enough to keep Google busy for almost 4,000 years.
Every single day, we Earthlings generate and store data on our computers, security cameras and in databases hosting every website and email account. Data is generated and stored by mega-million-dollar corporations, by every government on Earth, every military on Earth, every educational institution and research facility, every social media platform, broadcaster and news source, every financial institution and stock market, every mom-and-pop shop, every streaming service, space exploration project and hyperscale data center.
That’s a ton of data, and, with computer-grade silicon supplies under threat, we need an ingenious solution for storing it all in the future.
You Magnificent Beast
There are currently approximately 64 zettabytes of data in the entire world. In your body alone, you’re carrying the ability to store roughly 44 percent of that data. In other words, when you and a friend meet for lunch, the amount of DNA data storage capacity sitting at that table is almost sufficient to house all the data currently stored in every electronic storage device in the world, combined.
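Here is a quick sketch of that lunch-table math in Python. It uses the rounded figures from this article (28 zettabytes per person, 64 zettabytes worldwide), and the variable names are hypothetical.

```python
BODY_ZB = 28    # one person's DNA capacity in zettabytes (rounded figure from above)
WORLD_ZB = 64   # estimated data currently stored worldwide, in zettabytes

share_one_person = BODY_ZB / WORLD_ZB        # you, alone
share_two_people = 2 * BODY_ZB / WORLD_ZB    # you and a friend at lunch

print(f"One person:  {share_one_person:.2%} of the world's data")
print(f"Two people:  {share_two_people:.2%} of the world's data")
```

Two people fall a little short of the full 64 zettabytes, so a third lunch guest would comfortably close the gap.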
Can you think of anything else on Earth with that kind of data capacity? Neither can I.
The problem is that each year, the amount of data generated increases, with the total for 2025 predicted to reach around 181 zettabytes. With our collective data production increasing at an exponential rate and silicon supplies diminishing, how do we store it all? What happens if we are not ready when this crisis starts to consume us?
Human DNA: The G.O.A.T.
While we’re responsible for producing this vast ocean of data, could we also be the solution to storing it? The answer, quite simply, is yes.
Human DNA is literally the greatest storage device known to man, which is why researchers are synthesizing it for digital data storage. That’s why an understanding of how it works is so important. There are issues and hurdles still to overcome before it can be used in an effective, efficient manner, but it may be the data storage system of the future. It may be the best solution to the fast-approaching chip crisis.
How does it work? How do we go from genetic data in our cells to digital data that can be stored and rapidly retrieved?
Not only does that classic double helix, or spiral ladder, contain every tiny detail that makes each one of us physically unique, but it’s precisely this structure itself — its very architecture — that inspired the development of new DNA storage devices.
The result is both delightfully simple and unimaginably complex: We’ve copied nature and produced synthetic DNA, which allows us to store mountains of data in a tiny blob of manmade material.
Let’s talk about the spiraling ladder. The vertical supports of the ladder could be considered opposites of one another. Think of them as male and female or positive and negative. Each vertical support of the ladder has a series of half-rungs attached to it — almost like a comb — and those half-rungs connect to the half-rungs of the opposite support to create the rungs of the DNA ladder.
In our bodies, the half-rungs consist of four building blocks: A (adenine), T (thymine), C (cytosine) and G (guanine). Because there are four possibilities, each half-rung can stand for two binary digits. In binary code, which is what we use to store digital data, those two digits are 00, 01, 10 or 11.
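To make that concrete, here is a minimal Python sketch of turning bytes into letters. The particular mapping (A=00, C=01, G=10, T=11) is an assumption I picked for illustration, and real encoding schemes differ, often adding rules to avoid long runs of the same letter.

```python
# Hypothetical two-bits-per-base mapping, chosen for illustration only.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}

def encode(data: bytes) -> str:
    """Turn raw bytes into a string of A/C/G/T, two bits per letter."""
    bits = "".join(f"{byte:08b}" for byte in data)   # e.g. b"H" -> "01001000"
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

print(encode(b"Hi"))   # prints CAGACGGC
```

Each byte becomes exactly four letters, so the two-byte message "Hi" needs eight building blocks.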
So, how much data can you squeeze into a blob of synthetic DNA?

All the movies ever made could be stored in a quantity of synthetic DNA smaller than a sugar cube.
The entire English-language version of Wikipedia, for example, has already been stored in a tiny, amber-colored blob in a test tube. It’s real. Google it.
Researchers at Columbia University and the New York Genome Center have demonstrated that a single gram of DNA can store approximately 215 petabytes of data. I know, crazy, right?
A container of DNA, roughly the size of two passenger vans, could hold all the data ever created in the world. Let that sink in for a moment. All the data ever created in the world. That means all the data on every private server on Earth, all the data in the approximately 11,000 data centers around the planet, all the data on humanity’s phones, computers and other devices, the totality of mankind’s historical data, all government and corporate data, every book, movie, song and symphony ever made, and everything on the internet.
Can you imagine all that data — compactly stored in DNA — neatly occupying just two parking spaces?
Shelf-Life is Everything
What about lifespan, you ask? The main problem with traditional storage options such as hard drives, magnetic tapes, CDs and DVDs is that they physically have relatively short lifespans.
Hard drives can fail in a matter of years, and magnetic tape — mostly used for backups and archives — has a best-case lifespan of only about 30 years but often needs to be replaced after just ten. If that sounds like it’s long enough, bear in mind that we’ve witnessed the rise and fall of the CD within a period of a mere 40 years. Most new computers don’t even come with CD drives anymore.
But why do we need such a long-term storage solution? Well, in the last few years, mankind has generated more data than in our entire previous history. And the rate of data generation is only increasing.
With current storage options, as fast as we add new data, we must also constantly make new backups of the old data or risk losing our entire record, since traditional data storage options degrade over time. As a storage medium, DNA has the potential to simply outlast any other option on Earth.
What’s impressive is the fact that our DNA can outlive us by thousands of years, provided it’s stored correctly. We discover all-but-forgotten DNA everywhere, whether in an ancient Egyptian burial chamber, the teeth of a Siberian mammoth that is over a million years old, or the remains of a woman who lived and died 45,000 years ago. That’s a pretty good shelf life.
Can you think of a modern storage device that offers that kind of longevity?
Reading Data from DNA
Okay, so we can cram massive amounts of data into synthetic DNA. But how do we access it?
A DNA sequencer can read this data just as easily as it can read the DNA from human remains found at a crime scene. It simply deconstructs the DNA and reads the order of the building blocks on the rungs, and software then translates that sequence back into the binary data we stored.
Admittedly, this is still a slow and expensive process (which is one of the current major hurdles to practical use), but that’ll change sooner or later. Doesn’t technology pretty much always accelerate exponentially? Digression: I remember when I got my first electronic calculator. While I could hold it using one hand, it didn’t actually fit in a single hand; it cost over $100 (and this was in the 1970s) and only did the four basic arithmetic operations. Just a few years later, I bought a Hewlett-Packard® scientific calculator that could do hundreds of operations, was half the size and less than half the cost.
Back to DNA storage, what about errors or data loss?
As IT professionals, we sometimes experience errors in data storage and retrieval, and DNA data storage currently has similar issues. These errors either arise in the DNA itself or occur during sequencing. Fixes for them already exist, and as the technology improves, the fixes will no doubt improve too.
One current method for overcoming DNA data loss relies on fountain codes, which add redundant information to the data that’s being stored. Here’s an over-simplified example: You want to store three numbers (4, 9 and 17), and you want to be able to recover all three of them, even if one of them gets corrupted and can’t be read. If you also store the sum of all three, which is 30, then if the 17 gets corrupted, you can calculate what it was by subtracting the 4 and the 9 from the 30.
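Here is that over-simplified example as a Python sketch. To be clear, this is the toy sum trick from the paragraph above, not a real fountain code. Actual fountain codes (Luby transform codes, for example) combine randomly chosen chunks of data, typically with XOR, rather than adding numbers. The function name is hypothetical.

```python
def recover_missing(values, checksum):
    """Fill in the single lost value (marked None) using the stored sum."""
    lost = [i for i, v in enumerate(values) if v is None]
    if len(lost) != 1:
        raise ValueError("this toy scheme can repair exactly one lost value")
    repaired = list(values)
    # The lost value is whatever is left after subtracting the survivors.
    repaired[lost[0]] = checksum - sum(v for v in values if v is not None)
    return repaired

stored = [4, 9, 17]
checksum = sum(stored)                        # 30, stored alongside the data

print(recover_missing([4, 9, None], checksum))   # prints [4, 9, 17]
```

The catch is visible in the error check: one extra number can rescue only one lost value, which is why real schemes store many redundant pieces.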
The other fix is naturally built into the DNA itself and is quite remarkable. Remember the rungs we talked about? The A, T, C and G? Well, in biological DNA, the A always pairs with the T and the C always pairs with the G. Whatever appears on one side of the rung is always the opposite on the other.
So, if the binary code carried on one side is zero-zero, the other side will appear as one-one. If one side is zero-one, the other side will be one-zero. This means that if, for whatever reason, the positive and negative sides are ever torn apart, you can completely reconstruct the data of the missing side by simply flipping every digit on the side that you do have. Mother Nature is pure genius.
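A minimal Python sketch of that reconstruction follows. It assumes a hypothetical mapping of A=00, C=01, G=10 and T=11, picked so that each letter’s partner carries the opposite digits, and it pairs the two sides position by position, ignoring the fact that real DNA strands run in opposite directions.

```python
# Pairing rules from biology: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical bit mapping, chosen so that paired bases carry inverted bits.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}

def complement_strand(strand: str) -> str:
    """Rebuild the missing side of the ladder from the surviving side."""
    return "".join(COMPLEMENT[base] for base in strand)

def to_bits(strand: str) -> str:
    """Translate a strand of letters into its binary digits."""
    return "".join(BASE_TO_BITS[base] for base in strand)

surviving = "CAGACGGC"
rebuilt = complement_strand(surviving)

print(surviving, to_bits(surviving))
print(rebuilt, to_bits(rebuilt))
```

Run it and the second line of digits is the first line with every 0 and 1 swapped. Taking the complement of the rebuilt side gives back the original.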
Cutting It Close?
Synthesizing DNA and storing data in it currently costs around $3,500 per megabyte. Here’s a simplified overview of the process:
- The data that is intended for storage in the DNA is run through an encoder that converts binary code into genetic code. (This sentence, alone, is jaw-dropping. Just amazing.)
- Next, the synthetic DNA is manufactured with the genetic code written into it. Currently, this is a slow and expensive process and accounts for the bulk of the cost.
- To validate that the data has been successfully embedded in the synthetic DNA, the DNA is fed into a sequencing machine, which reads the order — or sequence — of the A, T, C and G.
- To access the stored data, specialized software is used to convert the A, T, C and G back to binary code, and the data can then be retrieved in digital form.
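The four steps above can be sketched in Python as a round trip. This is only an illustration under loud assumptions: the two-bits-per-letter mapping is hypothetical, and the synthesis and sequencing steps are stand-ins that simply pass the string along, since the real work happens in a lab.

```python
# Hypothetical mapping for illustration; real encoders are more elaborate.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Step 1: convert binary code into genetic code."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(strand: str) -> bytes:
    """Step 4: convert the A, T, C and G back into binary code."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

message = b"DNA"
strand = encode(message)      # step 1: the encoder
synthesized = strand          # step 2: stand-in for manufacturing the DNA
sequenced = synthesized       # step 3: stand-in for the sequencing machine
recovered = decode(sequenced) # step 4: back to digital form

print(strand, recovered)
```

In this sketch the read-back always matches, because nothing can go wrong between steps 2 and 3. In a real lab, that is exactly where errors creep in.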
Researchers estimate that the cost could eventually drop to around $1 per terabyte.
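To get a feel for how far apart those two price tags are, here is a rough Python sketch. It assumes decimal units (1 terabyte is 1 million megabytes), and the variable names are my own.

```python
import math

COST_PER_MB_TODAY = 3_500      # dollars per megabyte, from the figure above
TARGET_COST_PER_TB = 1         # dollars per terabyte, the researchers' estimate
MB_PER_TB = 10**6              # decimal units (an assumption)

cost_per_tb_today = COST_PER_MB_TODAY * MB_PER_TB
reduction_factor = cost_per_tb_today / TARGET_COST_PER_TB

# How many times would the price have to be cut in half to get there?
halvings_needed = math.log2(reduction_factor)

print(f"Today: ${cost_per_tb_today:,} per terabyte")
print(f"Required price drop: {reduction_factor:,.0f}x")
print(f"Halvings needed: {halvings_needed:.1f}")
```

That works out to a price that has to fall by a factor in the billions, or a bit more than 31 halvings in a row. How quickly those halvings arrive is anyone’s guess.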
Scour the net, and predictions on when this technology will be viable for businesses or home use vary widely, with some estimates putting it within the next three to five years and others at 10 to 20 years.

My problem with predictions like these is that they are too binary. They seem to assume that one day the technology is completely non-viable and the next day it’s viable, as though there were one hard-coded finish line that’s the same for everyone. Those who predict three to five years probably have a different idea of what constitutes viability than those who predict it will take much longer. These forecasters are looking at different finish lines. Nonetheless, these estimates at least give us a rough approximation of a possible timeline.
We’re cutting it close when considering anticipated availability issues with chip-grade silicon (in only 15 years from the date of this article), but one factor in our favor is the aforementioned exponential rate at which advancements in technology tend to snowball.
DNA storage appears to be an utterly brilliant solution filled with unimaginable possibilities for our future. And that’s exciting.
Will you ever be able to look at your reflection again without considering the mind-blowing gift of your DNA and all the potential contained therein?
By Ed Clark