Altered carbon: Boffins automate DNA storage with decent density – but lousy latency

Scientists in the US, working alongside Microsoft, have managed to encode "hello" into a readable strand of synthetic DNA, using a fully automated data storage system. The unusual apparatus to perform this work was created as a potential first step to eventually bring the technology to data centres. The experiment is described …

  1. Martijn Otto

    Now that Microsoft can add code to your DNA

    They're adding a whole new dimension to the term "Blue Screen Of Death".

  2. Martin Summers Silver badge

    You can just imagine the marketing slogan now.

    AWS: Data Storage is in our DNA.

    I presume Gene therapy is required for any corruption?

    1. Killfalcon Silver badge
      Joke

      Do you think it'll handle relational databases?

      1. Stumpy

        ewww ... would you really want Jeff Bezos' DNA all over your data?

    2. Mike Moyle Silver badge

      "You can just imagine the marketing slogan now."

      "Carry the sum of human knowledge in a pocket of your genes."

  3. Stoneshop Silver badge
    Thumb Up

    An entire data centre the size of a sugar cube?

    “Hey,” he said, “is that really a piece of fairy cake?”

    He ripped the small piece of confectionery from the sensors with which it was surrounded.

    “If I told you how much I needed this,” he said ravenously, “I wouldn’t have time to eat it.”

    He ate it.

  4. Andre Carneiro

    Presumably a molecule of any nucleotide is a lot larger than one electron in, say, a spinning platter on an HDD

    How is it that DNA encoding has densities orders of magnitude higher?

    1. Charles 9 Silver badge

      Because you have to consider the other electrons in there (which naturally repel each other). With DNA storage, you use the entire molecule, not just the most space-consuming part of it.

    2. Jimmy2Cows

      HDDs don't store one bit per electron. Each bit is many, many nanometers wide and long. That's a great many electrons per bit.

      1. Cynic_999 Silver badge

        Conventional HDDs don't use electrons to store data anyway. There are the same number of electrons in a bit of magnetised rust as there were when it was not magnetised.

    3. Filippo

      The platter is 2D. You can stack platters, but that's just a bunch of 2D areas on top of each other, with relatively huge distances between them that don't hold any data. DNA can actually fill a volume with data.

      That said, I strongly doubt that volume can scale much above the microscopic. If you actually packed an exabyte worth of DNA in the volume of a sugar cube, I suspect it would become too entangled to read.

    4. Mat 6
      Boffin

      Natural DNA has 4 nucleotides (ATGC) at any position, so rather than base 2 (binary) it encodes in base 4 (quaternary). Hachimoji DNA, for example, has 8 nucleotides (ATGCPZBS), so goes to base 8. This is without additional modifications. In binary, each ASCII letter takes 8 bits; in base 8 it would take 3 nucleotides. Physically, DNA is several orders of magnitude larger than electrons on an HDD, but multiple possible outcomes at each position increase its storage density.

      Then DNA is also able to pack itself together in 3D incredibly tightly. Each of our nuclei fits 3 billion nucleotides into a space just 6 microns across, around 113 cubic microns. And that packing still allows for read/write access.
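To make the base-4 point concrete, here is a minimal sketch of packing bytes into nucleotides at two bits per base. The mapping (00=A, 01=C, 10=G, 11=T) and the names `encode`, `decode` and `symbols_per_byte` are assumptions chosen for illustration; the article does not describe the researchers' actual encoding, and real schemes add constraints this sketch ignores.

```python
import math

# Hypothetical mapping: each nucleotide carries two bits (00=A, 01=C, 10=G, 11=T).
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Map each byte to four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Invert encode(): pack every four nucleotides back into one byte."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def symbols_per_byte(alphabet_size: int) -> int:
    """Number of symbols needed to hold 8 bits in an alphabet of the given size."""
    return math.ceil(8 / math.log2(alphabet_size))


strand = encode(b"hello")
print(strand, len(strand))
print(symbols_per_byte(2), symbols_per_byte(4), symbols_per_byte(8))
```

Under this assumed mapping, five ASCII characters need twenty bases, and an eight-letter alphabet would cut that to three symbols per byte, as the comment says.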

      1. Jonathan Richards 1

        Base of bases

        Natural DNA has 4 nucleotides ATGC at any position so rather than base 2 (binary) it encodes in Base 4

        Well, not exactly, AIUI. The encoding in DNA is in terms of base pairs. The pyrimidines (cytosine and thymine) pair with the purines (guanine and adenine, respectively). The two strands of DNA are complementary, and RNA polymerase reads the code by unlinking the strands and generating the complementary RNA. In nature, the base pairs are read as triplets, with a mapping between the possible triplet patterns and the exact amino acid to be added to the growing protein product.

        One doesn't have to replicate (ha!) this triplet encoding for non-biological purposes, but still at any given position on the double strand of "natural", i.e. ATCG DNA, there is either an AT pair, or a GC pair. I think that makes natural DNA a binary information store, unless there is a clever way to read only one specific strand.

        Further reading on Chargaff's Rule, etc.: Principles of Biochemistry

        1. Voyna i Mor Silver badge

          Re: Base of bases

          Sadly you are incorrect. Binary encoding would allow only 8 amino acids. There are 20 in the standard genetic code, two of which (tryptophan and methionine) have only a single representation. So base 4 encoding it is, in principle, but some amino acids have as many as 6 alternative sequences. There are 3 stop triplets. It doesn't, to say the least, look "intelligently designed".
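The counting argument in this reply can be checked by simply enumerating triplets. This is a small sketch; the helper name `count_triplets` is made up for illustration.

```python
from itertools import product


def count_triplets(alphabet: str) -> int:
    """Count the distinct three-symbol words over the given alphabet."""
    return sum(1 for _ in product(alphabet, repeat=3))


quaternary = count_triplets("ACGT")  # four possible bases per position
binary = count_triplets("01")        # only two distinguishable states per position

print(quaternary, binary)
```

A two-state position gives only eight triplets, which is too few to cover the amino acids plus stop signals, whereas four states give sixty-four, leaving room for the redundancy described above.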

          1. Simon Harris Silver badge
            Flame

            Re: Base of bases

            "There are 3 stop triplets."

            Are there any HCF triplets?

  5. Doctor Syntax Silver badge

    There was no indication when or if this will be put to commercial use.

    Apart from any other consideration, let's think what DNA contains: a sugar (the "-ose" at the end of "ribose" is a clue), some nitrogen and some phosphorus. A selection of some of the most important elements in metabolism. A single bacterium could multiply itself through your sugar-cube size DC in a few hours so you end up mostly decoding bacterial DNA to convert it back into bits.

    1. Thoguht Silver badge

      My worry would be not so much that some micro-organism could damage the data store, but that the data store itself could be a micro-organism. Suppose someone's database when placed in this DNA storage just happened to encode for, say, an airborne virus with 100% lethality to human beings. Wouldn't that mean any large-scale DNA data stores would have to be kept inside the highest possible level of biosecurity?

      1. Killfalcon Silver badge

        That's not directly possible. DNA by itself does nothing - you need all the array of cellular components (ribosomes and mitochondria and all that stuff) to go from DNA to an organism.

        Basically, all DNA is is a recipe list for proteins. A cookbook cannot become a macdonalds and give everyone heart disease.

        There's an outside chance that a bacterium might get in there and read the database like it was its own DNA (maybe? I guess? Seems unlikely, given that bacteria don't spontaneously become other bacteria when they eat 'em, as far as I know) but as already pointed out this stuff will need to be kept in suuuuper clean conditions to stop micro-scale wildlife eating it anyway.

        1. Simon Harris Silver badge

          "A cookbook cannot become a macdonalds and give everyone heart disease."

          According to my mum, when I was very small I used to lick her cookery books to see what the food was going to taste like.

          1. Rich 11 Silver badge

            If you want to see what the world's oldest recipe tastes like, there are several cuneiform clay tablets in museums around the world describing how to make beer. But I think you would have to ask the curator very politely if you wanted to lick one.

            1. Dr Dan Holdsworth Silver badge

              Of course the fun starts when someone discovers that some existing DNA looks quite like degraded data storage, runs a decode on it and uncovers extremely ancient data from before the K/T extinction event. Or, for that matter, from before the Great Oxygenation Event...

              1. Voyna i Mor Silver badge

                We have loads of DNA that originated from before K/T. Chordates were well advanced by then and the earliest mammals were around, from whom we are descended.

                I guess too we will have RNA from before oxygenation. Muscles can carry out anaerobic respiration for short periods and that may well originate from very ancient processes.

              2. Grinning Bandicoot

                Eric Frank Russell already went there in the mid 60s. As usual with his twist it's humorous but leaves the strange mental aftertaste of 'what if'.

        2. ibmalone Silver badge

          Seems unlikely, given that bacteria don't spontaneously become other bacteria when they eat 'em, as far as I know

          It's not quite the same thing, but bacteria can exchange DNA with each other through horizontal gene transfer. Maybe it'd be wise to use an encoding scheme that prevented replicons appearing, but chances of random data encoding for something meaningful are likely to be in the bardic simian typist range.

        3. Charles 9 Silver badge

          Then how do prions operate, seeing as they're able to malform related proteins by their mere presence?

          1. Francis Boyle Silver badge

            There's a big difference between flipping a molecule to a new folded state and manufacturing a completely new protein, as big as say, kicking a wall over and building it from scratch.

          2. Jonathan Richards 1
            Boffin

            Sequence !== structure

            Yes, what Mr Boyle says. To illuminate further, a lot of a protein's ability to do what it does (mostly, they're enzymes) depends critically on its folded-up structure in 3-D. What the gene encodes for is simply its "2-D" amino acid sequence. It seems that prions can catalyze the transformation of specific proteins into a different, defective, folded structure which nonetheless has the same amino acid sequence.

    2. Fred Flintstone Gold badge

      You could also be at risk from staff being on a diet (or horses, but they tend to be relatively rare in data centres).

      :)

      1. Anonymous Coward

        horses in data centers

        Oh, so that's why that deer was running around the DC in that Reddit video, he was just ahead of the storage trends.

    3. Simon Harris Silver badge

      If it's sugar, why does it taste salty and slimy?

      https://boingboing.net/2017/10/14/what-does-dna-taste-like-an-i.html

      (yes, it is safe for work)

  6. Anonymous Coward

    I thought I was reasonably smart

    Or at least not stupid, but this work confuses me more than anything else I've read about recently.

    Writing data to DNA strands: I mean, if that's not straight out of the future then nothing is.

    1. Korev Silver badge
      Boffin

      Re: I thought I was reasonably smart

      DNA synthesis machines have been available for years

    2. Anonymous Coward Silver badge

      Re: I thought I was reasonably smart

      Some of what used to be futuristic is now historic.

  7. Robert Helpmann?? Silver badge
    Childcatcher

    Basic vs Practical Research

    This medium would seem to have promise, but I doubt it will be used in a data center. It is never going to be fast, which means it would only be useful for long-term storage. There are a number of solutions, though, that will last longer than this is likely to and are most likely faster than it ever will be. I wouldn't look to this to revolutionize the storage world. At the same time, developing this might lead to new technologies that will allow us to do things we cannot today, haven't even considered... or not. The point, I think, is to learn what is possible first and figure out a real world use for it later. Good luck to the researchers!

  8. Queeg
    Trollface

    OMG

    The Data Center's been hacked.

    https://static.guim.co.uk/sys-images/Guardian/About/General/2012/6/12/1339470668498/sugar-cubes-006.jpg

    1. Fred Flintstone Gold badge

      Re: OMG

      Sweet!

  9. chris 143

    Reminds me of

    https://en.wikipedia.org/wiki/IBM_1360

    Impressive density (for the time) but useless for data that changes.

  10. ashdav
    Joke

    Decode the human genome

    It would be interesting to put the human genome through this, turn it into binary and see if it came up with 42.

    1. Voyna i Mor Silver badge

      Re: Decode the human genome

      Which human genome? We aren't all identical.

      ...checks...

      No, I definitely don't look like Simon Jones. I'm not at risk from passing mice.

  11. 89724102172714182892114I7551670349743096734346773478647892349863592355648544996312855148587659264921

    "Altered Carbon"

    The TV adaptation was a travesty.

  12. vtcodger Silver badge

    The University of Washington and Microsoft research team behind this latest experiment previously said DNA-based storage could fit the contents of an entire data centre into a sugar cube-sized unit ...an MIT startup called Catalog said it was designing a machine that could write a terabyte of data a day

    It's cute. And if it ever comes to fruition, I'm sure that there will be applications. But allow me to point out:

    . You can order sugar cube sized USB flash drives off the shelf today as well as 2TB units, although the latter look to be closer to small coffee cup than sugar cube size.

    . Existing storage technologies are moving forward steadily with regard to capacity and speed. By the time DNA storage actually is usable, semiconductor competitors will likely be much more compact, faster, and greater in capacity than today.

    . DNA can develop defects -- cancer and mutations are the subject of huge amounts of medical research. Might want to include ECC in your DNA storage technology.

    . DNA is pretty sturdy. It doesn't last forever. But a few decades/centuries/millennia is probably good enough for most purposes. Here's a link to the Wikipedia article https://en.wikipedia.org/wiki/Ancient_DNA

    1. Pascal

      "DNA can develop defects -- cancer and mutations are the subject of huge amounts of medical research. Might want to include ECC in your DNA storage technology."

      I only very vaguely know what I'm talking about, but DNA mutations only occur on replication / during cell division or somesuch, right, not "at rest" on its own? Or, I assume, without external influence like radiation etc., which this storage would be protected against?

      1. Doctor Syntax Silver badge

        As far as I can see reading it will destroy it so there'd be a need to replicate pre-read or create a new strand after reading. If I'm correct it poses a further issue. How do you verify what you wrote?

        1. vtcodger Silver badge

          How do you verify what you wrote?

          Just like a disk or cloud file or a stack of punched cards for that matter -- parity, checksum, ECC, hash, etc., etc., etc.? Or maybe you can deep fat fry it, then taste it? If it tastes off, or makes you sick or causes your toenails to fall off, assume it has been corrupted.
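The hash option mentioned here works the same way whatever the medium. As a rough sketch, assuming the payload is stored with a SHA-256 digest in front of it (the helper names `with_digest` and `verify` are hypothetical, and a real DNA system would more likely use error-correcting codes that can also repair damage):

```python
import hashlib

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def with_digest(payload: bytes) -> bytes:
    """Prefix the payload with its SHA-256 digest before writing it out."""
    return hashlib.sha256(payload).digest() + payload


def verify(record: bytes) -> bytes:
    """Return the payload if its digest matches, otherwise raise ValueError."""
    digest, payload = record[:DIGEST_SIZE], record[DIGEST_SIZE:]
    if hashlib.sha256(payload).digest() != digest:
        raise ValueError("checksum mismatch: data corrupted")
    return payload


record = with_digest(b"hello")
print(verify(record))
```

A digest only detects corruption after a read-back; it does not answer the destructive-read concern raised above, which is why copies would still be needed.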

  13. Scott Broukell

    So presumably in the (distant) future data centres will be able to send off small samples of their data sets and discover their ancestry !?

    1. Stumpy

      Isn't that called Blockchain?

  14. Smooth Newt
    Meh

    Long lasting

    I am sceptical about the "long lasting" claim. DNA is so fragile that almost none of it survives more than a few years even though it is almost never exposed to anything approaching an "extreme environment". Only a very, very tiny proportion of all the DNA created actually survives for thousands of years.

    1. DougS Silver badge

      Re: Long lasting

      Presumably you would keep your "data center in a sugar cube" in conditions ideal for the longevity of DNA, and since it is so small have lots of backups. The reason little survives is because DNA rarely ends up in ideal conditions.

      For bonus points, create a life form with a LOT of junk DNA you can use as scratch space, and put it in a zoo. With enough redundancy you can get your data back from one of its descendants :)

  15. Mike Moyle Silver badge

    I'm picturing a computer virus that codes an actual virus that unzips the DNA, wiping your storage.

  16. Great Bu

    Bad news, I'm afraid....

    Sorry sir, I'm afraid we are unable to access your data as it has hard drive cancer.....

  17. Voyna i Mor Silver badge
    Boffin

    Good news and bad news

    DNA storage will turn out to be perfectly technically feasible but the random access read time will be 9 months.

    1. Anonymous Coward

      Re: DNA storage will turn out to be perfectly technically feasible

      I actually tried this. I thought I would be able to store and print some DNA info, but it turned out the printer had its own ideas and thought I was just supplying the toner.

      The resulting hybrids will probably enable some information continuity, but I wouldn't really call them backups.

  18. Milton Silver badge

    Sounds bit 'pure science'

    I don't want to diss efforts like this—there's always something to be learned, and I am not a biologist—but: is this ever going to be a practical route to high-density data storage? There is so much work going on with silicon-based, plasmonic, photonic and holographic approaches, many of them offshoots of developing nanotech ideas, that it's arguably likely that we'll eventually be able to read/write data at the molecular scale anyway. It may well be a failure of the imagination on my part, but while I think humanity may do some remarkable things with DNA (both wonderful and horrible, almost certainly), routinely stuffing petabytes of data into a vial of the stuff for ve-e-ery slo-o-ow retrieval just doesn't seem that likely. Perhaps it'll be a niche product for spies and smugglers?

    (Didn't Friday have a special pouch concealed behind her navel ...?)

    1. Intractable Potsherd Silver badge

      Re: Sounds bit 'pure science'

      Up vote for the sneaky RAH reference!

  19. Cynic_999 Silver badge

    New language

    There will no longer be any fake news. Instead events will be genetically modified.

  20. Rol Silver badge

    Perplexing question?

    The article isn't clear as to how the data is read, but I'm guessing it is destructive so here's the dilemma.

    Do we seek a biological way to replicate the data before it is destructively read - introducing the possibility of a random mutation, or recreate it from the read data - introducing the possibility of non-random editing over the thousands of years to come?

    You know. I'm now looking at blockchain in a new light.

  21. Schultz

    Anybody cared to check...

    If they actually developed any new technology, or if they just strung together DNA synthesizers with DNA sequencers, both of which are quite established technology and heavily automated / computer controlled already?

    Or is the fact that an article appeared in a renowned scientific journal sufficient to swallow any hyperbole?

    Now, just to douse your enthusiasm a bit more, stop a moment to think on (1) what quantity of DNA molecules they synthesize, and (2) what quantity / volume of chemicals is required to write and read the information. Maybe factor that into the storage density equation before comparing it to actually functioning and commercially available data storage systems.

Biting the hand that feeds IT © 1998–2019