This new paper by @luisceze et al. is truly mind-blowing and is going to be a landmark in the DNA storage domain.

biorxiv.org/content/10.110…

Why?🤔
A thread!
👇👇👇
DNA Storage has gained interest since the seminal work of @Nick_Goldman, @ewanbirney, @srikosuri, @geochurch in early 2013.

The main promise of DNA storage systems is that the volumetric density of DNA is 4-5 orders of magnitude better than traditional media.
But there are several problems with DNA Storage:
A. DNA as a medium is super expensive, about 5 orders of magnitude more expensive than traditional media.
B. Writing is freaking slow.
C. Reading bandwidth is super low. In fact, back in #AGBT18, we showed that the world's sequencing capacity yields about 1 Petanuc/day, which is 25 Terabytes per day. And this is assuming that we raided every genome center and got access to every Illumina machine.
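To make the bandwidth numbers concrete, here is a small back-of-the-envelope sketch. It only uses the two figures quoted above; the function names are made up for illustration, and the thread does not say which conversion was used to go from nucleotides to bytes. The 2 bits per nucleotide ceiling is simply log2(4) for a four-letter alphabet. Any explanation for the gap between the two numbers (sequencing coverage, coding overhead, etc.) would be an assumption on my side.

```python
# Figures quoted in the thread.
NUCLEOTIDES_PER_DAY = 1e15   # "1 Petanuc/day"
BYTES_PER_DAY = 25e12        # "25 Terabyte per day"

# Theoretical ceiling for a four-letter alphabet: log2(4) = 2 bits per nucleotide.
MAX_BITS_PER_NT = 2.0


def implied_bits_per_nt(nt_per_day: float, bytes_per_day: float) -> float:
    """Effective information rate implied by the two quoted figures."""
    return bytes_per_day * 8 / nt_per_day


def ideal_bytes_per_day(nt_per_day: float, bits_per_nt: float = MAX_BITS_PER_NT) -> float:
    """Bytes per day if every sequenced nucleotide carried bits_per_nt bits."""
    return nt_per_day * bits_per_nt / 8


implied = implied_bits_per_nt(NUCLEOTIDES_PER_DAY, BYTES_PER_DAY)
ideal = ideal_bytes_per_day(NUCLEOTIDES_PER_DAY)

print(f"implied rate: {implied:.2f} bits per nucleotide")
print(f"ceiling at 2 bits/nt: {ideal / 1e12:.0f} TB per day")
```

So the quoted 25 Terabytes per day corresponds to roughly a tenth of the theoretical ceiling for the same number of sequenced nucleotides.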

OK: enough with the intro. Now to the paper!
@luisceze et al. highlighted another advantage of DNA storage: in-memory computation. Tape/HD/SSD are cheap but are also pretty dumb. If you want to do any computation, you must move the data to your CPU/GPU because beyond data storage, they are pretty useless.
@luisceze et al. had a brilliant idea! They encoded pictures on DNA molecules so similar pictures will end up close to each other in the DNA sequence hybridization space 🤯.

How did they do that? First, they used a neural net to extract features from the pictures and encode the features in DNA sequences. Then, they used another NN that learned the hybridization space and informed the first net whether similar photos hybridize together.

The sequences of the features are then synthesized into oligos that also include the ID of the photo.
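To get a feel for "similar pictures get similar sequences", here is a toy sketch. It is NOT the paper's method: the paper uses trained neural nets, while this swaps in random-hyperplane hashing as a stand-in so the example stays tiny. All names (`make_hyperplanes`, `features_to_dna`, `id_to_dna`, `build_oligo`), the bit-to-base mapping, and the 8-base ID length are hypothetical choices for illustration.

```python
import random

BASES = "ACGT"


def make_hyperplanes(dim, n_bits, seed=0):
    """Fixed random hyperplanes (a stand-in for a learned encoder)."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]


def features_to_dna(features, hyperplanes):
    """One sign bit per hyperplane, then two bits per nucleotide.

    Assumes an even number of hyperplanes. Vectors pointing in similar
    directions share most sign bits, so their sequences differ in few bases.
    """
    bits = [
        1 if sum(w * x for w, x in zip(plane, features)) >= 0 else 0
        for plane in hyperplanes
    ]
    return "".join(BASES[2 * bits[i] + bits[i + 1]] for i in range(0, len(bits), 2))


def id_to_dna(image_id, length=8):
    """Write a non-negative integer ID in base 4 using A, C, G, T as digits."""
    digits = []
    for _ in range(length):
        digits.append(BASES[image_id % 4])
        image_id //= 4
    return "".join(reversed(digits))


def build_oligo(image_id, features, hyperplanes):
    """Feature-derived region followed by the photo ID region."""
    return features_to_dna(features, hyperplanes) + id_to_dna(image_id)


planes = make_hyperplanes(dim=3, n_bits=20)
oligo = build_oligo(5, [1.0, 2.0, -0.5], planes)
print(oligo, len(oligo))
```

The point of the sketch is only the layout: one region that depends on image content, so it can take part in similarity-driven hybridization, and one region that identifies the photo once the strand is sequenced.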
So how do we search?
Simple! Take your photo of interest and produce a biotin-labeled oligo that represents its reverse complement. Let this oligo hybridize with your pool of DNA strands that represents the images (impressively, they had 1.6M different images in the pool).
Get the oligo using magnetic beads with streptavidin and sequence the DNA oligos it captures.
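The search steps above can be mimicked in silico with a very crude sketch. Big caveat: real hybridization is governed by thermodynamics (temperature, salt, secondary structure), and here I just count matching bases as a proxy. The pool contents, the `min_fraction` threshold, and the function names are all hypothetical, and the sequences are made up rather than taken from the paper.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def paired_fraction(probe, target):
    """Fraction of target bases that would pair with the probe.

    A probe pairs perfectly with a target when the probe equals the
    target's reverse complement, so we compare position by position.
    """
    expected_target = reverse_complement(probe)
    matches = sum(a == b for a, b in zip(expected_target, target))
    return matches / len(target)


def search(query_seq, pool, min_fraction=0.8):
    """Return IDs of pool strands the probe would 'capture', best first."""
    probe = reverse_complement(query_seq)  # plays the role of the biotin-labeled oligo
    hits = []
    for image_id, target in pool.items():
        score = paired_fraction(probe, target)
        if score >= min_fraction:  # stands in for bead pull-down
            hits.append((score, image_id))
    return [image_id for _, image_id in sorted(hits, reverse=True)]


# Made-up feature sequences keyed by made-up photo IDs.
pool = {
    "cat_1": "ACGTACGTAC",
    "cat_2": "ACGTACGTAA",
    "dog_1": "TTGCATGCCA",
}
print(search("ACGTACGTAC", pool))  # prints ['cat_1', 'cat_2']
```

In the wet lab, the "return value" is whatever comes off the beads and gets sequenced; the ID region of each captured strand tells you which photos were retrieved.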

They tried to find images of cats using this DNA search. The top results are below! More results in the paper, which is a must read.
So why is this so important?
Because DNA cannot compete on price with tape/CD/etc.

But if we can use DNA to move computation to data, we don't need to compete on price. Moreover, reading bandwidth does not matter, because in many data-intensive applications, we are interested only in the final answer, which is usually a low-dimensional vector, rather than really reading all the data. Thus, if we can compute in memory, we simply don't care about reading the entire database with sequencers.

n/n
Keep Current with Yaniv (((Erlich)))
