Simon Barnett

16 Dec, 30 tweets, 9 min read

A recent publication by Dennis Lo et al applied long-read sequencing (LRS) in the prenatal screening (#NIPT) setting. It's a rather unorthodox technology/application pairing, and it's got me scratching my head a bit.

Open Access Link:

pnas.org/content/118/50…

https://twitter.com/sbarnettARK/status/1303818412657909762

For context, earlier this year, Lo et al published a convolutional neural network ("the HK model") that enabled PacBio LRS devices to read methylation (5mC) across the entire genome with very high fidelity. This is important later.

What's methylation?

https://twitter.com/sbarnettARK/status/1303818412657909762

PDF of HK Model Paper:
pnas.org/content/pnas/1…

I'll summarize my main takeaways from the current paper and end with some of my open questions/concerns.

1.
The authors showed the presence of a large amount of long (>500 bp) cell-free #DNA in maternal plasma. We likely have been systematically underestimating the presence of long cfDNA because short-read #NGS is the predominant method used in NIPT.

1. (Cont.)

The fact that long cfDNA is present in these quantities is interesting by itself. While I'm not sure if this discovery will give rise to new diagnostic applications, it will undoubtedly result in new biological learnings. I'm excited about this.

2.

The authors discovered that sufficiently long cfDNA fragments (>1.8 kb) carry enough information that LRS instruments can determine the fragments' tissue-of-origin (TOO) at a single-molecule level. Using the HK Model, they showed a TOO AUC of 0.89.

2. (Cont.)

In English, this means that LRS can tell whether single molecules of DNA came from mom or baby. This could be helpful in the event that the cfDNA fragment harbors an informative mutation and we're trying to figure out whether the baby has inherited it.

3.

The ends of short and long (>500 bp) cfDNA fragments are unique. When I say 'ends', I mean the first and last four letters of each cfDNA fragment. These are called 4-mers, and there are 256 (4^4) possible sequences. See how different they are below:

3. (Cont.)

Okay, so what? Well, the authors hypothesized that changes in the relative abundances of these 4-mers could be used as a biomarker to detect a fairly serious pregnancy complication called preeclampsia. I'll explain the results before discussing preeclampsia.
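
To make the idea of "relative abundances of 4-mers" concrete, here is a minimal sketch in Python. It follows the description in this thread (first and last four letters of each fragment) and is not the authors' pipeline; the function name and the toy fragments are made up for illustration, and strand orientation is ignored, which a real analysis would likely need to handle.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def end_motif_frequencies(fragments, k=4):
    """Return the relative abundance of every possible k-mer end motif.

    Hypothetical helper: takes the first and last k letters of each
    fragment, as described in the thread, and skips fragments that are
    too short or ends that contain non-ACGT characters.
    """
    counts = Counter()
    for seq in fragments:
        seq = seq.upper()
        if len(seq) < k:
            continue
        for motif in (seq[:k], seq[-k:]):
            if set(motif) <= set(BASES):
                counts[motif] += 1

    total = sum(counts.values())
    # Enumerate all 4**k motifs so absent motifs show up with frequency 0.
    all_motifs = ["".join(p) for p in product(BASES, repeat=k)]
    return {m: (counts[m] / total if total else 0.0) for m in all_motifs}


# Toy fragments, invented purely for illustration.
toy_fragments = ["CCCAGGTTAA", "CCCATTTTGG"]
freqs = end_motif_frequencies(toy_fragments)
print(len(freqs))     # 256 possible 4-mers
print(freqs["CCCA"])  # 0.5 (2 of the 4 fragment ends)
```

A classifier would then take a 256-value profile like this per sample as its input features; how the authors actually built theirs is not covered here.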

3. (Cont.)

Based on the authors' classifier, they showed perfect discrimination (AUC = 1) between cases and controls, albeit in a very small patient population (n=20). Obviously this is a very good result, but I'm not resting much on it given the study size.
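
For anyone unsure what AUC = 1 means in practice: AUC is the probability that a randomly chosen case scores higher than a randomly chosen control. The sketch below computes it by brute-force pairwise comparison. The scores are invented for illustration and have nothing to do with the paper's data; with a small cohort there are few case/control pairs, so a perfect score is easier to hit by chance.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties count as half a win. Equivalent to the Mann-Whitney U statistic
    divided by the number of pairs.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical classifier scores, NOT from the paper.
print(auc([0.9, 0.8], [0.1, 0.2]))  # 1.0: every case outranks every control
print(auc([0.9, 0.3], [0.5, 0.1]))  # 0.75: one of four pairs is mis-ranked
```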

These were my three main takeaways from this paper (so far), though the authors did also show methods to deduce maternal inheritance and detect monogenic disorders. I'm fuzzier on whether there's a technical leap in these last two areas, so I'll hold off until I know more.

I'll switch gears to talk a bit about some potential advantages and obstacles with using LRS in the NIPT setting. I'd appreciate any and all feedback as I work through these.

First, I'll talk about preeclampsia. You can read some fast facts here:

marchofdimes.org/complications/…

While the disorder isn't rare, I'm not sure what the cost-utility of the test would be. It seems like the only remedies are low-dose aspirin or giving birth, which may or may not be an option depending upon the stage of pregnancy.

The current diagnostic paradigm seems lackluster (do you have high blood pressure, protein in your urine, or other non-specific symptoms after week 20?). While more accuracy/earlier detection would be great, these non-specific symptoms are SUPER cheap to detect.

Meanwhile, there's an up-and-coming (multianalyte) test being commercialized by Progenity called Preecludia which seems to have set the new standard for performance. Link below:

progenity.com/innovation/pre…

Another fact to consider is that LRS would add a LOT of cost to a test in the prenatal setting, which as a market is very price-sensitive, with many patient-pay fees hovering below $200. Based on the Lo et al paper, it seems a single sample would require a ~$2000 SMRT cell.

Then again, the method in the paper isn't optimized at all. In fact, most sequence reads were FAR below the optimal length for PacBio sequencing (which is ~20kb). The molecules are way over-sequenced, which adds cost but yields only small returns on data quality.

Since the median length for cfDNA eligible for TOO analysis was ~1.8kb, I'd reckon an easy hack would be to re-run this protocol through PacBio's new programmable #concatenation method which stitches together small molecules into big loops:

biorxiv.org/content/10.110…

Even with this, I'm not quite sure about the economics of the preeclampsia application. Zooming out, though, what other applications might extend from these new discoveries? Might they support LRS having a presence in the NIPS/T market? Maybe!

Here's where I may need some outside input. Would a single-molecule method be able to have a lower limit of detection, that is, the ability to generate high-quality data earlier in pregnancy? Could one get away with more shallow coverage?

If the market is really only concerned with trisomies/aneuploidies, is there any real (market) benefit for being able to detect more monogenic disorders at an early stage? I'm not quite convinced this is the case (yet).

While the authors used PacBio, I see no reason why nanopore couldn't also be used here, as the technique is fragment-length independent. In fact, if the library consists of both short and long pieces of cfDNA, the prep may be easier on nanopore.

Then again, I'm not quite sure how important accuracy would be to this application, but I think it's probably feasible that both flavors of LRS could be used here.

To summarize, this paper shows that:

1. There's long cfDNA in maternal plasma and much more of it than we thought. What was an unknown unknown is now a known unknown.

2. LRS instruments can divine the tissue-of-origin at a single molecule level using methylation.

3. End motifs, especially those of long fragments, could be a potential avenue for biomarker discovery in the prenatal (#NIPT) setting.

Beyond that, this paper is more of a launching pad for further inquiry/investigation (in my opinion).

I wish I knew how important the small and large fragments were/are to the preeclampsia classifier. Could you get away with only building a large fragment library and not wasting PacBio ZMWs on tiny fragments? If so, that makes this more reasonable.

You can get 40 million (~1kb) CCS reads on one SMRT Cell 8M ($2,000) using programmable concatenation. The paper suggests ~2.4M CCS reads / sample, though only ~11% of those are >1kb.

That's 264,000 large cfDNA reads / sample?

So, 40 million / 264,000 ≈ 151 samples / SMRT Cell 8M, or roughly $13 / sample (consumables only).

That seems reasonable(ish)?

Then again, that's just large fragments and doesn't fit the design of this paper.
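
Here is the back-of-envelope math above as a small Python sketch, so the inputs are easy to swap. Every number comes from this thread (40 million reads per SMRT Cell 8M with concatenation, $2,000 per cell, ~2.4M CCS reads per sample, ~11% of them >1kb); the function and parameter names are my own, and it assumes perfect multiplexing with no overhead or lost yield.

```python
def cost_per_sample(
    cell_cost_usd=2000,           # SMRT Cell 8M price quoted in the thread
    reads_per_cell=40_000_000,    # ~1kb CCS reads per cell with concatenation
    ccs_reads_per_sample=2_400_000,
    long_read_percent=11,         # share of reads >1kb, in percent
):
    """Estimate consumables cost per sample if only long reads are sequenced."""
    # Integer math avoids float noise: 2,400,000 * 11 / 100 = 264,000
    long_reads_per_sample = ccs_reads_per_sample * long_read_percent // 100
    # Whole samples that fit on one cell (ideal multiplexing assumed)
    samples_per_cell = reads_per_cell // long_reads_per_sample
    return long_reads_per_sample, samples_per_cell, cell_cost_usd / samples_per_cell


long_reads, samples, usd = cost_per_sample()
print(long_reads)  # 264000
print(samples)     # 151
print(f"${usd:.2f} per sample (consumables only)")
```

Changing `long_read_percent` or `reads_per_cell` shows how sensitive the per-sample figure is to the fraction of usable long fragments and to real-world concatenation yield.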

Moreover, my understanding is that concatenation is harder and more error-prone with very small fragments. Perhaps this could be optimized for smaller fragments in the future, but anything below 1kb I'm thinking is ineligible to put into a concatenated molecule.

@GenomicsCow Would really appreciate your thorts here re: carrier screening v. NIPS/T. This one isn’t clicking for me.
