
Lior Pachter

@lpachter

Jan 19, 2023 • 23 tweets • 13 min read • Read on X

Interested in "integrating" multimodal #scRNAseq data? W/ @MariaCarilli, @GorinGennady, @funion10 & Tara Chari we introduce biVI, which combines the scVI variational autoencoder with biophysically motivated bivariate models for RNA distributions. 🧵 1/
biorxiv.org/content/10.110…

https://twitter.com/anshulkundaje/status/1417648380801556486

One of the clearest cases for "integration" is in combining measurements of nascent and mature mRNAs, which can be obtained with every #scRNAseq experiment. Should "intronic counts" be added to "exonic counts"? Or is it better to pick one or the other?

This important question has been swept under the rug. Perhaps that is because it is inconvenient to have to rethink #scRNAseq with two count matrices as input, instead of one. How does one cluster with two matrices? How does one find marker genes with them? 3/

One approach could be to adapt a method such as totalVI (@adamgayoso et al.), which is an integration method built on scVI for CITE-seq data. nature.com/articles/s4159…
Such a method could take as input two matrices, but would not utilize the biological relationship between them. 4/

There is a way. Figure 1 of our preprint summarizes the idea. But first let's talk about adapting scVI to work like totalVI, except for nascent and mature mRNA counts. This is what is shown in panel (a) of our Figure 1. 5/

"Generative" means that scVI learns (via a neural network) parameters for negative binomial distributions modeling gene counts. This is useful in practice (see nature.com/articles/s4158…); however, the use of negative binomial distributions reflects a supposition about the data. 6/
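
To make the negative binomial supposition concrete, here is a minimal sketch in Python of the log-probability of a single gene count in the mean/inverse-dispersion parameterization. The function name `nb_log_pmf` and the symbols `mu` and `theta` are illustrative choices for this note, not identifiers taken from scVI, and the numbers are made up. In a model of this kind, a neural network would output such parameters for each cell and gene.

```python
import math

def nb_log_pmf(x: int, mu: float, theta: float) -> float:
    """Log-probability of count x under a negative binomial with
    mean mu and inverse dispersion theta (variance = mu + mu**2 / theta)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )

# Illustrative values only: a gene with mean 5 and inverse dispersion 2.
probs = [math.exp(nb_log_pmf(x, mu=5.0, theta=2.0)) for x in range(2000)]
total = sum(probs)                                # should be very close to 1
mean = sum(x * p for x, p in enumerate(probs))    # should be very close to mu
print(total, mean)
```

Nothing in this formula refers to transcription, processing, or degradation, which is the sense in which the distribution is only a label.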

The supposition lacks a mechanistic rationale, i.e. there is no interpretation of the negative binomial distributions; they merely label a black box. One may not care to make the black box transparent, though giving it meaning is necessary for "integration". But how? 7/

@GorinGennady has been thinking about the bursty model of transcription and how it relates to #scRNAseq data for several years (see, e.g., sciencedirect.com/science/articl…). His idea was to use a variational autoencoder to parameterize distributions arising from a chemical master equation (CME). 8/

The model we used is depicted in panel (b). A telegraph model driving bursty transcription generates nascent mRNAs that are processed and subsequently degraded. 9/
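
For readers who want to see such a model in motion, here is a rough Gillespie-style sketch in Python of a bursty model with nascent and mature mRNAs. It is not the code from the biVI repository. It assumes the common bursty limit of the telegraph model (bursts arrive at rate `k`, burst sizes are geometric with mean `b`, processing happens at rate `beta`, degradation at rate `gamma`), and all names and parameter values are made up for illustration.

```python
import math
import random

def simulate_bursty_cell(k, b, beta, gamma, t_end, rng):
    """Gillespie simulation of one cell; returns (nascent, mature) at t_end.

    Reactions: bursts arrive at rate k and each adds a geometric number of
    nascent mRNAs (mean b); each nascent mRNA is processed into a mature
    mRNA at rate beta; each mature mRNA is degraded at rate gamma.
    """
    nascent, mature, t = 0, 0, 0.0
    p = 1.0 / (1.0 + b)  # geometric on {0, 1, 2, ...} with mean b
    while True:
        rates = (k, beta * nascent, gamma * mature)
        t += rng.expovariate(sum(rates))  # waiting time to the next reaction
        if t > t_end:
            return nascent, mature
        reaction = rng.choices((0, 1, 2), weights=rates)[0]
        if reaction == 0:
            # Burst: inverse-CDF sample of the geometric burst size.
            nascent += int(math.log(1.0 - rng.random()) / math.log(1.0 - p))
        elif reaction == 1:
            nascent -= 1  # processing converts nascent to mature
            mature += 1
        else:
            mature -= 1   # degradation

# Hypothetical parameters; each call simulates one independent cell.
rng = random.Random(0)
cells = [simulate_bursty_cell(2.0, 5.0, 1.0, 0.5, 20.0, rng) for _ in range(200)]
mean_nascent = sum(n for n, _ in cells) / len(cells)
mean_mature = sum(m for _, m in cells) / len(cells)
print(mean_nascent, mean_mature)
```

Under these assumptions the steady-state means are k*b/beta for nascent and k*b/gamma for mature counts, so the two printed averages should land in the neighborhood of 10 and 20.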

This brings us to biVI, shown in panel (c). We replace estimates of negative binomial parameters for each gene with mechanistically motivated parameters based on the model described in panel (b). The scVI black box is now an interpretable, open, transparent box. 9/

The math underlying this method is not easy, and requires solving the chemical master equation for a non-trivial model. We did this using another neural network (RHS of panel b). This neural network takes as input parameters and outputs steady state distributions. 10/

A detailed description of how this works is in another preprint with @MariaCarilli and @GorinGennady.

https://twitter.com/lpachter/status/1537853444140126208

In fact, the biVI application motivated the neural network CME solver. We believe it will find many other applications. 11/

To validate our implementation we first tested it on simulated data, where the sim is not just of counts matching a distribution, but a mechanistic sim of nascent and mature mRNAs. The "sanity check" worked well. biVI is much better at recovering the ground truth than scVI. 12/

Notably, "scVI" here is the adaptation of the totalVI framework to work with two count matrices. I.e., to be fair, we are feeding scVI both matrices. Current common practice is to use scVI with one count matrix. Even so, while scVI performed reasonably, it's better to use biVI. 13/

What about biological data? We ran biVI on the BICCN primary motor cortex data that we analyzed in a publication last year, i.e. @sinabooeshaghi et al.: nature.com/articles/s4158… 14/

The information biVI outputs is interesting on several levels. Our Fig. 3 shows that distinct cell types are separated by the inferred parameters, and that novel markers can be detected (markers that are invisible to an analysis that uses both matrices but is not mechanistic). 15/

BTW, running biVI is straightforward. We demonstrate how to do it in a @GoogleColab notebook contributed by @funion10: github.com/pachterlab/CGC…

The repository contains the code for biVI, and also notebooks to generate the figures and results in the preprint. 16/

Note that in addition to implementing the bursty model, we also implemented the constitutive model, and an "extrinsic model" in which production rates are random (drawn from a Gamma distribution). 17/
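
A standard fact helps tie Gamma-distributed production rates back to the negative binomial of tweet 6/: a Poisson count whose rate is Gamma distributed is negative binomial. The Python sketch below checks this numerically. It assumes counts are Poisson given the rate, and the function names and the shape and scale values are illustrative only, not taken from the preprint or the repository.

```python
import math

def poisson_gamma_mixture_pmf(n, shape, scale, steps=50_000, upper=100.0):
    """P(count = n) when count ~ Poisson(rate) and rate ~ Gamma(shape, scale),
    computed by midpoint-rule integration over the rate."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        log_poisson = n * math.log(lam) - lam - math.lgamma(n + 1)
        log_gamma = ((shape - 1) * math.log(lam) - lam / scale
                     - math.lgamma(shape) - shape * math.log(scale))
        total += math.exp(log_poisson + log_gamma)
    return total * h

def negative_binomial_pmf(n, shape, scale):
    """Closed form of the same mixture."""
    log_p = (math.lgamma(n + shape) - math.lgamma(shape) - math.lgamma(n + 1)
             + n * math.log(scale / (1 + scale)) - shape * math.log(1 + scale))
    return math.exp(log_p)

# Illustrative values: shape 2, scale 3 (mean rate 6).
for n in range(4):
    print(n, poisson_gamma_mixture_pmf(n, 2.0, 3.0), negative_binomial_pmf(n, 2.0, 3.0))
```

The two columns should agree to several decimal places, which gives the negative binomial a reading in terms of cell-to-cell variation in production rates.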

In summary, we propose an answer to how one should use "intronic" and "exonic" counts together in a #scRNAseq analysis. There is much more to do: biVI can be extended to include a linear decoder for interpretability of the latent space, as in LDVAE academic.oup.com/bioinformatics… 18/

The underlying mechanistic model can be extended to account for different technical artifacts, as well as to include other modalities, for example protein quantifications as in totalVI. @MariaCarilli will be pursuing these ideas in the future. 19/

There is more work to be done on how to accurately count nascent* and mature mRNAs. A recent preprint w/ @kreldjarn, @DelaneyKSull, @GuillaumOleSan & @pmelsted addresses this with more to come soon. 20/ biorxiv.org/content/10.110… *this terminology simplifies the underlying biology.

Finally, a shout out to @MariaCarilli and @GorinGennady, who led the project. It has been a pleasure to work with them and learn from them. 21/21

For a chemical engineering perspective on this work see

https://twitter.com/GorinGennady/status/1615898214661750785
