Lior Pachter
Jan 19 23 tweets 13 min read
Interested in "integrating" multimodal #scRNAseq data? W/ @MariaCarilli, @GorinGennady, @funion10 & Tara Chari we introduce biVI, which combines the scVI variational autoencoder with biophysically motivated bivariate models for RNA distributions. 🧵 1/
biorxiv.org/content/10.110…
One of the clearest cases for "integration" is in combining measurements of nascent and mature mRNAs, which can be obtained with every #scRNAseq experiment. Should "intronic counts" be added to "exonic counts"? Or is it better to pick one or the other? 2/
This important question has been swept under the rug. Perhaps that is because it is inconvenient to have to rethink #scRNAseq with two count matrices as input, instead of one. How does one cluster with two matrices? How does one find marker genes with them? 3/
One approach could be to adapt a method such as totalVI (@adamgayoso et al.), which is an integration method built on scVI for CITE-seq data. nature.com/articles/s4159…
Such a method could take as input two matrices, but would not utilize the biological relationship between them. 4/
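To make the options concrete, here is a minimal Python sketch of the two naive ways to hand both matrices to a method that expects one. The toy counts and the function names `sum_counts` and `concatenate_counts` are illustrative assumptions, not anything from the preprint.

```python
# Toy count matrices: rows are cells, columns are genes (values are made up).
nascent = [[0, 3, 1],
           [2, 0, 4]]
mature = [[5, 1, 0],
          [7, 2, 9]]

def sum_counts(a, b):
    """Option 1: add nascent and mature counts gene by gene."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def concatenate_counts(a, b):
    """Option 2: keep both, so each cell has 2 * n_genes features."""
    return [row_a + row_b for row_a, row_b in zip(a, b)]

summed = sum_counts(nascent, mature)
stacked = concatenate_counts(nascent, mature)
```

Neither option tells the downstream model that column j of one matrix and column j of the other describe the same gene at two stages of its life cycle, which is the gap described above.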
There is a way. Figure 1 of our preprint summarizes the idea. But first let's talk about adapting scVI to work like totalVI, except for nascent and mature mRNA counts. This is what is shown in panel (a) of our Figure 1. 5/
"Generative" means that scVI learns (via a neural network) parameters for negative binomial distributions modeling gene counts. This is useful in practice (see nature.com/articles/s4158…); however, the use of negative binomial distributions reflects a supposition about the data. 6/
The supposition lacks a mechanistic rationale, i.e. there is no interpretation of the negative binomial distributions; they merely label a black box. One may not care to make the black box transparent, though giving it meaning is necessary for "integration". But how? 7/
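For reference, the negative binomial in question can be written down in a few lines. This is a minimal sketch, assuming the mean / inverse-dispersion parameterization (the names `mu` and `theta` are labels chosen here, and the numbers are arbitrary); it is not code from scVI.

```python
import math

def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial with mean mu
    and inverse dispersion theta (variance = mu + mu**2 / theta)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )

# Arbitrary example parameters for a single gene.
mu, theta = 4.0, 2.0
probs = [math.exp(nb_log_pmf(x, mu, theta)) for x in range(2000)]
total = sum(probs)                                   # should be close to 1
mean = sum(x * p for x, p in enumerate(probs))       # should be close to mu
var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
```

The two numbers `mu` and `theta` fit the counts well, but nothing in the formula says what biological process they describe.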
@GorinGennady has been thinking about the bursty model of transcription and how it relates to #scRNAseq data for several years (see, e.g. sciencedirect.com/science/articl…). His idea was to use a variational autoencoder to parameterize distributions arising from a CME. 8/
The model we used is depicted in panel (b). A telegraph model driving bursty transcription generates nascent mRNAs that are processed and subsequently degraded. 9/
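A rough way to build intuition for this kind of model is to simulate it. The sketch below is a Gillespie simulation under simplifying assumptions made here, not taken from the preprint: bursts arrive at rate `k`, burst sizes are geometric with mean `b`, each nascent molecule is processed at rate `beta`, and each mature molecule is degraded at rate `gamma`. All parameter names and values are illustrative.

```python
import math
import random

def simulate_bursty(k, b, beta, gamma, t_end, seed=0):
    """Gillespie simulation; returns time-averaged (nascent, mature) counts."""
    rng = random.Random(seed)
    log_fail = math.log(b / (1.0 + b))  # log(1 - q), with q = 1 / (1 + b)
    t, nascent, mature = 0.0, 0, 0
    area_n = area_m = 0.0
    while True:
        rate_burst, rate_splice, rate_decay = k, beta * nascent, gamma * mature
        total = rate_burst + rate_splice + rate_decay
        dt = rng.expovariate(total)  # waiting time to the next reaction
        if t + dt >= t_end:
            area_n += nascent * (t_end - t)
            area_m += mature * (t_end - t)
            break
        area_n += nascent * dt
        area_m += mature * dt
        t += dt
        u = rng.random() * total
        if u < rate_burst:
            # Burst: add a geometric number (support 0, 1, ...) of nascent mRNAs.
            nascent += int(math.log(1.0 - rng.random()) / log_fail)
        elif u < rate_burst + rate_splice:
            # Processing: one nascent molecule becomes mature.
            nascent -= 1
            mature += 1
        else:
            # Degradation of one mature molecule.
            mature -= 1
    return area_n / t_end, area_m / t_end

# Long-run averages should approach k*b/beta and k*b/gamma respectively.
mean_nascent, mean_mature = simulate_bursty(
    k=1.0, b=5.0, beta=1.0, gamma=0.5, t_end=20000.0
)
```

The point of the exercise is that nascent and mature counts come out of one process with shared parameters, so their joint distribution carries information that two independent negative binomials would discard.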
This brings us to biVI, shown in panel (c). We replace estimates of negative binomial parameters for each gene with mechanistically motivated parameters based on the model described in panel (b). The scVI black box is now an interpretable, open, transparent box. 9/
The math underlying this method is not easy, and requires solving the chemical master equation for a non-trivial model. We did this using another neural network (RHS of panel b). This neural network takes as input parameters and outputs steady state distributions. 10/
A detailed description of how this works is in another preprint with @MariaCarilli and @GorinGennady.

In fact, the biVI application motivated the neural network CME solver. We believe it will find many other applications. 11/
To validate our implementation we first tested it on simulated data, where the sim is not just of counts matching a distribution, but a mechanistic sim of nascent and mature mRNAs. The "sanity check" worked well. biVI is much better at recovering the ground truth than scVI. 12/
Notably, "scVI" here is the adaptation of the totalVI framework to work with two count matrices. I.e., to be fair, we are feeding scVI both matrices. Current common practice is to use scVI with one count matrix. Even so, while scVI performed reasonably, it's better to use biVI. 13/
What about biological data? We ran biVI on the BICCN primary motor cortex data that we analyzed in a publication last year, i.e. @sinabooeshaghi et al.: nature.com/articles/s4158… 14/
The information biVI outputs is interesting on several levels. Our Fig. 3 shows that distinct cell types are separated by the inferred parameters, and that novel markers can be detected (markers that are invisible even to an analysis that uses two matrices, if that analysis is not mechanistic). 15/
BTW, running biVI is straightforward. We demonstrate how to do it in a @GoogleColab notebook contributed by @funion10: github.com/pachterlab/CGC…

The repository contains the code for biVI, and also notebooks to generate the figures and results in the preprint. 16/
Note that in addition to implementing the bursty model, we also implemented the constitutive model, and an "extrinsic model" in which production rates are random (drawn from a Gamma distribution). 17/
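The difference between these two alternatives can be sketched numerically. The code below assumes (as a reading of the tweet, not a statement of the biVI implementation) that at steady state a fixed production rate `k` gives independent Poisson counts with means `k / beta` and `k / gamma`, and that the extrinsic model draws one Gamma-distributed `k` per cell that is shared by nascent and mature mRNA. Function names and parameter values are hypothetical.

```python
import random

def poisson(rng, lam):
    """Sample a Poisson variate by counting unit-rate arrivals before time lam."""
    count, t = 0, rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count

def sample_cells(rng, n_cells, beta, gamma, rate_sampler):
    """Draw (nascent, mature) counts per cell for a given production-rate sampler."""
    cells = []
    for _ in range(n_cells):
        k = rate_sampler(rng)
        cells.append((poisson(rng, k / beta), poisson(rng, k / gamma)))
    return cells

def fano_and_cov(cells):
    """Return the Fano factor of nascent counts and the nascent/mature covariance."""
    n = len(cells)
    mean_n = sum(c[0] for c in cells) / n
    mean_m = sum(c[1] for c in cells) / n
    var_n = sum((c[0] - mean_n) ** 2 for c in cells) / n
    cov = sum((c[0] - mean_n) * (c[1] - mean_m) for c in cells) / n
    return var_n / mean_n, cov

rng = random.Random(0)
# Constitutive: every cell has the same production rate.
constitutive = sample_cells(rng, 5000, 1.0, 0.5, lambda r: 10.0)
# Extrinsic: production rate is Gamma(shape=2, scale=5), so its mean is also 10.
extrinsic = sample_cells(rng, 5000, 1.0, 0.5, lambda r: r.gammavariate(2.0, 5.0))

fano_const, cov_const = fano_and_cov(constitutive)
fano_ext, cov_ext = fano_and_cov(extrinsic)
```

Under these assumptions the constitutive counts are Poisson-like (Fano factor near 1, no nascent/mature correlation), while the shared random rate makes the extrinsic counts overdispersed and positively correlated.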
In summary, we propose an answer to how one should use "intronic" and "exonic" counts together in a #scRNAseq analysis. There is much more to do: biVI can be extended to include a linear decoder for interpretability of the latent space, as in LDVAE academic.oup.com/bioinformatics… 18/
The underlying mechanistic model can be extended to account for different technical artifacts, as well as to include other modalities, for example protein quantifications as in totalVI. @MariaCarilli will be pursuing these ideas in the future. 19/
There is more work to be done on how to accurately count nascent* and mature mRNAs. A recent preprint w/ @kreldjarn, @DelaneyKSull, @GuillaumOleSan & @pmelsted addresses this with more to come soon. 20/ biorxiv.org/content/10.110… *this terminology simplifies the underlying biology.
Finally, a shout out to @MariaCarilli and @GorinGennady, who led the project, and whom it has been a pleasure to work with and learn from. 21/21
For a chemical engineering perspective on this work see
