In 2019 "Single-cell multimodal omics" was deemed @naturemethods Method of the Year, and since then many new multimodal methods have been published. But are there tradeoffs w/ multimodal omics?
There are a lot of ways to look at this question and we have much to say (long 🧵ahead!). As a starting point let's begin with our Supplementary Figure 4. This is a comparison of (#snRNAseq+#snATACseq) multimodal technology with unimodal technology. Much to explain here: 2/
(a) & (b) show the mean-variance relationship for data from an assay for measuring RNA and TAC (transposase-accessible chromatin) in the same cells. The data is from ncbi.nlm.nih.gov/geo/query/acc.…
Cells from human HEK293T & mouse NIH3T3 were mixed. You're looking at the RNA. 3/
The mouse and human counts both display variance quadratic in the mean, consistent with negative binomial data. The quadratic coefficients are similar. This is also the case in (c) and (d), which show data from the same cell lines but with a different technology called ISSAAC-seq. 4/
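To make "variance quadratic in the mean" concrete: for negative binomial counts with mean μ and dispersion φ, the variance is μ + φμ². The Python sketch below simulates such counts and recovers φ from per-gene means and variances by least squares. The function name, the simulated data, and the fitting approach are illustrative assumptions, not the analysis behind the figure.

```python
import numpy as np

def fit_quadratic_coefficient(counts):
    """Estimate phi in var = mean + phi * mean^2 from a cells x genes matrix.

    Hypothetical helper for illustration: a least-squares fit through the
    origin of (var - mean) against mean^2 across genes.
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean(axis=0)
    var = counts.var(axis=0, ddof=1)
    keep = mean > 0                      # drop genes with no counts
    mean_sq = mean[keep] ** 2
    excess = var[keep] - mean[keep]      # variance beyond the Poisson part
    return float(np.sum(mean_sq * excess) / np.sum(mean_sq ** 2))

# Simulated data (assumption): 5000 cells, 200 genes, true phi = 0.5.
rng = np.random.default_rng(0)
true_phi = 0.5
gene_means = rng.uniform(0.5, 20.0, size=200)
n = 1.0 / true_phi                       # NumPy's (n, p) parameterization
p = n / (n + gene_means)                 # gives mean = mu, var = mu + phi*mu^2
counts = rng.negative_binomial(n, p, size=(5000, 200))

phi_hat = fit_quadratic_coefficient(counts)
print(f"estimated quadratic coefficient: {phi_hat:.3f}")
```

Comparing the fitted coefficient between two assays is one simple way to put a number on "much less noise".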
In (e) and (f) you see what unimodal data looks like. Same cell lines, but assayed with 10x Genomics #scRNAseq (the figures are reproduced from @const_ae and @wolfgangkhuber's recent preprint biorxiv.org/content/10.110…). Much less noise in unimodal data. 5/
Performing an analysis like this is difficult, because it requires apples-to-apples comparison. Currently, most multimodal assays are preprocessed with custom scripts or "pipelines" coupling together the equivalent of water pipes with electricity lines (h/t @sinabooeshaghi). 6/
To perform like-to-like comparisons we had to develop new software that could be used on multiple different assays from different technologies. We focused for now on multimodal single-cell ATAC-seq + RNA-seq, and ended up building a program called snATAK on kallisto bustools. 7/
Now we could compare, say, ISSAAC-seq with SHARE-seq or SHARE-seqv2, or either of them to 10x Genomics Multiome. Or any of these assays to unimodal #scRNAseq or #snRNAseq or #snATACseq. We started by validating snATAK with the widely used Cell Ranger and Cell Ranger ARC tools. 8/
The first column is a comparison of snATAK to 10x's Cell Ranger ARC on 10x Multiome assayed PBMCs. The right column is a comparison of snATAK processing to Cell Ranger on a spatial ATAC-seq dataset (recently published by the @RongFan8 lab nature.com/articles/s4158…). 9/
With overall near identical results (although snATAK outperformed Cell Ranger on the spatial ATAC-seq data) we were ready to assess the multiome tradeoff, at least for ATAC-seq / RNA-seq (for now). BTW, snATAK is memory efficient, can run on @GoogleColab, and is fast. 10/
In a knee plot comparison of 10x ATAC-seq and the ATAC part of 10x Multiome you see that the multiome ATAC has an extra “knee” which is the result of a high load of cells resulting in doublets. In the relevant part, unimodal ATAC-seq outperforms its multiome counterpart. 11/
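For readers unfamiliar with knee plots: barcodes are ranked by their total counts and plotted on log-log axes, and the "knee" is where real cells give way to background. The Python sketch below builds that curve from simulated counts and picks a crude knee as the point farthest from the chord joining the curve's endpoints. The function names, the simulated data, and the knee heuristic are assumptions for illustration, not what snATAK or Cell Ranger does.

```python
import numpy as np

def knee_curve(barcode_counts):
    """Return (ranks, counts) with nonzero counts sorted in decreasing order."""
    counts = np.asarray(barcode_counts)
    counts = np.sort(counts[counts > 0])[::-1]
    ranks = np.arange(1, counts.size + 1)
    return ranks, counts

def crude_knee_rank(ranks, counts):
    """Hypothetical heuristic: the point farthest above the chord joining
    the two ends of the curve in normalized log-log space.
    Assumes at least two distinct count values."""
    x = np.log10(ranks)
    y = np.log10(counts)
    xn = (x - x.min()) / (x.max() - x.min())
    yn = (y - y.min()) / (y.max() - y.min())
    # The chord runs from (0, 1) to (1, 0); distance above it grows with xn + yn.
    return int(ranks[np.argmax(xn + yn)])

# Simulated data (assumption): 500 real cells plus 5000 background barcodes.
rng = np.random.default_rng(1)
cells = rng.poisson(5000, size=500)
background = rng.poisson(10, size=5000)
ranks, counts = knee_curve(np.concatenate([cells, background]))
knee = crude_knee_rank(ranks, counts)
print(f"knee at barcode rank ~{knee}")
```

A second population of barcodes with intermediate counts (for example doublets) would show up as an additional shoulder in the same curve.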
Multiome also suffers from fewer reads per peak. Of course, for these results the datasets were subsampled to the same depth. 12/
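Why subsampling matters: a deeper dataset will show more reads per peak simply because it has more reads. A minimal Python sketch of depth matching follows, using binomial thinning of a count matrix as a stand-in. This is an assumption for illustration; subsampling could equally be done at the read level before quantification, and the function name and simulated matrices are hypothetical.

```python
import numpy as np

def thin_to_depth(counts, target_total, rng):
    """Binomially thin an integer count matrix to roughly target_total counts.

    Each original count is kept independently with probability
    target_total / counts.sum(), so the result matches the target in expectation.
    """
    counts = np.asarray(counts)
    total = counts.sum()
    if target_total >= total:
        return counts.copy()             # already at or below the target depth
    return rng.binomial(counts, target_total / total)

# Simulated data (assumption): a deep and a shallow cells x peaks matrix.
rng = np.random.default_rng(2)
deep = rng.poisson(8.0, size=(300, 50))
shallow = rng.poisson(3.0, size=(300, 50))

target = min(deep.sum(), shallow.sum())
deep_sub = thin_to_depth(deep, target, rng)
shallow_sub = thin_to_depth(shallow, target, rng)
print(deep_sub.sum(), shallow_sub.sum())
```

Only after a step like this are per-peak read counts comparable between assays.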
Back to the previous data, we performed comparisons of different technologies. There is a lot to unpack in the figure below. One technology has more doublets. But it also is much more efficient (at nuclei assayed / reads sequenced). Revealed thanks to uniform preprocessing. 13/
One of the useful features of snATAK is that it can perform allele-specific analysis. We used it to quantify the association between allele specificity in open chromatin and allele specificity in expression. That's what you see here (w/ 10x Multiome PBMCs). 14/
In this plot each point is a cell type / SNP combination. The Alt / Ref on the x-axis is based on analysis of whether, in a cell type, the chromatin at a SNP was open only on the Ref allele or only on the Alt allele. The y-axis is the corresponding Ref vs. Alt usage for gene expression. Makes sense. 15/
For this analysis the registration between RNA & ATAC is useful. We are sure that the same cells contribute both to the RNA and ATAC. However, while the result for cell types is convincing, we learn nothing about individual cells. The data is too sparse; a multiome tradeoff. 16/
In other words, here Multiome has produced a non-constructive existence proof. It's like asking for two numbers x and y such that x^y is rational, but x and y are both irrational. This is a seemingly hard problem. But... 17/
... we know that (√2^√2)^√2 = 2. Since √2 is irrational, if √2^√2 is rational we have an example. Otherwise one irrational number is √2^√2, and the other is √2, and we have an example. Existence proved. Not constructive. 18/
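For anyone who wants the exponent step spelled out, it uses only the rule (a^b)^c = a^{bc}:

```latex
\[
\left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}}
  = \sqrt{2}^{\,\sqrt{2}\cdot\sqrt{2}}
  = \sqrt{2}^{\,2}
  = 2 .
\]
```

So either x = y = √2 works, or x = √2^√2 and y = √2 works, and the argument never has to say which.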
The code for reproducing the results described above, and for running snATAK, is here: github.com/pachterlab/BGP… 19/
There is much more to the multimodal tradeoff than is covered in our preprint: there are of course many other modalities to consider. But w/ snATAK (which can work whenever genome alignment is needed) & kallisto bustools we have shown that uniform preprocessing is possible. 20/20
So this plagiarism thing has happened to our lab.. again. This time it's plagiarism of our poseidon syringe pump paper @booeshaghi et al., 2019 in @SciReports:
Text has been plagiarized and figures copied directly. See here: 1/🧵 nature.com/articles/s4159… ijirset.com/upload/2024/ma…
Here is figure 1 from our paper (LHS) and figure 1 in the plagiarized paper (RHS) published in the "International Journal of Innovative Research" 2/ ijirset.com/upload/2024/ma…
The text seems to have been rewritten with an LLM. Our introduction (LHS) vs. the plagiarized version (RHS): 3/
I've checked this paper out, as instructed. I was also interested in the main result for personal reasons: I'm 51 years old. Is it true that I've just gone through a major change? And that another one awaits me in just a few years?
The main result about major changes in the mid 40s and 60s is shown in this plot (Fig. 4a). First, I redrew it with axes that start at 0, so the scale of change here was clearer. Not as impressive, but maybe it's a thing? 2/
The authors say that this finding is even corroborated in another study (ref 14). But that's not true. I looked it up, and it shows something totally different (see RHS Fig 3c from ref 14). No change in mid 40s, but a change in the mid 30s, and the real change in the 80s 😕 3/
I recently posted on @bound_to_love's work quantifying long-read RNA-seq. In response, a scientist acting in bad faith (Rob Patro @nomad421) trashed our work. This kind of mold in science's bathroom is extremely damaging so here's a bit of bleach. 1/🧵
At issue are benchmarking results we performed comparing our tool, lr-kallisto, to other programs including Patro's Oarfish. Shortly after we posted our preprint Patro started subtweeting our work, claiming we'd run an "appallingly wrong benchmark" and that we're "bullies". 2/
This was followed, within days, by Patro posting a hastily written preprint disguised as research work on benchmarking, but really just misusing @biorxivpreprint to broadcast the lie that our work "... may be repeatable, but it appears neither replicable nor reproducible." 3/
This recently published figure by @Sarah_E_Ancheta et al. is very disturbing and should lead to some deep introspection in the single-cell genomics community (I doubt it will).
It demonstrates complete disagreement among 5 widely used "RNA velocity" methods 1/
This is of course no surprise. In "RNA velocity unraveled" by @GorinGennady et al. in @PLOSCompBiol we wrote a 55-page paper explaining the many ways in which RNA velocity makes no sense. 2/ journals.plos.org/ploscompbiol/a…
We're not the only ones to understand how flawed RNA velocity is. The paper from the groups of @KasperDHansen and @loyalgoff is titled "pumping the brakes on RNA velocity". The whole notion of putting arrows on UMAPs is ridiculous. 3/ genomebiology.biomedcentral.com/articles/10.11…
Challenge accepted. Here are a few comments on the paper after starting to wade through its massive content. The paper in question is 1/🧵 nature.com/articles/s4158…
First, the claim that "lower OPC fraction across regions and, in particular, in non-neocortex regions was significantly associated with impaired cognition (Supplementary Fig. 37d)" is not true. Supp. Fig. 37d is below. I've boxed in red the panel the claim is based on. 2/
The R^2 value, i.e. the proportion of variance explained, is 0.0256. The "significance" claim is based on the reported p-value of 0.0071, which is less than 0.05. However, significance vanishes once one corrects for the number of tests performed. 3/
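A quick way to see how fragile p = 0.0071 is under multiple testing: the Python sketch below applies a Bonferroni correction, which is my choice of correction for illustration (the thread does not name one). The test counts tried are hypothetical; the actual number of tests in the figure is not stated here.

```python
import math

p_value = 0.0071      # reported p-value
r_squared = 0.0256    # reported R^2
alpha = 0.05

def bonferroni_significant(p, n_tests, alpha=0.05):
    """True if p survives a Bonferroni correction for n_tests tests."""
    return p < alpha / n_tests

# Largest number of tests for which p = 0.0071 still passes at alpha = 0.05.
max_tests = math.floor(alpha / p_value)

# Correlation magnitude implied by the reported R^2.
abs_r = math.sqrt(r_squared)

print(f"|r| = {abs_r:.2f}, survives correction for at most {max_tests} tests")
for n_tests in (1, 7, 8, 20):  # hypothetical test counts
    print(n_tests, bonferroni_significant(p_value, n_tests, alpha))
```

Under this correction the result stops being significant as soon as eight or more tests are counted.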