Lior Pachter

@lpachter

Apr 5, 2024 • 25 tweets • 10 min read • Read on X

The choice of whether to use Seurat or Scanpy for single-cell RNA-seq analysis typically comes down to a preference for R vs. Python. But do they produce the same results? In a preprint w/ @Josephmrich et al. we take a close look. The results are 👀 1/🧵 biorxiv.org/content/10.110…

We looked at a standard processing / analysis summarized in the figure below. The sources of variability we explored are in red. The plots and metrics we assessed are in blue. We examined the standard benchmark 10x PBMC datasets, but results can be obtained for other data. 2/

Before getting into results it's important to note that Seurat has never been published, and many of the details of Scanpy are missing in its original paper. @Josephmrich read the code & traced every function and every parameter. E.g., this is how Clustering / UMAPs are made: 3/

There's a lot of talk about kNN graphs, but Seurat uses an SNN graph for clustering, whereas Scanpy uses (a different) SNN graph for both UMAP & clustering. The way kNN graphs are made also differs (and can depend on the number of cells being processed). More on this later. 4/
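To make the kNN vs. SNN distinction concrete, here is a minimal sketch in plain Python. It assumes Euclidean distance, a brute-force neighbor search, and Jaccard overlap of neighbor sets as the SNN edge weight. The function names and toy points are hypothetical, and this is not how either package builds its graphs internally; those details are in the preprint.

```python
import math

def knn_sets(points, k):
    """For each point, return the set of indices of its k nearest points (itself included)."""
    neighbors = []
    for p in points:
        # Sort all points by distance to p; break distance ties by index.
        order = sorted(range(len(points)), key=lambda j: (math.dist(p, points[j]), j))
        neighbors.append(set(order[:k]))
    return neighbors

def snn_weights(neighbors):
    """Jaccard overlap of neighbor sets for every pair of points sharing at least one neighbor."""
    weights = {}
    n = len(neighbors)
    for i in range(n):
        for j in range(i + 1, n):
            shared = len(neighbors[i] & neighbors[j])
            if shared:
                weights[(i, j)] = shared / len(neighbors[i] | neighbors[j])
    return weights

# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
nbrs = knn_sets(points, k=3)
snn = snn_weights(nbrs)
```

In this toy case every within-group pair gets weight 1.0 and no edge crosses the two groups. Changing `k`, the distance, or the weighting changes the graph, which is the kind of choice that can differ between tools.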

So let's jump to what people fixate on most: clusters & UMAPs. Starting with the same data and running Seurat and Scanpy with their defaults one gets the results below. Making this plot was non-trivial, as it required a matching algorithm to get the clusters / colors aligned. 5/
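For a sense of what aligning cluster labels involves, here is a small sketch that pairs clusters by shared cells. It uses a simple greedy rule on the overlap counts, which is an assumption made for illustration and not necessarily the matching algorithm used for the figure. The function name and toy labels are hypothetical.

```python
from collections import Counter

def match_clusters(labels_a, labels_b):
    """Greedily pair each cluster in A with the unused cluster in B it shares the most cells with."""
    overlap = Counter(zip(labels_a, labels_b))  # (cluster in A, cluster in B) -> shared cells
    mapping, used_b = {}, set()
    for (a, b), _count in overlap.most_common():
        if a not in mapping and b not in used_b:
            mapping[a] = b
            used_b.add(b)
    return mapping

# Toy labels for the same eight cells from two hypothetical pipelines.
labels_one = [0, 0, 0, 1, 1, 2, 2, 2]
labels_two = [2, 2, 2, 0, 0, 1, 1, 0]
mapping = match_clusters(labels_one, labels_two)
```

A greedy pairing can be suboptimal when overlaps are ambiguous; an optimal assignment method is the more careful choice for real data.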

Seurat clusters are more jumbled than the Scanpy ones (look at cluster 3 Seurat vs. 5 Scanpy). This is because Seurat uses different graphs for clustering & UMAP, whereas Scanpy uses the same. Based on this I've learned to tell whether a UMAP was made by Seurat or Scanpy 🙃. 6/

I now joke that if you want to make your PI happy, show them a Scanpy UMAP because the data will look cleaner. But of course this isn't funny. There can be completely different conclusions drawn from the two UMAPs both qualitatively and quantitatively. 7/

A basic question is can Seurat and Scanpy be made the same, i.e. leaving aside the question about which is more correct, can parameters be set to get the programs to agree? @josephmrich did a detailed analysis of this. The answer is partly yes but overall no. 8/

Some functions agree with default params. Some can be made to be the same by matching arguments. In some cases (e.g. SNN / UMAP) it's impossible to get them to agree within the current implementations. Guides for how to make Seurat match Scanpy, or vice versa, are in the Supp. 9/

To understand the contribution of differences in each step to the overall divergence of the methods, we examined the output of each step with the exact same input. This is all in the supplement. This was important because the end result (markers) is *very* different. 10/

tl;dr there is a ton of detail that really matters. Differences started to be observed with PCA. They can be resolved (in the case of PCA), but it required really digging into the code to figure out how. Without fixing these differences, the PCAs don't match. 11/

Key differences start to emerge with how Seurat and Scanpy select highly variable genes (HVGs). Seurat’s default HVG algorithm is “vst” (equivalent to Scanpy’s “seurat_v3” flavor), while Scanpy’s default HVG algorithm is “seurat” (equivalent to Seurat’s “mean.var.plot”). 👀 12/

It matters what the algorithms are, and they're totally different. Before asking which to use, it's useful to know what they are. Details are in the preprint. E.g. vst/seurat_v3 fits a loess model to the variance and mean. Mean.var.plot/seurat bins genes based on mean. 13/
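As a rough illustration of the binning style of HVG selection, the sketch below groups genes by mean expression and z-scores a dispersion statistic (variance / mean) within each bin. It is a simplification under stated assumptions: equal-sized bins, no log transforms, no cutoffs, and hypothetical function and gene names. It does not reproduce either package's exact procedure.

```python
from statistics import mean, pstdev, pvariance

def binned_dispersion_zscores(gene_counts, n_bins=2):
    """gene_counts: dict mapping gene name -> list of expression values across cells."""
    stats = {}
    for gene, values in gene_counts.items():
        m = mean(values)
        stats[gene] = (m, pvariance(values) / m)  # (mean, dispersion); assumes mean > 0
    ranked = sorted(stats, key=lambda g: stats[g][0])  # genes ordered by mean expression
    bin_size = -(-len(ranked) // n_bins)  # ceiling division
    zscores = {}
    for start in range(0, len(ranked), bin_size):
        genes = ranked[start:start + bin_size]
        disp = [stats[g][1] for g in genes]
        mu, sd = mean(disp), pstdev(disp)
        for g in genes:
            # Standardize dispersion relative to genes with similar mean expression.
            zscores[g] = (stats[g][1] - mu) / sd if sd > 0 else 0.0
    return zscores

# Toy data: two lowly expressed and two highly expressed genes.
toy = {
    "low_flat": [1, 1, 1, 1],
    "low_var": [0, 2, 0, 2],
    "high_flat": [10, 10, 10, 10],
    "high_var": [5, 15, 5, 15],
}
z = binned_dispersion_zscores(toy, n_bins=2)
```

Here the variable gene in each bin scores above the flat gene in the same bin, regardless of how highly the bin is expressed. A regression-based approach would instead model the mean-variance trend across all genes at once.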

Versions also matter. A lot. Seuratv5 has changed how log-fold change is computed from Seuratv4. The differences in results are massive. This change was done to fix an error pointed out in preprints by @jeffreypullin & @davisjmc, and separately by @LambdaMoses from our group. 14/

But the new fix is still problematic. @josephmrich again looked at the implementation, and there is now a dependence in the pseudocount on cluster size, which is weird. We explain this, in detail, in the preprint. 15/
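A toy example of how a pseudocount can end up depending on cluster size. The two formulas are illustrative assumptions (pseudocount added to the group mean vs. pseudocount added to the group sum before dividing by the number of cells), not the packages' exact code; the function names are hypothetical.

```python
import math

def log2_mean_plus_pseudocount(values, pseudocount=1.0):
    """Pseudocount added after averaging: its effect is the same for any group size."""
    return math.log2(sum(values) / len(values) + pseudocount)

def log2_sum_plus_pseudocount(values, pseudocount=1.0):
    """Pseudocount added to the sum, then divided by group size: its effect shrinks as 1/N."""
    return math.log2((sum(values) + pseudocount) / len(values))

# Two groups with identical per-cell expression but very different sizes.
small = [2.0] * 10
large = [2.0] * 1000

same_small = log2_mean_plus_pseudocount(small)
same_large = log2_mean_plus_pseudocount(large)
sized_small = log2_sum_plus_pseudocount(small)
sized_large = log2_sum_plus_pseudocount(large)
```

With the first form both groups give the same value; with the second, the small group gets a visibly larger value than the large one even though every cell has the same expression. A fold change computed between clusters of different sizes inherits that asymmetry.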

There are too many other differences between Seurat and Scanpy to summarize here. I'll mention a seemingly minor one with major implications. They handle ties differently when computing adjusted p-values. This results in major differences in reported p-values. 16/
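One place where ties can change a p-value is the normal approximation to a rank-sum test, where the variance can be computed with or without a tie correction. The sketch below shows that effect on toy data. It is an assumption that this is a relevant mechanism here, the function names are hypothetical, and no multiple-testing adjustment or continuity correction is applied.

```python
import math
from collections import Counter

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_pvalue(x, y, tie_correct):
    """Two-sided p-value from the normal approximation to the rank-sum (Mann-Whitney U) test."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    var = n1 * n2 * (n + 1) / 12
    if tie_correct:
        # Shrink the variance according to the sizes of the tied groups.
        tie_term = sum(t**3 - t for t in Counter(pooled).values())
        var *= 1 - tie_term / (n**3 - n)
    z = (u - n1 * n2 / 2) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))

# Toy expression values with many ties, as is typical for sparse counts.
x = [0, 0, 0, 1, 2]
y = [0, 1, 1, 2, 3]
p_uncorrected = rank_sum_pvalue(x, y, tie_correct=False)
p_corrected = rank_sum_pvalue(x, y, tie_correct=True)
```

The corrected p-value comes out smaller because the tie correction reduces the variance. Single-cell count data are full of zeros, so ties are the norm rather than the exception.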

Versioning is a major issue not just with Seurat & Scanpy. We also looked at Cell Ranger, which has changed its default for how it counts reads to produce the gene-count matrix. The change has major implications. I recommend sitting down before looking at the plots. 17/

Now some might say "ok, but I don't care... we still found our biological result either way". That may be true, but then perhaps one should sequence less, or assay fewer cells. We asked how low one could go and still have results whose differences are smaller than those between Seurat and Scanpy. 18/

The answers are below, broken down by procedure. If you don't care about the differences between Seurat and Scanpy, you might as well sequence 5% of the reads, or sacrifice a lot fewer mice and assay 80% fewer cells. 19/

This is a key point. Nihilism in terms of software used and an addiction to not understanding (h/t Amos Tanay) is not just poor scholarship, it also leads to wasted (graduate student and postdoc) time, @NIH money, and lives of animals. The #scRNAseq field can do better. 20/

Thanks to @satijalab and @fabian_theis for making their Seurat and Scanpy packages open source. This work could not have been undertaken without that transparency. Our analyses are also open source and reproducible; the code is available at 21/ github.com/pachterlab/RME…

This work began from initial investigations into the differences between log-fold-change calculations between Seurat and Scanpy that I looked at with Nicolas Bray, and which we wrote about in the Supplement here: 22/ biorxiv.org/content/10.110…

@LambdaMoses also started to investigate differences in PCA, which was continued by the @pmelsted group. On the advice of @GaalBernadett we decided to go more in-depth and write a separate paper. @Josephmrich took on the task, and the manuscript is his work. 23/

Aside from the comparisons between Seurat and Scanpy, their different versions, examination of Cell Ranger version differences etc., @Josephmrich's detailed description of Seurat and Scanpy's methods and associated parameters should be useful documentation for others. 24/

Finally, this work was truly a lab effort. Kayla Jackson, @NeuroLuebbert, @sinabooeshaghi, and @DelaneyKSull all had numerous and useful insights after slowly developing worries about Seurat vs. Scanpy over the years. I'll conclude with #methodsmatter 25/25
