I like the reproducibility standards for machine learning in the life sciences by @autobencoder, @michaelhoffman, @markowetzlab, @suinleelab, @GreeneScientist & @stephaniehicks, but I propose an additional platinum standard for one-click reproducibility. 1/
By "one click", I mean that the entire analysis be reproducible in a (free) interactive online session of @colab (or other similar service). All steps of the analysis, from downloading data to generating figures, are then not only automated but also accessible to users. 2/
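To make "all steps, from downloading data to generating figures" concrete, here is a minimal sketch of what a single "run all" entry point in a notebook might look like. It is an illustration only, not code from the notebooks linked in this thread: the function names, the `group`/`value` column layout, and the hand-written SVG bar chart are all assumptions, chosen so the example needs nothing beyond the Python standard library.

```python
import csv
import urllib.request
from pathlib import Path
from xml.sax.saxutils import escape


def download(url, dest):
    """Step 1: fetch the raw data, skipping the download if it is already cached."""
    dest = Path(dest)
    if not dest.exists():
        with urllib.request.urlopen(url) as response:
            dest.write_bytes(response.read())
    return dest


def analyze(csv_path):
    """Step 2: compute the mean of `value` per `group` (hypothetical columns)."""
    totals = {}
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            total, count = totals.get(row["group"], (0.0, 0))
            totals[row["group"]] = (total + float(row["value"]), count + 1)
    return {group: total / count for group, (total, count) in totals.items()}


def write_bar_svg(means, out_path):
    """Step 3: write a bare-bones bar chart as SVG (assumes non-negative means)."""
    top = max(means.values(), default=0.0)
    scale = 100.0 / top if top > 0 else 0.0
    bars = []
    for i, (group, mean) in enumerate(sorted(means.items())):
        height = mean * scale
        bars.append(
            f'<rect x="{10 + 50 * i}" y="{110 - height:.1f}" '
            f'width="40" height="{height:.1f}"><title>{escape(group)}</title></rect>'
        )
    width = 20 + 50 * len(means)
    svg = (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="120">'
        + "".join(bars)
        + "</svg>"
    )
    out_path = Path(out_path)
    out_path.write_text(svg)
    return out_path


def run_all(url, workdir):
    """The 'one click': every step from raw data to figure, in order."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    data = download(url, workdir / "data.csv")
    means = analyze(data)
    figure = write_bar_svg(means, workdir / "figure.svg")
    return means, figure
```

In a hosted notebook, a reader would only need to run the cell that calls `run_all` with the dataset's URL; nothing has to be installed or configured first, which is the point of the "one click" bar.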
For an example of what this entails and facilitates, see: pachterlab.github.io/CWGFLHGCCHAP_2… 3/
In some cases programs may be too resource-intensive to run directly on a "light cloud" service such as @GoogleColab, but the output from those steps can then be loaded into @GoogleColab or an equivalent, making immediate exploration of the results possible for users. 4/
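One way to load the output of a heavy step into a notebook is to download the precomputed file and check it against a published checksum before using it. The sketch below assumes the authors publish a SHA-256 digest alongside the file; the function names, and the idea of pinning a digest at all, are illustrative assumptions rather than anything described in the thread.

```python
import hashlib
import urllib.request
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_precomputed(url, dest, expected_sha256):
    """Download a precomputed result once and verify it before use.

    `expected_sha256` is assumed to be published by the authors next to the
    file; a mismatch means the file is not the one the analysis was run on.
    """
    dest = Path(dest)
    if not dest.exists():
        with urllib.request.urlopen(url) as response:
            dest.write_bytes(response.read())
    actual = sha256_of(dest)
    if actual != expected_sha256:
        dest.unlink()  # do not leave a bad file in the cache
        raise ValueError(f"checksum mismatch for {dest.name}: got {actual}")
    return dest
```

The notebook then continues from the verified file, so users explore the same intermediate results the authors produced without rerunning the expensive step.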
The difference between "one command" and "one click" is substantial. While the former is a very high (& excellent) bar for reproducibility, it leaves the barrier of actually getting everything to run on suitable hardware. We've found that lowering that barrier is empowering. 5/
We started, as a lab, to learn how to move from gold standard to what I am calling platinum with nature.com/articles/s4158… by @JaseGehring et al. Getting this right has been challenging and we're still learning, but it's been worthwhile, we think, for others and ourselves. 6/
For other recent examples from our lab see 7/7


