My blog passed 3 million views today from more than 1.8 million visitors. There have been a total of 119 posts in just over 10 years.
I'm one of those visitors. The blog is an idea repository and I go back sometimes for recall. Some highlights 1/🧵 liorpachter.wordpress.com
Just today I revisited the PCA post to recall some of the properties of the transform. A student, Nick Markarian, taught me the Borel-Kolmogorov paradox today (topic for a future post) and the post was helpful in thinking about some things. 2/ liorpachter.wordpress.com/2014/05/26/wha…
There are other posts I don't use regularly but perhaps some will find them useful. This post on Mahalanobis distance and heatmaps is very relevant for single-cell RNA-seq: 4/liorpachter.wordpress.com/2014/01/
One of my personal favorites is this post on "number deconvolution", which breaks down the math incoherence in the "network deconvolution" (ND) method from the Kellis lab via a simple illustration. 6/ liorpachter.wordpress.com/2014/02/18/num…
Despite my three blog posts explaining in detail why the ND method is fundamentally flawed, incoherent, and poor in practice, it's still used. Then again, those who use it obtain poor results. Most recently, the authors of this preprint lament "the poor performance of ND". 🙃 7/biorxiv.org/content/10.110…
There are some posts that were really fun to write but didn't seem to be read much. That's fine, and they are still fun for me to go back to once in a while. This is a good read, I think, if you're designing genomics experiments... 10/liorpachter.wordpress.com/2015/02/23/col…
Perhaps my most impactful post was "Pachter's p-value prize". I learned more from the comments on this post than I have from many classes I've taken. Perhaps it's time for another prize post.
11/liorpachter.wordpress.com/2015/05/26/pac…
A lot of my blog posts are about looking at things in computational biology "the right way". I wrote a post a few years ago about the professor who taught me the importance of this. 13/liorpachter.wordpress.com/2016/05/09/the…
Over the years a recurring theme has been the interactions between mathematics and biology, a reflection on Gian-Carlo Rota's quip that "The lack of real contact between mathematics and biology is either a tragedy, a scandal or a challenge, it is hard to decide which." 14/
Speaking of time, a single blog post can take me as long as writing a paper. I think it's time well spent. I've learned a ton writing these posts, in the same way that I've learned a ton from teaching. Sometimes I write a blog post just to learn, e.g. 16/liorpachter.wordpress.com/2013/09/18/uni…
I started the blog with the goal of writing technical posts about computational biology. I've largely stuck to that, but as the years went on I ventured into other issues which I think are important for the field. 17/liorpachter.wordpress.com/2018/05/18/jam…
I've also written some controversial posts. This one, on how Salmon copied kallisto without proper attribution is one example:
Looking back at it years later I don't understand why it was controversial. 18/liorpachter.wordpress.com/2017/08/02/how…
I've been asked if I regret any of my posts. I do wish some had been written differently and/or focused on different issues. An early post on 23andMe missed the mark. But I've kept the bad work online for the record. I strive to improve. 19/liorpachter.wordpress.com/2013/11/30/23a…
My blog posts are not DOIed, but I've rarely changed them after they are posted (with the exception of correcting minor typos, or fixing errors people point out). In retrospect I wish they were DOIed, because they are cited. 82 citations on Scholar: 20/scholar.google.com/scholar?hl=en&…
I've been asked on occasion whether there were any posts that were difficult to write. This one was by far the hardest:
I still don't know the secret to writing. Sometimes I have writer's block. Sometimes I can squeeze out the words. 21/liorpachter.wordpress.com/2020/06/10/bla…
One of the challenges in running the blog has been moderating the comments. Early on I decided to adopt a very lenient policy and basically let anyone who wants to comment write on the blog without censorship. The "official" policy is here:
22/liorpachter.wordpress.com/about/
But overall I've received overwhelming support and positivity and I thank the numerous readers who have enriched my scientific life via the blog. Thank you 🙏.
I may post a second update in another decade. 24/24
• • •
This year I had the privilege of enjoying in-person conferences again, and in April I met @dvir_a & Dan Gorbonos, from whom I learned a bunch of interesting science. Here we are having burgers at Hans im Glück in Bonn.
And now, a 🧵about genocide.. 1/
The topic came up at dinner. History presents a heavy burden for Jews in Bonn.. even 78 years after WWII. The "Hans in luck" restaurant we were dining at is just a few meters from where the local synagogue was burned down during "Kristallnacht" in 1938. 2/
Although decades have passed since the Holocaust, in Bonn the events felt closer in time. We were attending the Bonn Conference on Mathematical Life Sciences, which held a moment of silence for Holocaust Remembrance Day while we were there. 3/
The virial theorem is a 150-year-old tool in (astro)physics. First described by Rudolf Clausius in 1870 in connection with studies of heat transfer, it gained prominence after it was used by Fritz Zwicky in 1933 to posit the existence of dark matter. 2/
The virial theorem is elementary calculus. For objects w/ mass m_1,..,m_n at positions z_1,..,z_n, velocities v_1,..,v_n, & acted on by forces F_1,..,F_n, the virial "theorem" is the identity shown below. S = \sum_i p_i z_i (p_i is momentum), U is potential energy; T, kinetic. 3/
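The figure with the identity is not reproduced here, so what follows is a sketch of the standard derivation using only the definitions above, not necessarily the exact form shown in the original. It assumes p_i = m_i v_i, v_i = dz_i/dt, F_i = dp_i/dt (Newton's second law) and T = \frac{1}{2}\sum_i m_i v_i^2. The last line adds a further assumption: the forces come from a potential U that is homogeneous of degree k in the positions (k = -1 for gravity).

```latex
\begin{align*}
\frac{dS}{dt}
  &= \sum_i \frac{dp_i}{dt}\, z_i + \sum_i p_i \frac{dz_i}{dt}
     && \text{(product rule)} \\
  &= \sum_i F_i z_i + \sum_i m_i v_i^2
     && \text{(using } F_i = dp_i/dt,\ p_i = m_i v_i\text{)} \\
  &= \sum_i F_i z_i + 2T \\
  &= 2T - kU
     && \text{(if } F_i = -\partial U/\partial z_i,\ U \text{ homogeneous of degree } k\text{)}
\end{align*}
```

For gravity (k = -1) this reads dS/dt = 2T + U, so if the time average of dS/dt vanishes, 2⟨T⟩ + ⟨U⟩ = 0.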
The speech begins with the claim that the public has been losing faith in universities because universities are pushing political agendas instead of catering to excellence. But he provides no evidence of a link.
He says that in 2016 70% headed for college vs 62% now. But he provides no evidence that it's because of "American universities abandoning focus on excellence... to pursue... DEI".
We have molecule counts X_{ig} where i ranges over cells & g over genes, and we consider two groups of cells G_1 and G_2 containing n_1 and n_2 cells respectively. Let's start with Seurat which calculates LFC according to the formula below. But what is Y_{ig}? 3/
The "performance" in this analysis boils down to checking consistency of the kNN graph after transformation. That's certainly a property one can optimize for, but it's by no means the only one. In fact, if it was the only property of interest, one could just not transform. 2/
Of course that is trivial and uninteresting. The purpose of normalization is to remove technical noise and stabilize variance. But then one should check how well that is done. And as it turns out, log(y/s+1) actually removes too much "noise". 3/
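To make the transform concrete, here is a minimal Python sketch of log(y/s+1) applied per cell, plus a helper for checking how much variance a gene retains afterwards. The function names, the list-of-lists data layout, and the use of population variance are assumptions for illustration, not taken from the analysis being discussed.

```python
import math


def shifted_log(counts, size_factor):
    """Apply log(y/s + 1) to every count y of one cell with size factor s."""
    if size_factor <= 0:
        raise ValueError("size factor must be positive")
    return [math.log1p(y / size_factor) for y in counts]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def gene_variance_after_transform(cells, size_factors, gene):
    """Variance of one gene across cells after the shifted-log transform.

    cells: list of per-cell count lists (hypothetical layout)
    size_factors: one size factor per cell
    gene: index of the gene within each cell's count list
    """
    transformed = [shifted_log(c, s)[gene] for c, s in zip(cells, size_factors)]
    return variance(transformed)
```

Comparing this per-gene variance before and after the transform is one simple way to see how much variation the normalization removes.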
In a recent preprint with @GorinGennady (biorxiv.org/content/10.110…) we provide a quantitative answer to this question, namely what information about variance (among cells in a cell type, or more generally many cell types) does a UMAP provide? A short🧵1/
The variability in gene expression across cells can be attributed to biological stochasticity and technical noise. In practice it's hard to break down the variance into these constituent parts. How do we know what is biological vs. technical? 2/
Here's an idea: within a cell type, we can obtain an accurate estimate of gene expression by averaging across cells. Now we can get a lower bound for biological variability by computing the variance across very distinct cell types. 3/
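The idea above can be sketched in a few lines of Python: average a gene within each cell type, then take the variance of those averages across types. This is only an illustration under stated assumptions: the dictionary layout and function names are hypothetical, the variance is the population variance, and cell types are weighted equally regardless of how many cells they contain.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def variance(values):
    """Population variance of a list of numbers."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def between_type_variance(expression_by_type):
    """Variance across cell types of the per-type mean expression of one gene.

    expression_by_type: dict mapping a cell-type label to the list of that
    gene's expression values across the cells of that type (hypothetical layout).
    """
    type_means = [mean(values) for values in expression_by_type.values()]
    return variance(type_means)
```

For example, `between_type_variance({"a": [1, 3], "b": [5, 7]})` averages to 2 and 6 within the two types and returns their variance, 4.0.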