In 2006 I went on a year-long sabbatical to @UniofOxford from @UCBerkeley. My grants were just ending and I thought I'd reset by doing some math after several years of genome consortia (I didn't have a biology mentor to tell me R01s can be renewed, so I didn't know & didn't try).
At @UniofOxford I was hosted by Philip Maini in Maths and @JotunHein in Stats. It was a fun year in which I met @satijalab, who was a student at the time. We ended up writing a paper on phylogenetics, alignment and annotation: academic.oup.com/bioinformatics…
With phylogenetics on my mind I invited one of my students at the time, Dan Levy (now a prof. @CSHL), to join me to work on a theoretical project related to NeighborNet. This ended up in a paper submitted in 2007 (published in 2011... math takes a while!) sciencedirect.com/science/articl…
This paper has all of its figures in black and white, and it really diminished its quality. Below you can see what one of the main figures looks like in the journal paper (in black and white, left) vs. in the arXiv (in color, right):
Why was the paper in B&W? I had no grant money. An NSF grant had paid part of my salary when the project started, but it too had run out. Dan Levy was paid from a @BBSRC grant. And @ElsevierConnect wanted an arm and a leg for color. I just didn't have the $$.
Even without color the article was not free, and it was hard to find the money for it. And what did @ElsevierConnect do with the $? They introduced errors in my work. For example, Jotun Hein's name was spelled correctly in the arXiv version & the submission, yet it is misspelled in the journal. 🤦🏻‍♂️
In biology, well-funded PIs don't blink at outrageous publication charges that run into the thousands even without open access. It's a small tax to pay from large grants, and a "high-profile" paper pays off in future grant dividends.
In the case of "consortia", the publication charges are a drop in the bucket from budgets that run into the tens or hundreds of millions of dollars, and the juicy "packages" they pay for are a win-win: multiple papers for authors and multiple citations for journals.
As @rsidd120 points out, it's not only scientists who earn less than $11,500 in a month that are hurt. The costs for open access publishing have become outlandish and arguments that amount to little more than haggling over the price miss the forest for the trees.
The point of telling my black and white paper story is that costs are real, and they lead to difficult choices for many. I've been privileged and lucky to have been well-funded my entire career, and my experience with the NeighborNet paper taught me to plan ahead more carefully.
But planning ahead with grants is impossible nowadays, with funding rates frequently dipping into the single digits. So yeah, let's burn this publication system to the ground and follow the lead of our math colleagues (@wtgowers et al.) blog.scholasticahq.com/post/introduci…
The first database I curated by hand was for my Ph.D. thesis. It consisted of 117 orthologous human and mouse genes (this was in the late 90s before either genome was sequenced!). It's still up: cb.csail.mit.edu/cb/crossspecie…
Compiling this database was hard. It required combing through GENBANK, performing alignments to check for orthology, examining proteins for homology etc. The database was generated for benchmarking a gene prediction tool, but I found that the curation had much more value than that.
The process of compiling the database taught me a ton about the state of gene sequences in GENBANK, challenges in sequence alignment, functional annotation etc. I learned a lot making this database. Also others found it useful in derivative work: korflab.ucdavis.edu/~genis/documen….
A friend (who does not work in science) asked me today whether it is true that "protein folding has been solved". My short answer:
The AlphaFold method produced very impressive results on CASP14. Protein folding is not a solved problem.
The AlphaFold results are impressive not just because they are (on average) much better than other methods, but because the improvement is so great in just the last 2 years that it suggests much more is still possible.
Also, the AlphaFold results are just markedly different from what a lot of other methods are producing. This is not an incremental improvement.
There has been discussion over the past week about what the new @Apple M1 chip means for bioinformatics. Some have predicted the end of compbio on @Apple. Others are more optimistic.
We got a Mac Mini & @pmelsted easily compiled kallisto bustools #scRNAseq on it. Results below:
Several points: 1. Compilation of code on the M1 ARM architecture was easy for kallisto and bustools because they have few dependencies. In fact we did it before for the ARM Rock64 which is why this time there was no problem with the M1.
2. @Apple has done a great job with Rosetta 2. M1 emulating x86 is still faster than previous Macs. And the extra cores are great for running kallisto. macrumors.com/2020/11/15/m1-…
In @NobelPrize news, the 2013 chemistry laureate links to a thread that says NIAID is "reminding people of their importance" right now because of a "vested interest" in maintaining high levels of @NIH funding, funding which they do not deserve.
From the outset of the #covid19 pandemic, it's been clear that risk of death increases sharply with age. But why? The intuitive hypothesis is that ACE2 expr. increases w/ age, but early in April, @sinabooeshaghi and I showed the opposite is true in mice. biorxiv.org/content/10.110…
Now, in a paper from the labs of @tuuliel and Christenson, @silvakasela et al. have performed a careful analysis in human, and they find the same.
BTW we saw the same patterns for ACE2 expression with sex in mice, namely males had *lower* levels of ACE2, and @silvakasela et al. find the same in humans despite the risk of death being much *higher* for males.