Our paper, "An evidence review of face masks against COVID-19", written by a cross-disciplinary team of 19 international experts, was published in the Proceedings of the National Academy of Sciences today.
The paper, which includes 141 references (yes, I read every one of them!), argues that we should increase focus on a previously overlooked aspect of mask usage: mask wearing by infectious people ("source control"), rather than only mask wearing by susceptible people ("PPE").
Masks have been used to help control respiratory pandemics for at least 600 years. Wu Lien-Teh (the "Plague Fighter") showed the world the importance of masks nearly 100 years ago, doing detailed studies over many years.
Sadly, his work became largely forgotten in the West.
Unfortunately, it's impossible to study the impact of masks as source control using the gold standard: a randomized controlled trial. That's because you can't directly observe whether a mask wearer goes on to infect others. So we developed a new framework to study this topic.
There are a number of strong multivariate population-level studies that are highly suggestive of the impact of mask wearing, particularly that of @ChrisLefflerMD et al: ajtmh.org/content/journa…
We were lucky enough to have one of the world's top aerosol scientists, Prof Vladimir Zdimal, on our team, who helped explain how masks can block infectious particles, and the impact of aerosols
Personally, the studies I found most compelling are those that physically demonstrated that masks block the ejection of respiratory particles.
We were lucky enough to have @zeynep and @HeleneMarivdW on the team, who explained the sociological considerations around mask wearing, including looking at risk compensation behavior
We wrote the first version of this paper back in April, and it became the most viewed paper of all time, on any topic, on preprints.org.
One key section we've added since that time is "Further Research" - there's a lot we still don't (but need to!) know.
I'm glad @levelsio checked this, but sad our contribution has been erased by later big tech companies. Alec Radford said ULMFiT inspired GPT. ULMFiT's first demo predated BERT.
Today's 3-stage LLM approach (general-corpus pretraining followed by two stages of fine-tuning) was pioneered by ULMFiT.
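For anyone who wants to see what those three stages look like concretely, here's a minimal sketch using fastai's ULMFiT API (the CSV file and column names are placeholders, and exact calls can vary a bit across fastai versions):

```python
import pandas as pd
from fastai.text.all import *

# Placeholder dataset: a DataFrame with 'text' and 'label' columns.
df = pd.read_csv('reviews.csv')

# Stage 1: start from a language model pretrained on a general corpus
# (fastai ships AWD_LSTM weights pretrained on WikiText-103).
dls_lm = TextDataLoaders.from_df(df, text_col='text', is_lm=True)
lm_learn = language_model_learner(dls_lm, AWD_LSTM, metrics=Perplexity())

# Stage 2: fine-tune that language model on the target-domain text,
# then keep its encoder.
lm_learn.fine_tune(3)
lm_learn.save_encoder('ft_encoder')

# Stage 3: fine-tune a classifier that reuses the adapted encoder.
dls_clas = TextDataLoaders.from_df(df, text_col='text', label_col='label',
                                   text_vocab=dls_lm.vocab)
clas_learn = text_classifier_learner(dls_clas, AWD_LSTM, metrics=accuracy)
clas_learn.load_encoder('ft_encoder')
clas_learn.fine_tune(3)
```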
There have been many other important contributions, including attention (Bahdanau et al), transformers, RLHF, etc.
But before all this, basically everyone in NLP assumed that each new domain needed a new model. ULMFiT showed that a large pretrained model was actually the key.
I got pushback from pretty much everyone about this. My claim that fine-tuning that pretrained model was the critical step to achieving success in NLP was not something people were ready to hear at that time.
I gave many talks trying to convince academics to pursue this direction.
Announcing fasttransform: a Python lib that makes data transformations reversible/extensible. No more writing inverse functions to see what your model sees. Debug pipelines by actually looking at your data.
We took the `Transform` class out of fastcore, replaced the custom type dispatch system with @ikwess's plum-dispatch, mixed it all together, and voila: fasttransform! :D
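If you haven't used `Transform` before: the core idea is that each transform pairs an `encodes` method with a `decodes` method, so a pipeline can be run backwards to inspect its inputs. A minimal sketch, assuming fasttransform keeps fastcore's `encodes`/`decodes` interface (the import path and class below are illustrative):

```python
from fasttransform import Transform  # assumed import path

class Normalize(Transform):
    "Scale a value into [0, 1], and back again."
    def __init__(self, lo, hi): self.lo, self.hi = lo, hi
    def encodes(self, x): return (x - self.lo) / (self.hi - self.lo)
    def decodes(self, x): return x * (self.hi - self.lo) + self.lo

norm = Normalize(0, 255)
y = norm(128)        # forward pass: what the model sees (~0.502)
x = norm.decode(y)   # reverse pass: back to the original scale (128.0)
```

Because every step knows how to undo itself, you can decode the output of a whole pipeline and literally look at the data your model is being fed.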
Wow, actual grown men are still doing the "I asked the LLM about itself and it said" thing.
In 2025.
Folks, LLMs don't know anything about how they themselves are built or deployed, unless they've been explicitly programmed with that information (which they almost never are).
I've recently been surprised to discover that a few of my friends are choosing to use nicotine to help them with focus, even though they are not ex-smokers.
I decided to look into it, and it turns out that there are documented health benefits of nicotine for some people. 🧵
I specifically looked into nicotine for ADHD, since, at least among children, ADHD and giftedness go hand in hand statistically (which presumably applies in adulthood too), and because focus was mentioned as an area where nicotine can be helpful.
There is a great overview below. But "Very surprisingly, there are… no further… studies.
Research into active ingredients… is expensive.
In addition, nicotine has a very poor image… which impairs its marketability" adxs.org/en/page/192/ni…
We trained 2 new models. Like BERT, but modern. ModernBERT.
Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.
It's much faster, more accurate, handles longer context, and is more useful. 🧵
ModernBERT is available as a slot-in replacement for any BERT-like model, with both 139M param and 395M param sizes.
It has an 8192-token sequence length, is extremely efficient, is uniquely great at analyzing code, and much more. Read this for details: huggingface.co/blog/modernbert
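In practice that means it should drop straight into the usual transformers workflow. A quick sketch of a fill-mask call (assuming a transformers version recent enough to include ModernBERT support, and using the base checkpoint's Hub ID):

```python
from transformers import pipeline

# Load the base-size ModernBERT checkpoint from the Hugging Face Hub.
fill_mask = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

text = f"Paris is the {fill_mask.tokenizer.mask_token} of France."
for pred in fill_mask(text)[:3]:
    print(pred["token_str"], round(pred["score"], 3))
```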
Seven months ago, @bclavie kicked things off, and soon @benjamin_warner & @antoine_chaffin joined him as project co-leads. I don't think anyone quite knew what we were getting into…
It turns out that training a new, SoTA model from scratch is actually pretty hard. Who knew? 🤷