What do the Washington Post, Brookings, The Atlantic, and Business Insider have in common?
They all employ credulous writers who don't read about the things they write about.
The issue? Attacks on laptop-based notetaking🧵
Each of these outlets (among many others, unfortunately) reported on a 2014 study by Mueller and Oppenheimer, which claimed that laptop-based note-taking was inferior to longhand note-taking for remembering content.
The evidence for this should not have been considered convincing.
In the first study, a sample of 67 students was randomized to watch and take notes on different TED talks, then assessed with factual and open-ended questions. The result? Worse open-ended performance:
The laptop-based note-takers didn't do worse on the factual content, but they did do worse on the open-ended questions.
The degree to which they did worse should have been the first red flag: d = 0.34, p = 0.046.
The other red flag should have been that there was no significant interaction between note-taking medium and question type (factual vs. conceptual; p ≈ 0.25). Strangely, that went unnoted, but I will return to it.
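The check at issue — whether the laptop deficit on conceptual questions is itself significantly larger than the (null) deficit on factual questions — can be sketched as a z-test on the difference between two independent effects. The effect sizes and standard errors below are hypothetical, chosen only to illustrate the logic:

```python
import math

def interaction_z(d1, se1, d2, se2):
    """z-statistic for whether two independent effects differ."""
    return (d1 - d2) / math.sqrt(se1**2 + se2**2)

# Hypothetical inputs: a nominally significant conceptual effect
# (d = 0.34) and a null factual effect (d = 0.10), with similar SEs.
z = interaction_z(0.34, 0.17, 0.10, 0.17)
p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))  # two-sided
print(f"z = {z:.2f}, p = {p:.2f}")
```

One effect clearing p < .05 while the other doesn't is not evidence that the two effects differ; with these made-up numbers the interaction p lands around 0.3 despite the "significant" conceptual effect.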
The authors sought to explain why there was no difference in factual knowledge about the TED talks while there was one in the ability to answer open-ended, more subjective questions about them.
Simple: Laptops encouraged verbatim, not creative note-taking.
Before going on to study 2: Do note that all of these bars lack 95% CIs. They show standard errors, so approximately double them in your head if you're trying to figure out which differences are significant.
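The "double it in your head" heuristic works because a 95% normal interval is the estimate ± 1.96 standard errors, and 1.96 ≈ 2. A minimal sketch with made-up numbers:

```python
mean, se = 0.50, 0.12  # hypothetical bar height and its standard error

se_bar = (mean - se, mean + se)              # what an SE error bar shows
ci95 = (mean - 1.96 * se, mean + 1.96 * se)  # what inference needs

print(f"±1 SE bar: {se_bar[0]:.2f} to {se_bar[1]:.2f}")
print(f"95% CI:    {ci95[0]:.2f} to {ci95[1]:.2f}")
```

Note that two SE bars can fail to overlap while the difference between them is still nonsignificant, so eyeballing SE bars overstates how many differences are real.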
OK, so the second study added an intervention.
The intervention asked people using laptops to try to not take notes verbatim. This intervention totally failed with a stunningly high p-value as a result:
In terms of performance, there was once again nothing to see for factual recall. But the authors decided to interpret a significant difference on the open-ended questions between the laptop-nonintervention participants and the longhand participants as meaningful.
But it wasn't, and the authors should have known it! Throughout this paper, they repeatedly bring up interaction tests, and they knew the intervention's interaction did nothing, so they shouldn't have interpreted that pairwise difference. They should have affirmed there was no significant difference!
The fact that the authors knew to test for interactions but ignored what those tests showed was put on brilliant display in study 3, where they ran a different intervention: people were asked either to study or not to study their notes before being tested at a follow-up.
Visual results:
This section is like someone took a shotgun to the paper and the buckshot was p-values in the dubious, marginal range: a main effect at p = 0.047, a study interaction at p = 0.021, and so on.
It's just a mess, and there's no way it should be believed. Too hacked!
And yet, this got plenty of reporting.
So the idea is out there, it's widely reported on. Lots of people start saying you should take notes by hand, not with a laptop.
But the replications start rolling in and it turns out something is wrong.
In a replication of Mueller and Oppenheimer's first study with a sample that was about twice as large, Urry et al. failed to replicate the key performance-related results.
Verbatim note copying and longer notes with laptops? Both confirmed. The rest? No.
So then Urry et al. did a meta-analysis. This was very interesting, because apparently they found that Mueller and Oppenheimer had used incorrect CIs and their results were actually nonsignificant for both types of performance.
Oh and the rest of the lit was too:
Meta-analytically, using a laptop definitely led to higher word counts in notes and more verbatim note-taking, but the performance results just weren't there.
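For readers unfamiliar with how such pooling works, a fixed-effect (inverse-variance) meta-analysis can be sketched in a few lines. The per-study effects and standard errors below are hypothetical, not Urry et al.'s actual data:

```python
import math

def fixed_effect_meta(effects, ses):
    """Inverse-variance-weighted pooled effect with a 95% CI."""
    weights = [1 / se**2 for se in ses]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    return pooled, (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)

# Hypothetical per-study effects (Cohen's d) and standard errors:
pooled, ci = fixed_effect_meta([0.34, -0.11, 0.05], [0.17, 0.14, 0.20])
print(f"pooled d = {pooled:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```

A pooled CI that crosses zero, as with these toy inputs, is exactly the "performance results just weren't there" situation.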
The closest the meta-analysis comes to showing performance going up is that conceptual performance may have risen a tiny bit (nonsignificant, to be clear), but who even knows if that assessment is fair.
That's important, since the grading of essays and open-ended questions is frequently biased.
So, ditch the laptop to take notes by hand?
I wouldn't say to do that just yet.
But definitely ditch the journalists who don't tell you how dubious the studies they're reporting on actually are.
I simulated 100,000 people to show how often people are "thrice-exceptional": Smart, stable, and exceptionally hard-working.
I've highlighted these people in red in this chart:
If you reorient the chart to a bird's eye view, it looks like this:
In short, there are not many people who are thrice-exceptional, in the sense of being at least +2 standard deviations in conscientiousness, emotional stability (i.e., inverse neuroticism), and intelligence.
To replicate this, use 42 as the seed and assume linearity and normality.
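A minimal version of that simulation, using the stated seed and the simplifying assumption that the three traits are independent standard normals (the thread doesn't specify trait correlations; modest positive correlations would raise the count somewhat):

```python
import random

random.seed(42)  # the seed given above

N = 100_000
THRESHOLD = 2.0  # "exceptional" = at least +2 SD on a trait

# Assumption: independent standard-normal traits. Real traits are
# somewhat correlated, which would make thrice-exceptional people
# slightly more common than this sketch implies.
count = 0
for _ in range(N):
    intelligence = random.gauss(0, 1)
    stability = random.gauss(0, 1)       # inverse neuroticism
    conscientiousness = random.gauss(0, 1)
    if min(intelligence, stability, conscientiousness) >= THRESHOLD:
        count += 1

print(f"Thrice-exceptional: {count} of {N:,}")
```

Under independence the expected count is N · P(Z ≥ 2)³ ≈ 100,000 × 0.02275³ ≈ 1.2 people, which matches the "not many" conclusion.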
The decline of trust is something worth caring about, and reversing it is something worth doing.
We should not have to live constantly wondering if we're being lied to or scammed. Trust should be possible again.
I don't know how we go about regaining trust and promoting trustworthiness in society.
It feels like there's an immense level of toleration of untrustworthy behavior from everyone: scams are openly funded; academics congratulate their fraudster peers; all content is now slop.
What China's doing—corruption crackdowns and arresting fraudsters—seems laudable, and I think the U.S. and other Western nations should follow suit.
Fraud leads to so many lives being lost and so much progress being halted or delayed.
British fertility abruptly fell after one important court case: the Bradlaugh-Besant trial🧵
You can see its impact very visibly on this chart:
The trial involved Annie Besant (left) and Charles Bradlaugh (right).
These two were atheists—a scandalous position at the time!—and they wanted to promote free-thinking about practically everything that upset the puritanical society of their time.
They were on trial because they tried to sell a book entitled Fruits of Philosophy.
This was an American guide to many different aspects of family planning, including birth control methods, some of which worked and some of which did not.
One of the really interesting studies on the psychiatric effects of maltreatment is Danese and Widom's, from Nature Human Behaviour a few years ago.
They found that only subjective (S) maltreatment, not objective (O) maltreatment, predicted actually having a mental disorder.
Phrased differently: if people subjectively believed they were abused, that predicted poor mental health, but objectively recorded maltreatment only predicted it when there was also a subjective report.
Some people might 'simply' be more resilient than others.
I think this finding makes sense.
Consider the level of agreement between prospective (P-R) and retrospective (R-P) reports of childhood maltreatment.
A slim majority of people recorded as having been mistreated later report, when asked to recall, that they were mistreated.
The Reich Lab article on genetic selection in Europe over the last 10,000 years is finally online, and it includes such interesting results as:
- Intelligence has increased
- People got lighter
- Mental disorders became less common
And more!
They've added some interesting simulation results that show that these changes are unlikely to have happened without directional selection, under a variety of different model assumptions.
They also showed that, despite pigmentation being oligogenic, selection on it was polygenic.
"[S]election for pigmentation had an equal impact on all variants in proportion to effect size."
I still think this is one of the most important recent papers on AI in the job market🧵
The website Freelancer added an option to generate cover letters with AI, and suddenly cover-letter quality stopped predicting applicants' odds of getting hired!
LLMs do a few things to cover letters.
First, they increase quality, as measured by how well tailored a letter is to a given job listing.
Second, they make job applications inexpensive to produce, so people spend less time on each one while shooting off more of them.
More, rapidly produced job applications become the norm.