What do the Washington Post, Brookings, The Atlantic, and Business Insider have in common?
They all employ credulous writers who don't read about the things they write about.
The issue? Attacks on laptop-based note-taking🧵
Each of these outlets (among many others, unfortunately) reported on a 2014 study by Mueller and Oppenheimer, which claimed that laptop-based note-taking was inferior to longhand note-taking for remembering content.
The evidence for this should not have been considered convincing.
In the first study, a sample of 67 students was randomized to watch and take notes on different TED talks and then they were assessed on factual or open-ended questions. The result? Worse open-ended performance:
The laptop-based note-takers didn't do worse when it came to factual content, but they did do worse when it came to the open-ended questions.
The degree to which they did worse should have been the first red flag: d = 0.34, p = 0.046.
The other red flag should have been that there was no significant interaction between the mean difference and the factual and conceptual condition (p ≈ 0.25). Strangely, that went unnoted, but I will return to it.
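To see why the missing interaction matters, here is a minimal sketch of the general point: one gap being "significant" and another being "not significant" does not mean the two gaps differ from each other. The numbers are hypothetical, not taken from the paper, and the z-test treats the two gaps as independent, which is a simplification if the same participants answered both question types.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a z statistic under a standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))

def interaction_z(gap_a, se_a, gap_b, se_b):
    """z statistic for the difference between two gaps.

    Assumes the two gaps are independent (a simplifying assumption).
    """
    return (gap_a - gap_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# Hypothetical effect sizes and standard errors, for illustration only.
conceptual_gap, conceptual_se = 0.34, 0.17  # significant on its own
factual_gap, factual_se = 0.05, 0.17        # not significant on its own

p_conceptual = two_sided_p(conceptual_gap / conceptual_se)
p_factual = two_sided_p(factual_gap / factual_se)
p_interaction = two_sided_p(
    interaction_z(conceptual_gap, conceptual_se, factual_gap, factual_se)
)

print(f"conceptual gap: p = {p_conceptual:.3f}")
print(f"factual gap:    p = {p_factual:.3f}")
print(f"interaction:    p = {p_interaction:.3f}")
```

With these made-up inputs, the conceptual gap clears p < 0.05 and the factual gap doesn't, yet the test of whether the two gaps differ is nowhere near significant.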
The authors sought to explain why there was no difference in factual knowledge about the TED talks but there was one in the ability to give open-ended, more subjective answers about them.
Simple: Laptops encouraged verbatim, not creative note-taking.
Before going on to study 2: Do note that all of these bars lack 95% CIs. They show standard errors, so approximately double them in your head if you're trying to figure out which differences are significant.
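The "double them in your head" rule is just the normal approximation for a 95% CI: mean ± 1.96 × SE. A quick sketch, with a hypothetical bar height and standard error rather than values read off the paper's figures:

```python
Z_95 = 1.96  # normal-approximation multiplier for a 95% interval

def ci95(mean, se):
    """Return the (lower, upper) 95% CI for a mean given its standard error."""
    half_width = Z_95 * se
    return (mean - half_width, mean + half_width)

# Hypothetical bar: mean of 0.20 with a standard error of 0.10.
lower, upper = ci95(0.20, 0.10)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```

Keep in mind that eyeballing overlap between two CIs is a rough, conservative check. Whether a difference is significant depends on the standard error of the difference itself.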
OK, so the second study added an intervention.
The intervention asked people using laptops to try to not take notes verbatim. This intervention totally failed with a stunningly high p-value as a result:
In terms of performance, there was once again nothing to see for factual recall. But the authors decided to interpret a significant difference between the laptop-nonintervention participants and longhand participants in the open-ended questions as being meaningful.
But it wasn't, and the authors should have known it! Throughout this paper, they repeatedly bring up interaction tests, and they know the intervention did nothing, so they shouldn't have read anything into that difference. They should have affirmed no significant difference!
The fact that the authors knew to test for interactions and didn't was put on brilliant display in study 3, where they did a different intervention in which people were asked to study or not study their notes before testing at a follow-up.
Visual results:
This section is like someone took a shotgun to the paper and the buckshot was p-values in the dubious, marginal range, like a main effect with a p-value of 0.047, a study interaction of p = 0.021, and so on.
It's just a mess and there's no way this should be believed. Too hacked!
And yet, this got plenty of reporting.
So the idea is out there, it's widely reported on. Lots of people start saying you should take notes by hand, not with a laptop.
But the replications start rolling in and it turns out something is wrong.
In a replication of Mueller and Oppenheimer's first study with a sample that was about twice as large, Urry et al. failed to replicate the key performance-related results.
Verbatim note copying and longer notes with laptops? Both confirmed. The rest? No.
So then Urry et al. did a meta-analysis. This was very interesting, because apparently they found that Mueller and Oppenheimer had used incorrect CIs and their results were actually nonsignificant for both types of performance.
Oh and the rest of the lit was too:
Meta-analytically, using a laptop definitely led to higher word counts in notes and more verbatim note-taking, but the performance results just weren't there.
The closest thing we get in the meta-analysis to performance going up is that maybe conceptual performance went up a tiny bit (nonsignificant, to be clear), but who even knows if that assessment's fair.
That's important, since essays and open-ended questions are frequently biased.
So, ditch the laptop to take notes by hand?
I wouldn't say to do that just yet.
But definitely ditch the journalists who don't tell you how dubious the studies they're reporting on actually are.
I know just one person over 100 with an actual birth certificate.
Across U.S. states, the total and per capita numbers of supercentenarians dramatically decline right after the introduction of birth certificates (blue line).
The fact that violent crime is the most socially significant kind of crime, and that it's not really driven by the economy, should change the way we see and talk about crime.
Despite strong results, it doesn't seem to have permeated the public discourse.
There was a point in time when London shut down 70% of its police stations as part of a series of austerity cuts.
That was a bad idea🧵
Background:
A 2010 report from the British government led to a 29% budget cut for London's police.
In response, the mayor figured cutting down police stations and redistributing the frontline officers across the remainder could save money while achieving similar results.
The police stations the mayor's office decided to shut down were fairly evenly distributed across London, and they respected local crime trends.
It's therefore plausible that the remaining stations could make up for the absence of the ones that were shut down.
If I want to do a study on Holocaust survivors and I go and seek out people who survived it, I am looking for a select sample.
If, instead, I look in datasets that were sampled without respect to Holocaust survival and find survivors, my sample is nonselect.
Why does this matter?
Select respondents differ from nonselect ones because they elect to be sampled or because I was able to find them by virtue of something that differentiates them from the population.
For example, my Holocaust survivors might be part of a support group.
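A toy simulation makes the difference concrete. Everything here is an assumption for illustration: a made-up trait score, a made-up rule where people with higher scores are more likely to join the kind of group I'd recruit from, and no real data.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: a standardized trait score for each person.
population = [rng.gauss(0, 1) for _ in range(100_000)]

def joins_group(score):
    """Hypothetical selection rule: higher scores make joining more likely."""
    return rng.random() < 1 / (1 + math.exp(-score))

def mean(values):
    return sum(values) / len(values)

# Select sample: people I could find because they joined the group.
select_sample = [s for s in population if joins_group(s)]

# Nonselect sample: drawn without regard to group membership.
nonselect_sample = rng.sample(population, 1_000)

print(f"select mean:    {mean(select_sample):.2f}")
print(f"nonselect mean: {mean(nonselect_sample):.2f}")
```

The select sample's average sits well above the population's, purely because of how its members were found.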
These are the Baths of Caracalla. Or at least, this is what remains of them today.
These ruins might not look impressive now, but when they were constructed they might have been one of the finest examples of Roman architecture.
But then Europe forgot how to build them🧵
To get an idea of what the Baths looked like in their heyday, look at this rendering.
This palatial compound must have been a sight to behold since the baths rivaled medieval cathedrals like Laon, Notre-Dame, and Salisbury in scale.
To put numbers on it, the bath building itself was 228 meters long, 116 meters wide, and 38.5 meters tall, with capacity for an estimated 1,600 bathers in a complex with 13 hectares of sumptuous decoration.