I have a pretty major update for one of my articles.
It has to do with Justice Jackson's comment that when Black newborns are delivered by Black doctors, they're much more likely to survive, justifying racially discriminatory admissions.
We now know she was wrong🧵
So if you don't recall, here's how Justice Jackson described the original study's findings.
She was wrong to describe it this way, because she mixed up percentage points with percentages, and she's referring to the uncontrolled rather than the fully-controlled effect.
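To make the percentage-point versus percent distinction concrete, here is a minimal sketch. The two rates are hypothetical placeholders, not figures from the study, and the function names are only illustrative.

```python
def change_in_percentage_points(base_rate: float, new_rate: float) -> float:
    """Absolute change between two rates, in percentage points."""
    return (new_rate - base_rate) * 100


def change_in_percent(base_rate: float, new_rate: float) -> float:
    """Relative change between two rates, as a percent of the base rate."""
    return (new_rate - base_rate) / base_rate * 100


# Hypothetical mortality rates, for illustration only (not from the study).
base_rate = 0.010  # 1.0% of newborns
new_rate = 0.008   # 0.8% of newborns

pp = change_in_percentage_points(base_rate, new_rate)
pct = change_in_percent(base_rate, new_rate)
print(f"{pp:.1f} percentage points, {pct:.0f}% relative change")
```

With these made-up rates, the same gap is a drop of 0.2 percentage points or a 20% relative drop. Quoting one number with the other's unit changes the claim entirely.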
After I saw her mention this, I looked into the study and found that its results all seemed to have p-values between 0.10 and 0.01.
Or in other words, the study was p-hacked.
If you look across all of the paper's models, you see that all the results are borderline significant at best, and usually just-nonsignificant, which is a sign of methodological tomfoolery and results that are likely fragile.
With all that said, I recommended ignoring the paper.
Today, a reanalysis has come out, and it doesn't tell us why the coefficients are all at best marginally significant, but instead, why they're all in the same direction.
The reason has to do with baby birthweights.
So, first thing:
(A) At very low birthweights, babies have higher mortality rates, and they're similar across baby races;
(B) At very low birthweights, babies have higher mortality rates, and they're similar across physician races.
Second thing: Black infants tend to have lower birthweights.
Mixed infants tend to have birthweights in between those of Black and White infants, and there's a mother effect, such that Black mothers have smaller mixed babies than White mothers (selection is still possible)
Third thing:
(A) Black babies with high birthweights disproportionately go to Black doctors;
(B) The Black babies sent to White doctors disproportionately have very low birthweights.
If you control for birthweight when running the original authors' models, two things happen.
For one, they fit a lot better.
For two, the apparently beneficial effect of patient-doctor racial concordance for Black babies disappears:
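Here is a rough sketch of how that can happen. It is not the reanalysis's model: it swaps regression controls for simple stratification by birthweight, and every number in it is hypothetical. Mortality is built to depend on birthweight alone, and very-low-birthweight babies are assumed to be sent to White doctors more often.

```python
# Every number here is hypothetical, picked only to illustrate the mechanism.
SHARE = {"very_low": 0.02, "normal": 0.98}           # birthweight mix of babies
P_WHITE_DOCTOR = {"very_low": 0.90, "normal": 0.70}  # who treats each group


def mortality(birthweight: str, doctor: str) -> float:
    """Death risk depends on birthweight only; the doctor's race is ignored."""
    return {"very_low": 0.25, "normal": 0.002}[birthweight]


def births(birthweight: str, doctor: str) -> float:
    """Share of all babies in a given birthweight-by-doctor cell."""
    p_white = P_WHITE_DOCTOR[birthweight]
    return SHARE[birthweight] * (p_white if doctor == "white" else 1 - p_white)


def naive_rate(doctor: str) -> float:
    """Mortality by doctor race, ignoring birthweight."""
    deaths = sum(births(bw, doctor) * mortality(bw, doctor) for bw in SHARE)
    return deaths / sum(births(bw, doctor) for bw in SHARE)


def gap_within(birthweight: str) -> float:
    """White-minus-Black doctor mortality gap inside one birthweight stratum."""
    return mortality(birthweight, "white") - mortality(birthweight, "black")


naive_gap = naive_rate("white") - naive_rate("black")
print(f"naive gap: {naive_gap:.4f}")
for bw in SHARE:
    print(f"gap within {bw}: {gap_within(bw):.4f}")
```

The naive comparison shows higher mortality under White doctors even though the doctor has no effect by construction. Once you compare within a birthweight group, the gap is zero.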
At this point, we have to ask ourselves why the original study didn't control for birthweight. One sentence in the original paper suggests the authors knew it was a potential issue, but they still failed to control for it.
PNAS also played an important role in keeping the public misinformed because they didn't mandate that the paper include its specification, so no one could see if birthweight was controlled. If we had known the full model details, surely someone would have called this out earlier.
Ultimately, we have ourselves yet another case of PNAS publishing highly popular rubbish and it taking far too long to get it corrected.
Let me preregister something else:
The original paper will continue to be cited more than the correction with the birthweight control.
The public will continue to be misled by the original, bad result. PNAS should probably retract it for the good of the public, but if I had to bet, they won't.
So people like Justice Jackson will continue to cite it to support their case for racial discrimination.
They'll continue doing that even though they're wrong.
In the U.S., immigrants commit crimes at a lower rate than natives do.
But this wasn't always true!
In the 19th century, immigrants and natives were much more similar in terms of how often they committed crimes🧵
One possibility?
Changing racial composition in "US-Born". This might happen because Blacks--who do crimes at higher rates than Whites--were a growing share of that category.
But that isn't it. Subset to Whites, same result, albeit with different timings and magnitudes:
Another possibility?
Changes in the sourcing of immigrants. Immigrants might come from places with less crime than they used to.
Alas, this is wrong. If anything, they come from places with more crime today. All sorts of adjustments don't change the main picture here.
A few days ago, Biden commuted the death sentences of almost every federal death row inmate.
Every single person whose sentence Biden commuted was verifiably evil and clearly earned the death penalty.
Let's go through all 37🧵
Shannon Agofsky drowned a bank manager alive, received life in prison, and in prison, kept talking about how he was itching to beat up other prisoners.
Then he killed a fellow prisoner by stomping his neck in and causing him to drown in his own blood.
On camera. Guilty.
Billie Allen killed a bank guard during a bank robbery, using a semi-automatic weapon.
Allen and his accomplice also stole two vans to use as getaway vehicles the night before.
He was inspired by the movies "Set It Off" and "Heat" and he was caught red-handed. Guilty.
"Without Mohammed, Charlemagne would have been inconceivable."
This quote describes Pirenne's thesis that Antiquity—the period when economic activity concentrated in the Mediterranean—ended because the rise of Islam destroyed the flow of trade across it.
The decline in trade that resulted from differences in faith had profound consequences for the economic geography of Europe.
Byzantine economic activity depended on trade, and it collapsed, whereas the Frankish economy, which was never trade-dependent, transformed.
The Byzantines' minting stalled and the Arabs' and Franks' increased (perhaps partly because they were cut off from one another!), providing each of their states with divergent trends in seigniorage revenues and a widening gulf in the ability to fund the government.
Robustness tests are supposed to show a study's results hold up no matter how you reasonably change the specification
But we live in a world with p-hacking, and people p-hack robustness tests
Compared to unshown robustness tests (blue), what we get is suspiciously significant!
This is the distribution of z-values for different tests in economics papers, coupled with the robustness tests their authors presented, and other, plausible robustness tests they didn't.
Clearly people p-hack, and they p-hack tests that are supposed to make us think they didn't
It's sad this is the case. Were it not, it might be useful to get a surprising, marginally-significant result, and then show that it holds up across different permutations of the specification
But because the robustness tests shown are selective, their potential utility is unrealized
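A toy simulation of that selection effect, under loud assumptions: each hypothetical paper runs 20 plausible robustness specifications whose z-values are drawn around a modest true effect, and the authors show only the 4 with the largest |z|. The selection rule and every parameter are invented for illustration, not taken from the economics papers above.

```python
import random


def simulate_paper(rng, n_specs=20, n_shown=4, true_z=1.0):
    """Return (shown, unshown) z-values for one hypothetical paper."""
    z_values = [rng.gauss(true_z, 1.0) for _ in range(n_specs)]
    # Assumed selection rule: authors present the most significant specs.
    ranked = sorted(z_values, key=abs, reverse=True)
    return ranked[:n_shown], ranked[n_shown:]


def share_significant(z_values, threshold=1.96):
    """Fraction of z-values past the conventional 5% two-sided cutoff."""
    return sum(abs(z) > threshold for z in z_values) / len(z_values)


rng = random.Random(0)  # fixed seed so the run is reproducible
shown, unshown = [], []
for _ in range(1000):
    s, u = simulate_paper(rng)
    shown.extend(s)
    unshown.extend(u)

print(f"shown:   {share_significant(shown):.2f} significant")
print(f"unshown: {share_significant(unshown):.2f} significant")
```

Both sets of tests come from the same underlying distribution, yet the shown ones look far more significant, purely because of which ones got picked.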
Let's talk about the glass delusion, the Middle Ages' bout with a mass psychogenic illness marked by people believing they were made of glass.
Glass was a valuable commodity in Europe. It was primarily owned by the noble and well-to-do, and it had a notable purpose in alchemy.
As the technology of the time, it was perceived as both fragile and valuable, like the nobility.
Glass was the relatively novel technology people knew, and they knew things could be transmuted into glass. Delusional people also thought transmutation could affect them.