There are people who desperately want this to be untrue🧵
One example of this came up earlier this year, when a "Professor of Public Policy and Governance" accused other people of being ignorant about SAT scores because, he alleged, high schools predicted college grades better.
The thread in question was, ironically, full of irrelevant points that seemed intended to mislead, accompanied by very obvious statistical errors.
For example, one post in it received a Community Note for conditioning on a collider.
But let's ignore the obvious things. I want to focus on this one: the idea that high schools explain more of student achievement than SAT scores do.
The evidence for this? The increase in R^2 going from a model without high school fixed effects to a model with them.
This interpretation is bad.
The R^2 of the overall model did not increase because high schools are more important determinants of student achievement. This result cannot be interpreted to mean that your zip code is more important than your gumption and effort in school.
If we open the report, we see this:
Students from elite high schools and from disadvantaged ones receive similar results when it comes to SATs predicting achievement. If high schools really explained a lot, this wouldn't be the case.
What we're seeing is a case where R^2 was misinterpreted.
The model R^2 blew up because there's a fixed effect for every high school in this national-level dataset.
That means all the little differences between high schools are controlled (a lot of variation!), so the model is overfit, which explains the high R^2.
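To make the fixed-effects point concrete, here is a minimal simulation sketch. It is not the report's data or model: the sample sizes, the 0.5 slope, and the names `sat`, `gpa`, and `school` are all assumptions made up for illustration. Students are assigned to schools at random, so schools have no real effect by construction, yet adding one dummy per school still raises the in-sample R^2.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R^2 from an OLS fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

rng = np.random.default_rng(0)
n_students, n_schools = 1000, 250  # hypothetical sizes

sat = rng.normal(size=n_students)
# School IDs are assigned at random: no true school effect exists here.
school = rng.integers(0, n_schools, size=n_students)
gpa = 0.5 * sat + rng.normal(size=n_students)

# Model 1: intercept + SAT only.
X_base = np.column_stack([np.ones(n_students), sat])

# Model 2: SAT + one dummy per school (the dummies absorb the intercept).
dummies = (school[:, None] == np.arange(n_schools)).astype(float)
X_fe = np.column_stack([sat, dummies])

r2_base = r_squared(X_base, gpa)
r2_fe = r_squared(X_fe, gpa)
print(f"R^2 without school fixed effects: {r2_base:.3f}")
print(f"R^2 with school fixed effects:    {r2_fe:.3f}")
```

The second R^2 is larger purely because the model has hundreds of extra parameters to fit noise with. This does not show that real schools have no effect. It only shows that a jump in R^2 after adding fixed effects is not, by itself, evidence that schools matter more than SAT scores.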
This professor should've known better for many reasons.
For example, we know there's more variation between classrooms than between school districts when it comes to student achievement.
It's well known that a very small portion of the total criminal population is responsible for the overwhelming majority of all crime.
A new study shows that this is also true of prison misconduct:
Just 10% of prisoners are responsible for more than 70% of misconduct in prisons!
The above numbers were for males. Here are the numbers for female prisoners.
The numbers are eerily similar.
Misconduct overrepresentation holds after adjusting for time served in prison, and being a high-misconduct prisoner is predicted by being younger, being Black, having a more extensive criminal history, being a violent criminal, being in a state facility, using drugs, and having mental disorders.
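For anyone who wants to check a concentration claim like "10% of prisoners account for 70%+ of misconduct" on their own data, the calculation is just a top-decile share. A minimal sketch follows; the function name `top_share` and the counts are invented for illustration and are not taken from the study.

```python
def top_share(counts, top_fraction=0.10):
    """Share of the total accounted for by the highest-count `top_fraction` of people."""
    ranked = sorted(counts, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))  # how many people are in the top group
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0

# Made-up misconduct counts for 20 hypothetical prisoners (NOT study data).
toy_counts = [40, 31, 6, 5, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

share = top_share(toy_counts)
print(f"Top 10% share of incidents: {share:.0%}")
```

In this toy list the top 2 of 20 people account for 71 of 100 incidents. The counts were chosen to mimic the kind of skew described above, not to reproduce the study's figures.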
I used to like this chart, but now I think it's too misleading and we should leave it behind in 2024.
🧵
The key issue is how household size is adjusted for.
In the OP image, they divide by the square root of household size. This is problematic because it means Gen Z incomes are being inflated to the extent they live with their parents.
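A quick worked example of the square-root adjustment shows the problem. The dollar amounts are hypothetical and the function name `equivalized_income` is my own; the only thing taken from the chart is the rule of dividing household income by the square root of household size.

```python
import math

def equivalized_income(household_income, household_size):
    """Household income divided by the square root of household size."""
    return household_income / math.sqrt(household_size)

# Hypothetical numbers: a young adult earning $30k, parents earning $120k combined.
own_income = 30_000
parents_income = 120_000

alone = equivalized_income(own_income, 1)
with_parents = equivalized_income(own_income + parents_income, 3)

print(f"Living alone:        ${alone:,.0f}")
print(f"Living with parents: ${with_parents:,.0f}")
```

Same person, same paycheck, but the "income" attributed to them is nearly three times higher when they live at home, because the parents' earnings are pooled into the household total before dividing.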
Generally, when I hear that the younger generations are more successful, what I think is that they're more successful in the stereotypical ways:
They've got relatively better jobs, relatively bigger homes, relatively faster cars and all that.
I was reminded of this yesterday when looking into national IQ estimates.
The "pseudo-analysis" style of critique is to just spit out tons of possible problems, to nitpick, and then to assume that means a whole enterprise is rotten without even checking if the critique holds.
The people who engage in this style of critique (example below) don't care for scientific reasoning about these topics.
They want purity by their arbitrary and inconsistent standards, not correctness, not a 'best effort' to make progress on finding answers.
So they misrepresent what people do and say; they attack strawmen; they claim people are wrong for reasons that don't actually make them wrong, but they never check; they fail to understand the basics of the things they're contesting, but they act confident; etc.
This post got 50,000 likes, yet it never even pointed out the actual issue with the calculations. It just took issue with framing, and it made clear that Kareem is too inept to find sources.
But what's new?
Kareem debunking thread below
Kareem says this is a "textbook example of how to lie with statistics."
It really isn't, but let's see what he bases this on.
The first thing he says -- his "main criticism" -- is that the data isn't provided. But for Kareem, this is completely meaningless.
We know this is meaningless, because even when all the data is presented, Kareem still doesn't do anything with it, understand it, open it, manipulate it, or anything.
He says "where's the data?" and when he gets it, he just blocks you.
A major problem with the healthcare system is that patients lie to their doctor.
Most patients will even privately admit that they lied when they were informing their doctors about their issues. Their reasons for doing this often aren't very good:
Patients want to avoid getting lectured, they don't want their doctor to call them fat or tell them their snacking habits are unhealthy. They're afraid the doctor will judge them or think they're stupid or immoral, and they don't want the doctor to tell their family.
But because people want to preserve their privacy even in the private setting of a doctor's office, they end up making doctors' jobs harder.
They make it harder to diagnose conditions and to prescribe the right drugs.