Here is why IQ is bullshit.

A thread.
A recent paper in @PsychScience relied on national IQ estimates which hold that the average IQ of some countries is below the threshold for intellectual disability.
Christian Ebbesen wrote a damning critique of the paper: psyarxiv.com/tzr8c
Nick Brown (@sTeamTraen) found that the results had been “pre”-registered after the paper was submitted.
Beyond this one study, international IQ results put researchers in an awkward position. Even if we acknowledge their samples weren’t representative of these nations, there’s still the question of why the original studies found so many children with such low IQ scores in these countries.
Maybe IQ tests don’t measure what they claim. Hints abound.
I once saw a researcher present results claiming that an indigenous group I work with, the Tsimane’, have an average IQ of around 70. But I know from leading lots of experiments with them that they approach written tests very differently, because they don't have years of testing in schools behind them.
They often aren't interested in doing our tasks, or in doing them the way we expected, and why would they be? They're forager-farmers who live a traditional lifestyle, few can read and write, and so even the idea of a test, much less an experiment, is foreign.
In our own tasks, when we find low scores, we usually discover they're due to a lack of clarity from experimenters (or translators), not evidence of inability. We address this by including controls that ensure participants understood the task. But IQ tests don't have a control condition.
Nobody who knows the Tsimane’ would think they are intellectually disabled: they have, for instance, extensive ethnobotanical knowledge that would surpass that of most US adults.
science.sciencemag.org/content/299/56…
But ethnobotanical knowledge, thinking about causal relations in natural kinds, training dogs to hunt, and surviving alone in the rain forest aren't tested by IQ tests. What "culture fair" IQ tests measure instead are visual relations, geometric shapes, patterns, etc.
But cultures differ in, for example, whether they have words for shapes, spatial relations, etc. These differences influence how people think about and use categories (@glupyan: sapir.psych.wisc.edu/papers/lupyan_…).
We've found that many Tsimane’, for instance, don't know shape labels. So why should anyone *ever* give the Tsimane’ tests that could be influenced by knowledge of shapes?
(It’s actually not clear that people without printed materials understand pictures in the same way we do. Here's Tepilit Ole Saitoti, from his autobiography, The Worlds of a Maasai Warrior.)
For a “culture free” IQ test, Cattell even suggested that people could be given mazes. I wonder if he thought about the fact that some indigenous people have never been in a hallway.
The only way to sum this up is that the term "culture fair" is just a total fabrication.
Thinking about large cultural differences between people who live far away and speak different languages (non-WEIRD-vs-WEIRD) should motivate us to think about potential cultural and experience-based differences between people who speak the same language and live nearby.
Such differences were highlighted decades ago with examples that show the cultural baggage inherent in creating any test. If you construct the right IQ test, as in "The Black Intelligence Test of Cultural Homogeneity", black children score higher than white children.
(Another example is the "Dove Counterbalance General Intelligence Test", though it's rightly condemned for racist stereotypes baltimoretimes-online.com/news/2013/oct/…)
And bias in psychology is not just in IQ testing.
Besides the unavoidable culture-ladenness of completing a written exam, there are other good reasons to doubt that IQ tests meaningfully measure ability.
One big one is that performance on them is sensitive to factors that nobody would consider "ability." For instance, if you pay people more, they will do better. The size of the effect across studies averages about 0.5 SD @angeladuckw pnas.org/content/108/19…
(Note that this meta-analysis included some apparently fraudulent data (Breuning) but the result still holds once it's removed)
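To put that effect size in familiar units: IQ is conventionally scaled to a mean of 100 and an SD of 15, so converting d to points is just multiplication. A minimal sketch (my back-of-envelope arithmetic, not a calculation from the paper):

```python
# Convert standardized effect sizes (Cohen's d) to IQ points, assuming the
# conventional IQ scaling: mean 100, SD 15.
IQ_SD = 15

for d in (0.25, 0.5, 1.0):
    print(f"d = {d:.2f} -> {d * IQ_SD:.1f} IQ points")

# d = 0.5 comes out to 7.5 points: a quarter of the 30-point gap between
# the population mean (100) and the conventional disability cutoff (70).
```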
Moreover, as a motivational account would predict, the effect of reward on IQ test performance appears sensitive to the size of the reward.
So, IQ tests in part measure how much effort you're willing to exert, not ability. What's worse, we don't know how large an effect motivation *could* have with an optimal intervention, because the field has not considered it important to figure out the dose-response curve.
Demonstrations of the effects of extrinsic motivation should make us worry about individual variation. The problem is that individuals and groups almost certainly vary in motivation: people in extreme situations, in poverty, or in other countries may not care much about doing your test.
A second example can be found in the large effects of coaching and practice, with effect sizes of 0.15-0.43 SDs (again, with no determination of how large the effect could be made).
psycnet.apa.org/record/1984-16…
Even if most people aren't explicitly coached, any effect is troubling because it makes you wonder what other activities *could* effectively function like coaching (e.g. individual variation in parental talk about academic, test-taking, or metacognitive strategies).
Overall, these findings reflect a primary failing of IQ research: it *equates* ability with performance on a standardized test. This assumption is so ingrained in the field that it's rarely questioned.
But in other areas of psychology, this error is common enough to have a name: the "fundamental attribution error", presupposing that behavior (here, test performance) is driven by internal factors ("ability") rather than external ones.
en.wikipedia.org/wiki/Fundament…
A study that illustrates this perfectly in a nearby domain is the marshmallow task. Early studies by Walter Mischel reported correlations between the ability to delay gratification and life outcomes.
But @celestekidd (et al.) realized that these studies were confounded by environmental reliability: for instance, maybe poor kids *should not* delay gratification because, based on their experiences, promised rewards are less likely to materialize.
As with motivation, a way to test this is to manipulate reliability in the lab. When you do that, you find the manipulation completely drives whether kids wait.
celestekidd.com/papers/KiddPal…
It’s not that kids *can’t* wait. It’s not “ability.” It's a rational judgment about a reward's expected value. And the effect is huge: ceiling and floor. (It was recently replicated by an independent lab: sciencedirect.com/science/articl…)
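That expected-value logic is easy to make concrete. Here's a toy sketch (my illustration, not the model from the Kidd et al. paper) of when waiting actually pays:

```python
# Toy expected-value account of the marshmallow decision.
# Eat now: one marshmallow for sure.
# Wait: two marshmallows, but only if the adult's promise is kept,
# which happens with probability p.

def should_wait(p: float, now: float = 1.0, later: float = 2.0) -> bool:
    """Waiting is rational iff its expected value beats eating now."""
    return p * later > now

for p in (0.2, 0.5, 0.8):
    print(f"p(promise kept) = {p:.1f} -> {'wait' if should_wait(p) else 'eat now'}")

# In an unreliable environment (p < 0.5 here), taking the sure marshmallow
# maximizes expected reward: "failing" to wait is the rational choice.
```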
Many arguments made about IQ were also made to support the marshmallow task: it correlates with real-world outcomes, it's stable over time, it varies reliably across cultures and demographics, etc.
These arguments all turn out to be confounded, because nobody considered how external third variables (in that case, situational reliability) might affect performance on the underlying task itself, and be correlated with those other outcomes.
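You can watch this confound structure produce a "predictive" test in a few lines of simulation. This is purely illustrative (a toy example of mine, not data or code from any cited study): an unmeasured external factor drives both test scores and later outcomes, while "ability" is held constant for everyone.

```python
# A spurious "test predicts life outcomes" correlation from a third variable.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

reliability = rng.uniform(0, 1, n)   # unmeasured environmental factor
ability = np.full(n, 100.0)          # identical for everyone, by construction

# Both test scores and life outcomes depend on the environment, plus noise.
test_score = ability + 10 * reliability + rng.normal(0, 5, n)
life_outcome = 10 * reliability + rng.normal(0, 5, n)

r = np.corrcoef(test_score, life_outcome)[0, 1]
print(f"test-outcome correlation: r = {r:.2f}")  # ~0.25 with these parameters

# The correlation is real, stable, and replicable, yet it says nothing about
# "ability", which doesn't vary at all here.
```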
That points to a big problem for you too, IQ.
The wrongheaded jump to internal causes should be a real lesson. A measure can be replicable, consistent over a lifespan, correlated with life outcomes… and still not capture what you think it does.
Note that this isn’t a debate about g or the positive manifold. It’s not a question of nature vs. nurture. It’s a question of whether the underlying data that go into these studies measure an “ability” or something else.
I’m also not claiming there is no variation in intellectual capacity between people. The claim is that you don’t know what drives variation in performance. Tests haven’t been controlled for the countless unmeasured “non-ability” factors that might influence test scores.
Motivation? Not measured. Expectations? Not measured. Test familiarity? Not measured. Interest in the task? Not measured. Attention? Not measured. Confidence? Not measured. Wandering thoughts about financial insecurity? Not measured.
Stress at home? Not measured. Hunger because you couldn't buy breakfast? Not measured. How much do you care about making the experimenter or teacher happy? Not measured. Rapport with your teachers or experimenters? Not measured.
Linguistic confidence in the test language? Not measured. Native dialect? Not measured. Belief you'll get the promised incentive? Not measured. Knowledge of test taking strategies? Not measured. Confidence you understood instructions? Not measured.
What do you know or believe about your own intelligence? Not measured. Do you like or hate puzzles or puzzle games? Not measured. What did parents say or do to ensure kids cooperate in the test or experiment? Not measured.
How easily do you understand what experimenters want? Not measured. How much do you like doing what others want? Not measured. (@CantlonLab told me about a kid in an IQ test who said "mouth" when asked to name a container for cereal. It was a good answer, marked incorrect.)
Even just the financial concerns here are difficult to control because even SES is complex, possibly idiosyncratic. Your income doesn’t tell me your number of dependents, loans, savings, debt, security net, trust funds, expected inheritances, ability to work overtime, etc.
So, another confounded correlational study won’t solve this problem. It’s *interventions* on possible “non-ability” factors that might. If you see that interventions have effects, you know that outcomes depend on external factors.
It’s also clear that heritability of “IQ” doesn’t help either because heritable factors might not be about “ability” (e.g. preferences for certain kinds of activities, sensitivity to doing what others want or expect, etc.), but maybe that’s a story for another thread.
One important point about heritability, though, is that genes affect many aspects of your life, including what experiences you have. So even if “IQ” and genes correlate, this might be due to confounds: for instance, discrimination in educational opportunity based on how you look.
Phew, almost done.
A way to summarize all this is that "ability" should mean what you do under the “optimal” circumstances for you as an individual, which will vary person to person. “Ability” is not whatever you happen to bring to some boring test someone just put in front of you.
en.wikipedia.org/wiki/L%C3%A1sz…
So, we don’t know that IQ tests measure ability. We have no real reason to think they do what they claim. I sometimes get pointed to articles that compare a few other factors, but to justify that, you would need to simultaneously control for *everything*, in *every* claim.
When I discuss this, some people imply it's presumptuous, as though I'm claiming to resolve a century-old debate. But the point is the opposite: *nobody* has figured it out.
We don't understand the functioning of a single cognitive mechanism. We don't understand C. elegans. We definitely don't understand how external and internal pressures factor into test-taking performance.
And the problem with pretending that these debates somehow shouldn’t continue because they’re old or because the science is “settled” (which it’s not) can be seen in the uncritical way reviewers and editors approached the Clark study.
I also hear "All that's been said before!" And that's right, it has. The problem is that many basic critiques haven't been refuted or handled; they've been ignored. Here's Boykin (via the paper in @duane_g_watson's thread above).
I suspect basic questions get ignored because people are excited to apply fancy new statistical tools, genetic tests, etc. Finally, another factor analysis. But if the underlying tests aren’t valid measures of “ability”, what's the point? What could the results possibly mean?
The dismal truth is that ignoring what test performance really means, while letting these tests be built into the structure of society, leads to terrible consequences. The blame goes to psychology. An example is told through the San Francisco school system: kalw.org/post/legacy-mi…
Here's a paragraph from the ruling in that case: [image of the ruling excerpt]
The cross-cultural results on IQ aren’t an error or a fluke. They’re a canary: they show us the baggage that IQ tests bring. Once we see that these measures don’t do what they claim across cultures, we have to doubt that they measure what they claim within cultures too.