I am so excited that this new paper with @samisaguy is out. We explain how humans perceive number by mathematically deriving an optimal perceptual system with bounded information processing. Four N=100 pre-registered experiments match the predictions. nature.com/articles/s4156…
People have long hypothesized that there are two systems of number perception: an exact system that lets us perfectly perceive numbers less than about 4, and an approximate one for large numbers.
This observation dates back to William Jevons, who published an experiment in Nature in 1871 in which he literally threw beans into a box and checked how accurately he could guess their number. He estimated low numbers perfectly, but could only approximate those above 4 or 5.
But the two-systems view doesn't explain what is special about low numbers, why high-number estimation takes the exact form it does, why and how multiple systems would have arisen in evolution, etc.
The idea of @samisaguy's paper is to "design" an efficient perceptual system for number. What if we got to design a representational system which minimizes error (brain-vs-world) but can only use a fixed amount of information to do so?
.@samisaguy set up an app that lets anyone design a perceptual system, formalized as a distribution over values (what’s in your brain, given what’s in the world). Any distribution can be scored by its accuracy and the information it requires. How well can you do? colala.berkeley.edu:3838/unified_model/
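To make that scoring concrete, here is a minimal sketch (my own illustration, not the paper's or the app's code) of how a candidate system, i.e. a conditional distribution over perceived numbers given the true number, can be scored by its expected error and by the mutual information between world and percept:

```python
# Sketch: score a candidate perceptual system Q[n | n_hat] by expected error
# and by the information it requires (mutual information between world and
# percept). Illustrative only; not the paper's actual implementation.
import numpy as np

def score_system(Q, prior, values):
    """Q[i, j] = P(percept = values[j] | world = values[i]); rows sum to 1.
    prior[i]  = P(world = values[i]).
    Returns (expected absolute error, mutual information in bits)."""
    joint = prior[:, None] * Q                        # P(world, percept)
    marginal = joint.sum(axis=0)                      # P(percept)
    err = np.sum(joint * np.abs(values[:, None] - values[None, :]))
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(joint > 0,
                         joint / (prior[:, None] * marginal[None, :]), 1.0)
    info = np.sum(joint * np.log2(ratio))
    return err, info

# Example: a noiseless (exact) system has zero error but needs more information
# than a blurred one, which trades some error for a smaller information cost.
values = np.arange(1, 16, dtype=float)
prior = np.ones(len(values)) / len(values)
exact = np.eye(len(values))
blurred = np.exp(-0.5 * ((values[:, None] - values[None, :]) / 2.0) ** 2)
blurred /= blurred.sum(axis=1, keepdims=True)
print(score_system(exact, prior, values))    # zero error, high info
print(score_system(blurred, prior, values))  # some error, less info
```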
Fortunately, we don't have to design these by hand -- in the paper we mathematically derive the minimum error system subject to the information processing bound. When you do that, you get a system that looks like this:
You get exactness for low numbers, approximately Gaussian estimates for high numbers with a standard deviation that increases linearly, and the under-estimation bias people are known to show. Where you switch from exact to approximate shifts with your information capacity.
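For intuition, here is a toy simulation of the qualitative pattern just described: exact responses below a capacity-dependent threshold, and roughly Gaussian responses above it whose standard deviation grows linearly in the true number. The threshold and noise slope are made-up placeholders (and the under-estimation bias is omitted), so this illustrates the shape, not the fitted model.

```python
# Toy simulation of the qualitative response pattern: exact below a threshold,
# roughly Gaussian above it with SD growing linearly in the true number.
# Threshold and noise_slope are illustrative placeholders, not fitted values.
import numpy as np

rng = np.random.default_rng(0)

def simulate_estimates(n_true, threshold=4, noise_slope=0.15, trials=10_000):
    """Draw simulated numerosity estimates for a single true number n_true."""
    if n_true <= threshold:
        return np.full(trials, float(n_true))              # exact regime
    sd = noise_slope * n_true                              # scalar variability
    est = rng.normal(loc=n_true, scale=sd, size=trials)    # approximate regime
    return np.maximum(np.rint(est), 1)                     # estimates are counts

for n in (2, 4, 8, 16):
    est = simulate_estimates(n)
    print(f"n={n:2d}  mean={est.mean():5.2f}  sd={est.std():4.2f}")
```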
We tested this by having people estimate numbers under varying time demands. Assuming that people accumulate information over time, the model predicts how their error should scale with number and display time. In these plots, left is the model, right is people.
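As a rough illustration of that logic (again, not the paper's model): if accumulating information for longer raises the capacity, and higher capacity pushes the exact/approximate switch point upward, then mean error should fall with display time and grow with number. The mapping from display time to threshold below is a hypothetical placeholder.

```python
# Toy illustration: longer displays -> more accumulated information -> higher
# exact/approximate threshold -> lower mean error. The display-time-to-threshold
# mapping is invented for illustration only.
import numpy as np

rng = np.random.default_rng(1)

def mean_abs_error(n_true, threshold, noise_slope=0.15, trials=5_000):
    if n_true <= threshold:
        return 0.0                                          # exact regime
    est = np.rint(rng.normal(n_true, noise_slope * n_true, trials))
    return np.abs(est - n_true).mean()

display_ms_to_threshold = {100: 2, 300: 3, 1000: 4}         # hypothetical mapping
for ms, thr in display_ms_to_threshold.items():
    errors = [mean_abs_error(n, thr) for n in range(1, 13)]
    print(f"{ms:4d} ms: " + " ".join(f"{e:4.1f}" for e in errors))
```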
It also makes predictions about how your average guess should scale with number. Left is model, right is people.
And finally it predicts what shape your response distribution should be -- unlike most models, this is not “built in,” but instead derived by solving the optimization problem.
To fit all of these, we essentially have one free parameter (the information capacity), plus individual subject effects. We replicate these effects in three other N=100 experiments which vary other factors.
The overall point is that the difference between small and large numbers, the shape of the response distribution, under-estimation, sensitivity to time demands, etc. can all be explained. No need to posit two systems; all you need is one efficient system with limited information capacity.
It is an amazing time to work in the cognitive science of language. Here are a few remarkable recent results, many of which highlight ways in which the critiques of LLMs (especially from generative linguistics!) have totally fallen to pieces.
One claim was that LLMs can't be right because they learn "impossible languages." This was never really justified, and now @JulieKallini and collaborators show it's probably not true:
One claim was that LLMs can't be on the right track because they "require" large data sets. Progress has been remarkable on learning with developmentally plausible data sets. Amazing comparisons spearheaded by @a_stadt and colleagues:
Yes, ChatGPT is amazing and impressive. No, @OpenAI has not come close to addressing the problem of bias. Filters appear to be bypassed by simple tricks, and to mask the bias only superficially.
Yeah, yeah, quantum mechanics and relativity are counterintuitive because we didn’t evolve to deal with stuff on those scales.
But more ordinary things like numbers, geometry, and procedures are also baffling. Here’s a little 🧵 on weird truths in math.
My favorite example – the Banach-Tarski paradox – shows how you can cut a sphere into a few pieces (well, sets) and then re-assemble the pieces into TWO IDENTICAL copies of the sphere you started with.
It sounds so implausible that people often think they've misunderstood. But it's true -- chop the sphere into a few "pieces" and reassemble them into two spheres, each *identical* (equal size, equal shape) to the one you started with.
Everyone seems to think it's absurd that large language models (or something similar) could show anything like human intelligence and meaning. But it doesn’t seem so crazy to me. Here's a dissenting 🧵 from cognitive science.
The news, to start, is that this week software engineer @cajundiscordian was placed on leave for violating Google's confidentiality policies, after publicly claiming that a language model was "sentient" nytimes.com/2022/06/12/tec…
Lemoine has clarified that his claim about the model’s sentience was based on “religious beliefs.” Still, his conversation with the model is really worth reading: cajundiscordian.medium.com/is-lamda-senti…