Everyone seems to think it's absurd that large language models (or something similar) could show anything like human intelligence and meaning. But it doesn’t seem so crazy to me. Here's a dissenting 🧵 from cognitive science.
The news, to start, is that this week software engineer @cajundiscordian was placed on leave for violating Google's confidentiality policies, after publicly claiming that a language model was "sentient."
Lemoine has clarified that his claim about the model’s sentience was based on “religious beliefs.” Still, his conversation with the model is really worth reading:
The response from the field has been pretty direct -- "Nonsense on Stilts," says @GaryMarcus.
Gary's short piece cuts to the core of the issues. The most important is over-eagerness to attribute intelligence. A classic experiment from the 1940s (Heider & Simmel's animated shapes) illustrates it: people perceive beliefs, emotions, and intentions even when shown only simple moving shapes.
But, beyond that warning, I'm not sure I agree with much. First, it's just not true that systems that only do "pattern matching" are necessarily cognitively impoverished.
In fact, we've known since the earliest days of computing that pure pattern matching (e.g. systems of rules that match substrings and rewrite them) is capable of *arbitrary* computation: string rewriting systems are Turing-complete.
So even a model that learns really well over "pattern matching" rules is potentially learning over the space of all computations.
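To make that concrete, here's a minimal sketch (the rule set and names are my own illustration, not anything from a real model): three pattern-match-and-rewrite rules suffice to implement binary increment, and systems of such rules are Turing-complete in general.

```python
def rewrite(s, rules, max_steps=10_000):
    """Apply the first matching pattern -> replacement rule until none match."""
    for _ in range(max_steps):
        for pat, rep in rules:
            if pat in s:
                s = s.replace(pat, rep, 1)  # rewrite the leftmost occurrence
                break
        else:
            return s  # no rule matches: the computation halts
    raise RuntimeError("did not halt within max_steps")

# Binary increment, written purely as string rewriting.
# '^' is a left sentinel; '#' is a carry marker appended on the right.
INC = [
    ("0#", "1"),   # carry lands on a 0: done
    ("1#", "#0"),  # carry over a 1: flip to 0, keep carrying
    ("^#", "^1"),  # carry fell off the left edge: prepend a 1
]

print(rewrite("^1011#", INC))  # '^1100'  (11 + 1 = 12)
```

Nothing here "knows" arithmetic; the computation falls out of blind pattern matching, which is the point.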

(And that, btw, is a pretty good guess for what human learners do cell.com/trends/cogniti… )
This means that a smart "pattern matching" model might, in principle, acquire any computational structure seen in the history of cognitive science and neuroscience.
In other words, what matters is NOT whether the system uses “pattern matching” or is a “spreadsheet.” What matters is what computations it can actually learn and encode. And that’s far from obvious for these language models, which carry high-dimensional state forward across time.
Many have also doubted that large language models can acquire real meaning. The view is probably clearest in this fantastic paper by @emilymbender and @alkoller
Bender and Koller use "meaning" to mean a linkage between language and something outside it (typically things in the world). Their "octopus test" shows how knowing patterns in language won't necessarily let you generalize to the world.
I guess I lean agnostic on "meaning" because there ARE cognitive theories of meaning that seem accessible to large language models--and they happen to be some of the most compelling ones.
One is that meaning is determined, at least in part, by the relationships among concepts and the roles they play in a larger conceptual theory.
To use Ned Block's example, "f=ma" in physics isn't really a definition of force, nor is it a definition of mass, or acceleration. It sorta defines all three. You can’t understand any one of them without the others.
The internal states of large language models might approximate meanings in this way. In fact, their success in semantic tasks suggests that they probably do -- and if so, what they have might be pretty similar to people (minus physical grounding).
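Here's a toy sketch of the idea (the corpus and words are invented purely for illustration): represent each word by its pattern of co-occurrence with other words, and purely relational statistics already put "force" and "mass" close together -- with no grounding anywhere.

```python
import numpy as np

# Hypothetical toy corpus -- every "document" here is made up.
corpus = [
    "force mass acceleration".split(),
    "mass acceleration force".split(),
    "energy mass light".split(),
    "banana fruit yellow".split(),
    "banana yellow fruit".split(),
]
vocab = sorted({w for doc in corpus for w in doc})
idx = {w: i for i, w in enumerate(vocab)}

# Each word's "meaning" is just its co-occurrence counts with every
# other word: a purely relational representation.
C = np.zeros((len(vocab), len(vocab)))
for doc in corpus:
    for w in doc:
        for v in doc:
            if w != v:
                C[idx[w], idx[v]] += 1

def sim(a, b):
    """Cosine similarity between two words' co-occurrence vectors."""
    x, y = C[idx[a]], C[idx[b]]
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

print(sim("force", "mass") > sim("force", "banana"))  # True
```

Real language models learn vastly richer relational structure than raw co-occurrence, but the principle is the same.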
To be sure (as in the octopus example) conceptual roles don't capture *everything* we know. But they do capture *something*. And there are even examples of abstract concepts (e.g. "prime numbers") where that something is almost everything.
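"Prime number" is a nice case to spell out: the concept is (arguably) exhausted by its role in a web of arithmetic relations. A few lines capture essentially everything there is to it, with no perceptual grounding at all:

```python
def is_prime(n: int) -> bool:
    """A prime is a number > 1 whose only divisors are 1 and itself.

    The whole concept is relational: it's defined by how n stands to
    multiplication and division, not by anything you can point at.
    """
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n**0.5) + 1))

print([p for p in range(20) if is_prime(p)])  # [2, 3, 5, 7, 11, 13, 17, 19]
```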
It’s also hard to imagine how large language models could generate language or encode semantic information without at least some pieces of conceptual role (maybe real meaning!) being there. All learned from language.
Conceptual roles are probably what allows us, ourselves, to talk about family members we've never met (or atoms or dinosaurs or multiverses). Even if we know about them just from hearing other people talk.
For the big claim... It's not a popular view, but there's a case that consciousness is not that interesting a property for a system to possess.
Some model happens to have representations of its own representations, and representations of its representations of representations (some kind of fancy fixed point combinator?)....
... and so what!

Why care if it does, why care if it doesn't.
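That "fixed point combinator" aside can be made concrete. A fixed-point combinator (here the call-by-value Z variant, sketched in Python) builds a function out of a representation of itself -- self-reference with no special machinery at all:

```python
# Z combinator: a call-by-value fixed-point combinator.
# Z(f) returns a function g satisfying g == f(g), i.e. g is
# constructed from a representation of itself.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Recursion with no named self-reference: the function "finds itself"
# only through the fixed point.
fact = Z(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))

print(fact(5))  # 120
```

If plain lambda terms can represent themselves this cheaply, it's not obvious that higher-order self-representation marks anything deep.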

Thread by steven t. piantadosi (@spiantado)

