I am sooooooo excited for this paper. We've spent years developing a super fast program induction library. We use it to learn key pieces of language structure.

So much of what Chomskyan linguists say about learnability is totally wrong.

🧵

pnas.org/content/119/5/…
We show that a program-learning model can construct grammars and other computational devices from just observing utterances (sentences). It takes just a tiny amount of data to learn key patterns and structures from natural language which have been argued to be innate/unlearnable
We also show that this kind of learning model can acquire the patterns used in artificial language learning experiments, like those of @jenny_saffran, Aslin, and Newport.
It even works when you give it strings from a simple toy grammar like you might find in an intro linguistics textbook.
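To give a flavor of the setup, here is the kind of tiny textbook-style grammar I mean, plus a few sampled sentences (a toy of my own in Python, not the actual grammars or data from the paper). The sampled strings are all the learner ever sees; the grammar itself stays hidden.

```python
# Toy illustration only: a miniature phrase-structure grammar of the intro-textbook sort.
# The learner is shown sentences sampled from it, never the grammar itself.
import random

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"], ["V", "that", "S"]],   # the second rule allows embedded clauses
    "N":  [["dog"], ["cat"], ["linguist"]],
    "V":  [["saw"], ["chased"], ["doubted"]],
}

def sample(symbol="S"):
    """Expand a symbol by picking one of its rules at random; words expand to themselves."""
    if symbol not in GRAMMAR:
        return [symbol]
    return [word for part in random.choice(GRAMMAR[symbol]) for word in sample(part)]

random.seed(1)
for _ in range(3):
    print(" ".join(sample()))
```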
This required developing a super fast C++ implementation of "language of thought" models, distributed under the GPL. It can run, for example, over 1,000,000 samples per second on Goodman et al.'s rational rules model.
github.com/piantado/Fleet
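If you haven't seen a "language of thought" model before, here's roughly the shape of the thing, as a toy sketch in plain Python (my own illustration; Fleet's actual C++ internals and API look nothing like this). Hypotheses are little formulas built by a probabilistic grammar, the grammar's generation probability acts as the prior, and Metropolis-Hastings searches for formulas that explain labeled examples.

```python
# Toy "rational rules"-style learner, for illustration only.
# Hypotheses: Boolean formulas over made-up binary features, sampled from a small grammar.
# Prior: the grammar's generation probability (so shorter formulas are favored).
# Inference: Metropolis-Hastings, proposing fresh formulas from the prior, which makes
# the acceptance ratio reduce to a likelihood ratio.
import math, random

FEATURES = ["red", "round", "large"]     # hypothetical features; nothing from the paper
P_LEAF, NOISE = 0.5, 0.1

def sample_formula(depth=0):
    if depth >= 3 or random.random() < P_LEAF:
        return ("feat", random.choice(FEATURES))
    op = random.choice(["and", "or", "not"])
    kids = 1 if op == "not" else 2
    return (op,) + tuple(sample_formula(depth + 1) for _ in range(kids))

def evaluate(f, x):
    if f[0] == "feat": return bool(x[f[1]])
    if f[0] == "not":  return not evaluate(f[1], x)
    vals = [evaluate(k, x) for k in f[1:]]
    return all(vals) if f[0] == "and" else any(vals)

def log_likelihood(f, data):
    # Each labeled example is predicted correctly with probability (1 - NOISE).
    return sum(math.log(1 - NOISE if evaluate(f, x) == y else NOISE) for x, y in data)

def learn(data, steps=20000):
    current = sample_formula()
    for _ in range(steps):
        proposal = sample_formula()                      # proposing from the prior...
        ratio = log_likelihood(proposal, data) - log_likelihood(current, data)
        if random.random() < math.exp(min(0.0, ratio)):  # ...so prior terms cancel here
            current = proposal
    return current                                       # a rough posterior sample

# Toy data generated by the concept "red AND NOT large"
data = [({"red": 1, "round": 0, "large": 0}, True),
        ({"red": 1, "round": 1, "large": 1}, False),
        ({"red": 0, "round": 1, "large": 0}, False),
        ({"red": 1, "round": 1, "large": 0}, True)]
print(learn(data))
```

The real library does the same basic kind of search, just over much richer spaces of programs and much, much faster.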
Our work was very much inspired by prior models and analyses, including notably this paper from @AndyPerfors. His work showed that a little bit of a parent's speech would be enough for a child to discover that language has hierarchical structure.
perfors.net/publication/pe…
(Hierarchical structure is the idea that sentences contain other units, even other sentences. The sentence "John doubted that I cooked lasagna" contains the sentence "I cooked lasagna." Hierarchy helps us express all kinds of complex ideas.)
The Perfors result matters because one of the key things language science does NOT know is whether rules and structures, like hierarchy, are innate in people. Perfors et al. showed that hierarchy could be learned via Bayesian model comparison of possible grammars.
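To see the flavor of that comparison, here's a cartoon in Python (my own toy corpus, grammars, and priors; nothing like Perfors et al.'s actual setup). Each hypothesis assigns probabilities to the observed sentences, and Bayes' rule trades that fit against a prior that penalizes bigger grammars.

```python
# Cartoon Bayesian comparison of a "flat" vs. a "hierarchical" hypothesis (illustration only).
import math

# Toy corpus: 'a' stands for a simple clause, '(...)' marks an embedded clause.
corpus = ["a", "a(a)", "a(a(a))", "a(a)", "a"]

def flat_loglik(s, p_char=1/3):
    # Flat hypothesis: each character drawn independently from the 3-symbol alphabet
    # (ignoring a stop symbol for brevity).
    return len(s) * math.log(p_char)

def hier_loglik(s, p_embed=0.4):
    # Hierarchical hypothesis: a clause is 'a', which embeds another clause with
    # probability p_embed; the brackets come for free because the rule supplies them.
    depth = s.count("(")
    return depth * math.log(p_embed) + math.log(1 - p_embed)

# Crude description-length prior: the hierarchical grammar is "bigger", so it pays more.
log_prior = {"flat": -3.0, "hierarchical": -6.0}   # made-up sizes, in nats

for name, loglik in [("flat", flat_loglik), ("hierarchical", hier_loglik)]:
    score = log_prior[name] + sum(loglik(s) for s in corpus)
    print(f"{name:13s} log posterior ~ {score:.1f}")
# The hierarchical grammar wins despite its bigger prior cost: it predicts the nesting
# pattern instead of paying for every single character.
```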
That work generated a lot of debate. A really interesting critique of the Perfors model was that a Bayesian model comparison between grammars builds in MORE because you have to "build in" additional grammars to compare. Here's Norbert Hornstein:
This critique trips you into wonderland: a learner who considers more options has to have more built in (i.e. the options). So the LEAST nativist theory would be building in just one grammar, universal grammar!
You might suspect something is wrong with this logic.

And you are right. The subtle mistake helped motivate our paper.
We illustrate it with Jorge Luis Borges's story, The Library of Babel. Borges imagines a library containing every *possible* book (all possible sequences of characters). His depiction is chilling and gorgeous.
maskofreason.files.wordpress.com/2011/02/the-li…
Here's a question: how much information does this library contain?

If it contains every possible book, the library must have infinite information, right?
In fact, the library doesn't contain much information at all. You can tell because I communicated its entire contents in less than a tweet: "all possible sequences of characters"
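In code, the library's entire contents amount to something like this (a few lines of Python, my own illustration):

```python
# The whole "library": a short program that enumerates every possible book,
# i.e., every finite string over some alphabet, in order of length.
from itertools import count, product

ALPHABET = "abcdefghijklmnopqrstuvwxyz ,."   # pick whatever character set you like

def every_possible_book():
    for length in count(1):
        for chars in product(ALPHABET, repeat=length):
            yield "".join(chars)

books = every_possible_book()
print([next(books) for _ in range(5)])   # 'a', 'b', 'c', 'd', 'e', ...
```

The program is tiny, and that's the point: generating everything takes almost no information. The information is in picking out one particular book.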
The analogy for learning and innateness is to replace "all possible books" with "all possible computations." Learners who pick out the right target linguistic system out of all possible computations needn't have very much "built in" at all...
... because the space of "all possible computations" is easy/concise to describe. And it's useful not just for language, but for everything humans do. We simply know that people can represent and internalize complex computational processes.
cell.com/trends/cogniti…
This theory--that smart learners should understand the world by identifying what computations generated the data they see--is very old. There is work on induction by Solomonoff in the 1960s, and more recent theoretical work by Hutter.
en.wikipedia.org/wiki/Solomonof…
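Here's the smallest cartoon of that idea I can write (a toy of my own in Python, nowhere near real Solomonoff induction, which is uncomputable). Hypotheses are tiny "programs" (just repeating patterns here), shorter ones get prior weight 2^-length, and prediction averages over every hypothesis consistent with the data so far.

```python
# Cartoon of length-weighted ("Solomonoff-style") prediction over toy hypotheses.
from itertools import product

def hypotheses(max_len=4, alphabet="01"):
    # Each hypothesis is a pattern repeated forever; its "program length" is len(pattern).
    for n in range(1, max_len + 1):
        for pattern in product(alphabet, repeat=n):
            yield "".join(pattern)

def predict_next(observed):
    weights = {}
    for pattern in hypotheses():
        generated = pattern * (len(observed) // len(pattern) + 2)
        if generated.startswith(observed):            # consistent with the data so far?
            nxt = generated[len(observed)]
            weights[nxt] = weights.get(nxt, 0.0) + 2.0 ** (-len(pattern))  # shorter = heavier
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

print(predict_next("0110"))   # ~ {'1': 0.67, '0': 0.33}: the short pattern "011" dominates
```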
These kinds of learners just do what scientists do: they look at data and try to build a logical theory to explain it. They write their theory as a computer program, and can basically learn a theory of whatever they see.
Such powerful learning sounds contrary to "poverty of the stimulus" arguments in language acquisition. Those arguments generally say that the data kids see isn't enough to tell them the rules of language. This is a nice overview by Lisa Pearl:
ling.auf.net/lingbuzz/004646
And, in fact, such powerful learning does negate many "poverty of the stimulus" arguments. In particular, @NickJChater and Vitanyi showed that learners could discover the rules of language just from observation, exactly as was claimed to be impossible.
sciencedirect.com/science/articl…
Our work is very much an implementation of Chater & Vitanyi's ideas. Implementation is what required us to build a fancy program induction library.
The results are cool: the model can not only learn key natural language patterns, but it can *construct* computational devices of differing levels of computational power, all out of a small set of assumed (programming) primitives -- just as good explanations of the data it sees.
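To make "differing levels of computational power" concrete, here's a cartoon in Python (my own framing, not the paper's actual primitive set). With just concatenation and recursion you can write programs for a finite-state-style pattern, a context-free one, and a context-sensitive one; the learner's only job is to find whichever short program matches the strings it sees.

```python
# Three string patterns at different levels of the Chomsky hierarchy, written with the
# same small toolkit (concatenation plus recursion). Illustration only.

def regular(n):                # (ab)^n : a finite-state-style pattern
    return "" if n == 0 else "ab" + regular(n - 1)

def context_free(n):           # a^n b^n : the two halves have to be coordinated
    return "" if n == 0 else "a" + context_free(n - 1) + "b"

def context_sensitive(n):      # a^n b^n c^n : three counts have to be coordinated
    return "a" * n + "b" * n + "c" * n

for gen in (regular, context_free, context_sensitive):
    print(gen.__name__, [gen(k) for k in range(1, 4)])
```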
But note that the model isn't TOLD beforehand to look for grammars or computations of a certain type. All it's told is to build something simple to explain what it sees. This is why essentially the same model works in a lot of different domains, from arithmetic to kinship systems.
These kinds of models and theories should inform linguistic theorizing. There are quite a few suspect arguments that are consistently made about language acquisition.
For example, linguistics textbooks--including this one I had in undergrad!--have claimed that learning productive rules of language is simply impossible.
This model shows that's just nonsense.
Many people have claimed that the fact that children learn language so quickly is evidence that these structures are innate. But that's not right. The model can often learn these kinds of patterns from *tens* of tokens, even out of an unrestricted space of options.
The fact that language is "infinite" (there doesn't seem to be an upper bound on sentence length in English) has also been hypothesized to be innate. But it turns out that finite languages are often HARDER to learn because they have less concise descriptions.
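A back-of-the-envelope version of that point (my own crude encoding, not the paper's actual description-length measure): write each language down as a program and count the bits. Chopping an infinite language off at some bound doesn't make it simpler; it adds the cost of stating the bound.

```python
# Crude description lengths for an infinite language vs. a finite cutoff of it.
import math

infinite_rule = 'lambda s: s == "a" * (len(s)//2) + "b" * (len(s)//2)'                   # a^n b^n
finite_rule   = 'lambda s: s == "a" * (len(s)//2) + "b" * (len(s)//2) and len(s) <= 274' # n <= 137

def bits(program):
    return len(program) * math.log2(128)    # crude: 7 bits per ASCII character

print(f"infinite language: {bits(infinite_rule):.0f} bits")
print(f"finite version:    {bits(finite_rule):.0f} bits")
```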
I often hear that these kinds of models are implausible because there are things that real kids won't learn. But there are things this model (and every model like it) won't learn either. So the fact that kids fail to learn certain things doesn't tell you anything about whether language structures are built into humans.
And, maybe most strikingly, some have claimed that the fact that kids systematically generalize about stuff they haven't seen before is evidence for innate linguistic structure. But it's not. This model does that too. We describe why.
While the examples studied here are relatively simple in terms of structure, a few weeks ago @weGotlieb and @roger_p_levy came out with a super exciting model that learns deep facts about syntax (like island constraints) without domain-specific knowledge.

