Michael C. Frank
Cognitive scientist at Stanford. Open science advocate. @stanfordsymsys director. Bluegrass picker, slow runner, dad.
Apr 19, 2023 18 tweets 6 min read
Can a large language model be used as a "cognitive model" - meaning, a scientific artifact that helps us reason about the emergence of complex behavior and abstract representations in the human mind? My answer is YES.

Why and under what conditions? 🧵

A scientific model represents part or a whole of a particular system of interest, allowing researchers to explore, probe, and explain specific behaviors of the system. plato.stanford.edu/entries/models…
Apr 10, 2023 16 tweets 6 min read
What does it mean for a large language model (LLM) to "have" a particular ability? Developmental psychologists argue about these questions all the time and have for decades. There are some ground rules. 🧵
[Diagram of developmental change, referenced later in the post]
[Diagram of abstractions linking to observations, referenced later in the post]
This thread builds on my previous thread about general principles for LLM evaluation. Here I want to talk specifically about claims about the presence of a particular ability (or relatedly, an underlying representation or abstraction).
Apr 4, 2023 16 tweets 4 min read
People are testing large language models (LLMs) on their "cognitive" abilities - theory of mind, causality, syllogistic reasoning, etc. Many (most?) of these evaluations are deeply flawed. To evaluate LLMs effectively, we need some principles from experimental psychology. 🧵

Just to be clear, in this thread I'm not saying that LLMs do or don't have *any* cognitive capacity. I'm trying to discuss a few basic ground rules for *claims* about whether they do.
Mar 27, 2023 19 tweets 6 min read
How do we compare the scale of language learning input for large language models vs. humans? I've been trying to come to grips with recent progress in AI. Let me explain these two illustrations I made to help. 🧵

Recent progress in AI is truly astonishing, though somewhat hard to interpret. I don't want to reiterate recent discussion, but @spiantado has a good take in the first part of lingbuzz.net/lingbuzz/007180; I like this thoughtful piece by @MelMitchell1 as well: pnas.org/doi/10.1073/pn…
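To make the kind of scale comparison the thread gestures at concrete, here is a minimal back-of-envelope sketch in Python. All numbers are placeholder assumptions chosen only for illustration (per-year word estimates and LLM corpus sizes vary widely); they are not the figures from the thread's illustrations.

```python
# Back-of-envelope comparison of language input scale.
# All constants below are illustrative assumptions, not figures from the thread.

WORDS_PER_YEAR_CHILD = 7_000_000       # assumed upper-end estimate of words a child hears per year
YEARS_OF_CHILDHOOD = 10                # assumed learning window
LLM_TRAINING_TOKENS = 500_000_000_000  # assumed corpus size for a large 2023-era LLM

child_input = WORDS_PER_YEAR_CHILD * YEARS_OF_CHILDHOOD
ratio = LLM_TRAINING_TOKENS / child_input

print(f"Child input by age {YEARS_OF_CHILDHOOD}: ~{child_input:,} words")
print(f"LLM training data: ~{LLM_TRAINING_TOKENS:,} tokens")
print(f"Scale gap: roughly {ratio:,.0f}x more input for the LLM")
```

Under these assumed numbers the gap is on the order of thousands of times more input for the model than for the child, which is the sort of disparity the two illustrations are meant to visualize.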
Mar 22, 2023 15 tweets 3 min read
My lab held a hackathon yesterday to explore ways that large language models could help us with our research in cognitive science. The mandate was, "How can these models help us do what we do, but better and faster?"

Some impressions: 🧵

Whatever their flaws, chat-based LLMs are astonishing. My kids and I used ChatGPT to write birthday poems for their grandma. I would have bet money against this being possible even ten years ago.

But can they be used to improve research in cognitive science and psychology?
Jan 20, 2023 13 tweets 7 min read
Do you want to do a psychology experiment while following best practices in open science? My collaborators and I have created Experimentology, a new open web textbook (to be published by MIT Press but free online forever).

experimentology.io

Some highlights! 🧵

The book is intended for advanced undergrads or grad students, and is designed around the flow of an experimental project - from planning through design, execution, and reporting, with open science concepts like reproducibility, data sharing, and preregistration woven throughout.

[Experiment sketch: population, sample, experimental design, …]
Jan 7, 2019 12 tweets 8 min read
For two years, @mbraginsky, @danyurovsky, Virginia Marchman, and I have been working on a book called "Variability and Consistency in Early Language Learning: The Wordbank Project" (@mitpress).

Here's our draft: langcog.github.io/wordbank-book/…

[+ a thread with a few of our findings]

We look at child language using a big dataset of parent reports of children's vocabulary from wordbank.stanford.edu, w/ 75k kids and 25 languages. (Data are from MacArthur-Bates CDI and variants). Surprisingly, parent report is both reliable and valid! langcog.github.io/wordbank-book/…
Sep 24, 2018 12 tweets 7 min read
What is "the open science movement"? It's a set of beliefs, research practices, results, and policies that are organized around the central roles of transparency and verifiability in scientific practice. An introductory thread. /1 The core of this movement is the idea of "nullius in verba" - take no one's word for it. The distinguishing feature of science on this account is the ability to verify claims. Science is independent of the scientist and subject to skeptical inquiry. /2

en.wikipedia.org/wiki/Mertonian….
Aug 6, 2018 7 tweets 2 min read
A thought on grad advising. When I was a second year, an announcement went out to our dept. with the abstract for a talk I was giving in the area talk series. A senior faculty member wrote back with a scathing critique (cc'd to my advisor, @LanguageMIT). /1

The part that made the biggest impression on me: they said that the first line of my abstract was *so embarrassing that they thought my graduate training had failed*! Actual quote: "You look naive at best, many other things at worst." And on from there. /2
Jul 2, 2018 19 tweets 6 min read
Prosocial development throwdown at #icis18: presentations by Audun Dahl, Felix Warneken, and @JKileyHamlin. Three opinions on a fascinating topic! [livetweet thread]

Dahl up first. Puzzles of prosociality: infants show an amazing ability to help others prosocially from an early age, but some don't! Why? Behaviors emerge via 1) social interest and 2) socialization.
Jun 22, 2018 9 tweets 5 min read
Everyone makes mistakes during data analysis. Literally everyone. The question is not what errors you make, it's what systems you put into place to prevent them from happening. Here are mine. [a thread because I'm sad to miss #SIPS2018]

A big wakeup call for me was an error I made in this paper: langcog.stanford.edu/papers/FSMJ-de…. Figure 1 is just obviously wrong in a way that I or my co-authors or the reviewers should have spotted. Yet we all missed it completely. Here's the erratum.
onlinelibrary.wiley.com/doi/abs/10.111…
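The rest of that thread isn't reproduced here, but to make the idea of "systems that prevent errors" concrete, here is a minimal Python sketch of one such system: assertion-style sanity checks that run with every analysis. The column names and plausibility ranges are hypothetical placeholders, not taken from the paper or erratum above, and this is an illustrative practice rather than the specific systems the thread lists.

```python
# Minimal sketch of one error-catching system: automated sanity checks
# that run every time the analysis runs. Column names and ranges are
# hypothetical placeholders for illustration.

import pandas as pd

def check_trial_data(df: pd.DataFrame) -> pd.DataFrame:
    """Fail loudly if the data frame violates basic expectations."""
    assert not df.empty, "Data frame is empty - did a merge drop everything?"
    assert df["subject_id"].notna().all(), "Missing subject IDs"
    assert df["rt"].between(0, 10_000).all(), "Reaction times outside plausible range (ms)"
    assert df["condition"].isin(["experimental", "control"]).all(), "Unexpected condition label"
    # One row per subject x trial; duplicates often signal a bad join
    assert not df.duplicated(subset=["subject_id", "trial"]).any(), "Duplicate trials detected"
    return df

# Usage: run the checks before any plotting or modeling
# df = check_trial_data(pd.read_csv("trial_data.csv"))
```

The design point is that checks like these catch the "obviously wrong figure" class of error automatically, before a plot ever reaches co-authors or reviewers.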