CS PhD student with @sanmikoyejo at @stai_research @StanfordAILab
May 1 • 13 tweets • 5 min read
What happens when generative models are trained on their own outputs?
Prior work foretold a catastrophic feedback loop, a "curse of recursion," in which models descend into madness as they consume their own outputs. Are we poisoning the very data necessary to train future models?
1/N
Excited to announce our newest preprint!
Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data
w/ @MGerstgrasser @ApratimDey2 @rm_rafailov @sanmikoyejo @danintheory @Andr3yGR @Diyi_Yang David Donoho
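The "accumulate vs. replace" distinction at the heart of the question can be seen in a toy simulation (my own illustrative sketch, not the paper's experimental setup): fit a Gaussian, sample from it, and refit each generation. Replacing the data each generation lets estimation noise compound; accumulating keeps the real data in the pool.

```python
import numpy as np

def simulate(n_generations=100, n_samples=50, accumulate=False, seed=0):
    """Toy collapse loop: fit a Gaussian, sample from it, and refit on the
    samples each generation. Returns the fitted variance per generation."""
    rng = np.random.default_rng(seed)
    pool = rng.normal(0.0, 1.0, size=n_samples)  # the original "real" data
    variances = []
    for _ in range(n_generations):
        mu, sigma = pool.mean(), pool.std()
        variances.append(sigma ** 2)
        synthetic = rng.normal(mu, sigma, size=n_samples)
        # replace: the next model sees only the latest synthetic samples;
        # accumulate: the real data and every past generation stay in the pool
        pool = np.concatenate([pool, synthetic]) if accumulate else synthetic
    return variances

# Replacing data lets estimation noise compound multiplicatively, so the
# fitted variance tends to drift toward zero over generations; accumulating
# anchors the estimate near the true value of 1.
print("replace:   ", round(simulate(accumulate=False)[-1], 4))
print("accumulate:", round(simulate(accumulate=True)[-1], 4))
```

The Gaussian stands in for any generative model here; the same replace-vs-accumulate contrast is what the preprint studies at scale.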
A few weeks ago, Stanford AI Alignment @SAIA_Alignment read @AnthropicAI 's "Superposition, Memorization, and Double Descent." Double descent is relatively easy to describe, but **why** does double descent occur?
1/8 transformer-circuits.pub/2023/toy-doubl…
@SAIA_Alignment @AnthropicAI Prior work answers why double descent occurs, but we wanted an intuitive explanation that doesn’t require random matrix theory (RMT) or statistical mechanics. Our new preprint identifies and interprets the **3** necessary ingredients for double descent, using ordinary linear regression!
Very excited to announce our #NeurIPS2022 paper No Free Lunch from Deep Learning in Neuroscience: A Case Study through Models of the Entorhinal-Hippocampal Circuit.
It's a story about NeuroAI, told through a story about grid & place cells.
Joint w/ @KhonaMikail @FieteGroup 1/15
The promises of deep learning-based models of the brain are that they (1) shed light on the brain’s fundamental optimization problems/solutions, and/or (2) make novel predictions. We show, using deep network models of the MEC-HPC circuit, that one may get neither! 2/15
Jul 23, 2022 • 15 tweets • 15 min read
If you’re interested in deep learning (DL) and neuroscience, come to our poster at @AI_for_Science’s #ICML2022 workshop
**No Free Lunch from Deep Learning in Neuroscience: A Case Study through Models of the Entorhinal-Hippocampal Circuit**
Joint w/ @KhonaMikail @FieteGroup 1/13
The central promise of DL-based models of the brain is that they (1) shed light on the brain’s fundamental optimization problems/solutions, and/or (2) make novel predictions. We show, using DL models of grid cells in the MEC-HPC circuit, that one often gets neither. 2/13