Dileep George
AGI research @DeepMind. Ex cofounder & CTO @vicariousai (acqd by Alphabet) and @Numenta. Triply EE (BTech IIT-Mumbai, MS&PhD Stanford). #AGIComics
Jul 7, 2023 20 tweets 7 min read
In-context learning (ICL) is a fascinating property of transformers. We seek to demystify ICL as a combination of 1️⃣ learning template circuits, 2️⃣ context-sensitive retrieval of those templates, and 3️⃣ rebinding appropriate slots in the templates. 🧵 1/N

arxiv.org/abs/2307.01201
We do this by showing that another interpretable recurrent sequence model, the clone-structured causal graph (CSCG), exhibits ICL once we introduce fast rebinding. CSCGs were used earlier to learn spatial structure from purely sequential observations. 2/ arxiv.org/abs/2212.01508
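For intuition, here is a toy Python sketch of the three ingredients, not the CSCG mechanism from the paper: a store of template sequences with slot variables, retrieval of the template consistent with the in-context examples, and rebinding of its slots to the tokens of a new query. The templates and the matching rule are invented for illustration.

```python
# Toy illustration (not the paper's model) of ICL as:
# 1) stored template circuits, 2) context-sensitive retrieval,
# 3) rebinding of template slots to new tokens.

# 1) Learned templates: abstract sequences; uppercase tokens are slots.
TEMPLATES = {
    "copy":    ["X", "->", "X"],
    "reverse": ["X", "Y", "->", "Y", "X"],
}

def matches(template, example):
    """Return the slot bindings if `example` instantiates `template`,
    else None."""
    if len(template) != len(example):
        return None
    binding = {}
    for t, e in zip(template, example):
        if t.isupper():                      # slot variable
            if binding.get(t, e) != e:
                return None                  # inconsistent rebinding
            binding[t] = e
        elif t != e:                         # literal token mismatch
            return None
    return binding

def icl_complete(context_examples, query):
    # 2) Retrieval: pick the template consistent with every
    #    in-context example.
    for tpl in TEMPLATES.values():
        if all(matches(tpl, ex) is not None for ex in context_examples):
            # 3) Rebinding: bind the template's slots to the query tokens
            #    and read out the completion.
            prefix_len = tpl.index("->")
            binding = dict(zip(tpl[:prefix_len], query))
            return [binding.get(t, t) for t in tpl[prefix_len + 1:]]
    return None

# Two in-context demonstrations of "reverse", then a novel query.
ctx = [["a", "b", "->", "b", "a"], ["c", "d", "->", "d", "c"]]
print(icl_complete(ctx, ["p", "q"]))   # -> ['q', 'p']
```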
May 23, 2023 8 tweets 3 min read
The successor representation (SR) is popular among hippocampus researchers and is often cited as a theory of place fields, cognitive maps, etc.

I’m going to describe, in very simple terms, why SR is NOT a theory of place field formation or cognitive map learning. 🧵

Let us start with this figure from the excellent paper by @gershbrain, ‘The successor representation, its computational logic’

On the right is the SR matrix M(states(t), states(t+1)). The figure shows that if you reshape a column of this matrix, you get a place field. 2/
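For concreteness, here is a minimal NumPy sketch of the construction in that figure, with illustrative grid size and discount factor: build the transition matrix T of a uniform random walk on a grid, form the SR matrix M = (I − γT)⁻¹, and reshape one of its columns over the grid to get the place-field-like bump.

```python
import numpy as np

# SR for a uniform random walk on an n x n grid:
# M = (I - gamma * T)^{-1}, where T is the one-step transition matrix.
n, gamma = 15, 0.95
n_states = n * n
T = np.zeros((n_states, n_states))
for r in range(n):
    for c in range(n):
        s = r * n + c
        nbrs = [(r + dr, c + dc)
                for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]
                if 0 <= r + dr < n and 0 <= c + dc < n]
        for rr, cc in nbrs:
            T[s, rr * n + cc] = 1.0 / len(nbrs)   # uniform over neighbors

M = np.linalg.inv(np.eye(n_states) - gamma * T)

# Column j of M is the expected discounted future occupancy of state j
# as a function of the current state -- reshape it over the grid.
center = (n // 2) * n + n // 2
place_field = M[:, center].reshape(n, n)   # bump peaked at the center
```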
Mar 23, 2023 9 tweets 4 min read
Learning planning-compatible representations that enable strong transfer is important for AI. See our work on graph schemas that quickly transfer, compose, & modify prior experience to plan in new environments, using insights about the hippocampus & PFC. arxiv.org/abs/2302.07350 🧵

We start with previous work on CSCGs, which can now learn cognitive maps of 3D rooms from egocentric visual experience without Euclidean assumptions or ground-truth locations, even in the presence of severe perceptual aliasing. 2/
arxiv.org/abs/2212.01508
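A toy sketch of the schema idea, with invented node names and layout rather than the paper's environments: the abstract action-labeled graph is learned once, a new environment only rebinds which observation each node emits, and planning is ordinary graph search over the schema.

```python
from collections import deque

# Abstract schema: action-labeled edges between latent nodes.
# The graph and all observation names here are illustrative.
schema = {
    ("A", "east"): "B", ("B", "east"): "C",
    ("C", "north"): "D", ("A", "north"): "D",
}

def plan(schema, start, goal):
    """Breadth-first search over the schema for a shortest action sequence."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        node, actions = frontier.popleft()
        if node == goal:
            return actions
        for (src, act), dst in schema.items():
            if src == node and dst not in seen:
                seen.add(dst)
                frontier.append((dst, actions + [act]))
    return None

# New environment: same schema, new observation bound to each node.
binding_env2 = {"A": "red door", "B": "fountain", "C": "statue", "D": "exit"}
obs_to_node = {v: k for k, v in binding_env2.items()}

start, goal = obs_to_node["red door"], obs_to_node["exit"]
print(plan(schema, start, goal))   # -> ['north']
```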
Dec 16, 2022 14 tweets 8 min read
Location, encoded by place cells, is important for animals, humans, … and airplanes.

Check out our new work: “Space is a latent sequence”. It will change the way you look at place cells, spatial representations, remapping, and knowledge transfer. 🧵

arxiv.org/pdf/2212.01508…

Diverse latent topologies can be learned from a severely aliased egocentric sensory+motor sequence, without Euclidean assumptions and without knowing the semantics of actions. Multiple environments can be learned in the same CSCG, and it can also stitch them together transitively. 2/
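To make "severely aliased" concrete, here is a small sketch, with illustrative room size and observation count, of the kind of egocentric sensory+motor stream the model learns from: many locations emit the same observation, and position is never given.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, n_obs = 6, 8, 4
room = rng.integers(0, n_obs, size=(H, W))  # few symbols, many locations

moves = {"N": (-1, 0), "S": (1, 0), "W": (0, -1), "E": (0, 1)}

def random_walk(steps, start=(0, 0)):
    """Egocentric stream of (action, observation) pairs -- the only
    training signal; no coordinates, no semantics of actions."""
    r, c = start
    seq = [(None, room[r, c])]
    for _ in range(steps):
        a = rng.choice(list(moves))
        dr, dc = moves[a]
        nr, nc = r + dr, c + dc
        if 0 <= nr < H and 0 <= nc < W:     # hitting a wall is a no-op
            r, c = nr, nc
        seq.append((a, room[r, c]))
    return seq

print(random_walk(10)[:5])
```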
Dec 4, 2022 13 tweets 2 min read
You know English. You don’t know Malayalam. You decide to learn it by listening to podcasts that are purely in Malayalam. What will happen?

A thread about language models, meaning, understanding, etc. Plus, you get to learn Malayalam :-) 1/N

First, by listening to Malayalam-only podcasts, you become very good at the word-to-word associations in Malayalam. You can quote long passages, mix them up convincingly, sing Malayalam songs, etc.

But do you understand Malayalam at this point? Not yet. Why? 2/
May 16, 2021 6 tweets 3 min read
A thread of threads on our recent paper on cognitive maps... check it out if you are curious about how space and time are represented in the hippocampus, and how those insights could be used for AI. nature.com/articles/s4146… 1/

The core idea, "space is a sequence", is discussed in this talk. Warning... it will change the way you think about the hippocampus :-) 2/
Apr 12, 2021 23 tweets 6 min read
Yes, a neuroscientist can understand a microprocessor...a rebuttal thread on the popular paper by @stochastician and @KordingLab

While I agree with some of the points, I worry the paper leaves a misleading impression about neuroscience experiments 1/23

journals.plos.org/ploscompbiol/a…

I agree that just creating more complex datasets and analytics is unlikely to produce the insights we need.

But in making the argument it:
1. casually dismisses a large body of existing knowledge
2. doesn't recognize that this knowledge was obtained by skillfully probing the brain 2/23
Sep 11, 2020 14 tweets 8 min read
Interested in the functional logic of cortical and thalamic microcircuits? Check out our new preprint from @vicariousai

It is an interconnected story of clonal neurons, dendritic computation, columns/laminae, CO blobs, the thalamus, and inference in a generative model. Thread 1/

We start with RCN, our previously published neuroscience-inspired generative vision model (science.sciencemag.org/content/358/63…), and triangulate the inference computations in that model against data from neuroanatomy and physiology. The mutual constraints help slot in the puzzle pieces. 2/
Jul 3, 2020 11 tweets 5 min read
Interesting additions to our 'Cognitive maps as structured graphs' preprint:
1. Biological circuit and mechanistic model
2. very close reproduction of @ChenSun71122197's event-specific representations in place cells
3. Detailed remapping experiments. Thread..
biorxiv.org/content/10.110…

Clone-structured cognitive graphs (CSCGs) are directed graphs that use state cloning to represent higher-order dependencies. The clonal states receive identical observational evidence but different contextual evidence. We learn CSCGs by formulating them as cloned HMMs.
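A minimal sketch of the cloned-HMM structure, with illustrative sizes and a random stand-in for the learned transition matrix: each observation symbol owns a disjoint block of clone states, emissions are deterministic given the clone assignment, and context reaches a clone only through the transitions, as the forward pass below makes explicit.

```python
import numpy as np

n_obs, n_clones = 4, 10                 # 4 symbols, 10 clones each
n_states = n_obs * n_clones
rng = np.random.default_rng(0)

# Random row-stochastic matrix standing in for the learned transitions.
T = rng.random((n_states, n_states))
T /= T.sum(axis=1, keepdims=True)

def clones_of(obs):
    """Indices of the hidden states (clones) that emit symbol `obs`."""
    return slice(obs * n_clones, (obs + 1) * n_clones)

def forward_loglik(obs_seq):
    """Forward pass: all clones of a symbol get identical observational
    evidence; they differ only in the context carried by transitions.
    Returns the log-likelihood of the sequence given its first symbol."""
    alpha = np.zeros(n_states)
    alpha[clones_of(obs_seq[0])] = 1.0 / n_clones
    loglik = 0.0
    for obs in obs_seq[1:]:
        alpha = alpha @ T                                # propagate context
        masked = np.zeros(n_states)
        masked[clones_of(obs)] = alpha[clones_of(obs)]   # emission mask
        norm = masked.sum()
        loglik += np.log(norm)
        alpha = masked / norm
    return loglik

print(forward_loglik([0, 2, 1, 2, 3, 0]))
```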
Dec 7, 2019 14 tweets 6 min read
Are you skeptical about successor representations? Want to know how our new model can learn cognitive maps and context-specific representations, do transitive inference, and plan flexibly and hierarchically? #tweeprint...(1) @vicariousai @swaroopgj @rvrikhye biorxiv.org/content/10.110…

As @yael_niv pointed out in her recent article, learning context-specific representations from aliased observations is a challenge. Our agent can learn the layout of a room from severely aliased random-walk sequences, with only 4 unique observations in the room!
Mar 15, 2019 16 tweets 7 min read
We study the bias-variance dilemma in class, but reading the original paper gives important historical perspective. You'll see that many of today's 'unreasonable effectiveness' and 'surprising findings' were predicted in this paper. Thread..(1) dam.brown.edu/people/geman/H…

The paper asks a very important question: given unlimited time (or "compute", as it is called these days) and training data, what sorts of tasks can be learned? Are there limits to such brute-force approaches? (2)
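For reference, the decomposition at the heart of the paper (presumably Geman, Bienenstock & Doursat, 1992), written here in standard modern notation rather than the paper's own:

```latex
% Expected squared error of an estimator \hat{f} trained on a random
% dataset D, at a fixed input x, with y = f(x) + \varepsilon,
% \mathbb{E}[\varepsilon] = 0, \operatorname{Var}(\varepsilon) = \sigma^2:
\mathbb{E}_{D,\varepsilon}\!\left[(y - \hat{f}(x; D))^2\right]
  = \underbrace{\left(f(x) - \mathbb{E}_D[\hat{f}(x; D)]\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\left(\hat{f}(x; D) - \mathbb{E}_D[\hat{f}(x; D)]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```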
Jan 17, 2019 10 tweets 6 min read
Mini thread about the cognitive science and neuroscience inspirations behind our new paper, in which we learn concepts as 'cognitive programs' on a 'visual cognitive computer'. robotics.sciencemag.org/content/4/26/e…

Cognitive programs are inspired by Barsalou's Perceptual Symbol Systems (PSS) theory, which is also relevant for discussions of symbolic-connectionist integration. @GaryMarcus citeseerx.ist.psu.edu/viewdoc/downlo…