Alexander Huth
Interested in how & what the brain computes. Assistant professor CS & Neuro @UTAustin. Married to the incredible @Libertysays. he/him
May 24, 2023 10 tweets 6 min read
Multimodal transformers make it possible to transfer fMRI encoding models between language and vision! (though mostly from L->V and not V->L 🤔) New paper from @jerryptang @_du_meng @vvobot @vasudev_lal arxiv.org/abs/2305.12248

We used the BridgeTower multimodal transformer to extract representations of movie stimuli (from each frame) and story stimuli. BT learns representations that are nicely structured for image-text matching, etc. arxiv.org/abs/2206.08657
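A minimal sketch of what this kind of cross-modal encoding-model transfer looks like, assuming BridgeTower features and BOLD responses have already been extracted and resampled to the fMRI TR. The file names, array shapes, and the RidgeCV regression below are illustrative placeholders, not the paper's actual pipeline:

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Hypothetical precomputed feature matrices (n_timepoints x n_features):
# BridgeTower representations of movie frames and of story words, each
# already aligned to the fMRI acquisition times.
movie_feats = np.load("bridgetower_movie_features.npy")   # (T_movie, D)
story_feats = np.load("bridgetower_story_features.npy")   # (T_story, D)
movie_bold  = np.load("movie_bold_responses.npy")         # (T_movie, n_voxels)
story_bold  = np.load("story_bold_responses.npy")         # (T_story, n_voxels)

# Fit a ridge encoding model on the language (story) data ...
lang_model = RidgeCV(alphas=np.logspace(0, 4, 10))
lang_model.fit(story_feats, story_bold)

# ... then evaluate it on the vision (movie) data without refitting.
pred = lang_model.predict(movie_feats)

# Voxelwise prediction performance: correlation between predicted and
# measured BOLD time courses.
def voxelwise_corr(y_true, y_pred):
    yt = (y_true - y_true.mean(0)) / y_true.std(0)
    yp = (y_pred - y_pred.mean(0)) / y_pred.std(0)
    return (yt * yp).mean(0)

transfer_corr = voxelwise_corr(movie_bold, pred)
print("median L->V transfer correlation:", np.median(transfer_corr))
```

The key point is that the voxelwise weights fit on story features are applied, unchanged, to movie features; the L->V vs. V->L asymmetry can be probed by swapping the roles of the two datasets.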
May 22, 2023 4 tweets 2 min read
The expert has logged on

First, this is nuts because there's already been a ton of great work in both vision & language neuroscience that uses recent AI developments (LLMs, CNNs, etc.), cf. the work by @c_caucheteux @JeanRemiKing @martin_schrimpf @ev_fedorenko @shaileeejain @GoldsteinYAriel and many more.
May 22, 2023 8 tweets 4 min read
We know language models (& audio LMs) are good at predicting fMRI brain responses, but how much better are big models than small ones? Here @RichardAntone13 @_avaidya show that big models (& big datasets!) make brain prediction MUCH better arxiv.org/abs/2305.11863

Scaling LM size (here within the OPT family) gives roughly log-linear improvement. Big models give a ~15% boost in performance over more typical GPT-2-scale models (+22% var. exp.). (The biggest models have so many features that fMRI model fitting seems to suffer a bit, though.)
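A rough sketch of how such a scaling comparison can be set up, assuming contextual features from several OPT checkpoints have already been aligned to the fMRI timecourse. The model names, file paths, and the ridge/correlation pipeline are assumptions for illustration, not the paper's code:

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Hypothetical per-model feature files: contextual embeddings from OPT models
# of increasing size, resampled to the fMRI acquisition times.
model_sizes = {"opt-125m": 125e6, "opt-1.3b": 1.3e9, "opt-13b": 13e9}

results = {}
for name, n_params in model_sizes.items():
    X_train = np.load(f"{name}_train_features.npy")  # (T_train, D)
    X_test  = np.load(f"{name}_test_features.npy")   # (T_test, D)
    Y_train = np.load("bold_train.npy")              # (T_train, n_voxels)
    Y_test  = np.load("bold_test.npy")               # (T_test, n_voxels)

    # Fit a voxelwise ridge encoding model and score it on held-out data.
    enc = RidgeCV(alphas=np.logspace(0, 5, 12)).fit(X_train, Y_train)
    pred = enc.predict(X_test)
    r = np.array([np.corrcoef(Y_test[:, v], pred[:, v])[0, 1]
                  for v in range(Y_test.shape[1])])
    results[name] = r.mean()

# Rough log-linear trend: mean prediction performance vs. log10(#params).
sizes = np.log10(list(model_sizes.values()))
perf = np.array([results[n] for n in model_sizes])
slope, intercept = np.polyfit(sizes, perf, 1)
print(f"performance ~ {slope:.3f} * log10(params) + {intercept:.3f}")
```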
May 1, 2023 10 tweets 4 min read
In the latest paper from my lab, @jerryptang showed that we can decode language that a person is hearing (or even just thinking) from fMRI responses. nature.com/articles/s4159…

Our decoder uses neural network language models to predict brain activity from words. So we guess words and then check how well the corresponding predictions match the brain. It seems pretty good at capturing the "gist" of things while not getting the exact words correct.
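A toy sketch of the guess-and-check idea described above: a language model proposes continuations, a fitted encoding model predicts the brain responses each candidate would produce, and candidates are kept or discarded by how well those predictions match the measured data. The helper functions `lm_propose` and `encode` are hypothetical stand-ins, and this beam search is only a schematic of the approach, not the published decoder:

```python
import numpy as np

# Hypothetical helpers (not the paper's actual code):
#   lm_propose(seq, k) -> k candidate next words for a word sequence
#   encode(seq)        -> predicted BOLD responses (t, n_voxels) for a word
#                         sequence, via a fitted encoding model

def decode(measured_bold, lm_propose, encode, beam_width=5, n_steps=20):
    # Each beam entry is (word sequence, match score).
    beam = [([], 0.0)]
    for _ in range(n_steps):
        candidates = []
        for seq, _ in beam:
            for word in lm_propose(seq, k=10):
                new_seq = seq + [word]
                pred = encode(new_seq)
                t = min(pred.shape[0], measured_bold.shape[0])
                # Score = correlation between predicted and measured BOLD
                # over the timepoints covered so far.
                score = np.corrcoef(pred[:t].ravel(),
                                    measured_bold[:t].ravel())[0, 1]
                candidates.append((new_seq, score))
        # Keep only the best-matching sequences.
        beam = sorted(candidates, key=lambda c: -c[1])[:beam_width]
    return beam[0][0]
```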
Oct 28, 2021 15 tweets 5 min read
At long last, Dr. @sara_poppop's paper on aligned visual and linguistic semantic representations is out! nature.com/articles/s4159…
I want to briefly explain the context for Sara's work, and why I think this is the most important science that I've ever been a part of ⤵️

Back in 2012 we used natural movies to show that most of "higher" visual cortex is semantically selective (sciencedirect.com/science/articl…). Similarly, in 2016 we used natural language to show that a huge fraction of association cortex is also semantically selective (ncbi.nlm.nih.gov/pmc/articles/P…).
Jul 21, 2021 5 tweets 2 min read
In our lab’s newest preprint (arxiv.org/abs/2106.05426), we used transfer learning between 100 different language representations to show that the SPACE OF REPRESENTATIONAL SPACES seems to be fundamentally low-dimensional.

To analyze language, it’s common to use representations that express different types of information. These representations are often grouped into categories, like "syntactic" or "semantic", implying there is low-dimensional structure in the space of language representations.
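One way to make the "space of representational spaces" idea concrete, under my own assumptions rather than the preprint's exact methods: fit a linear transfer map between every pair of representations, collect the held-out performance into a transfer matrix, and ask how many components are needed to explain it. Everything below (the Ridge maps, the R^2 scoring, the PCA dimensionality check) is an illustrative sketch:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.decomposition import PCA

# Hypothetical setup: `reps` is a list of (T, D_i) arrays, each a different
# representation of the same word sequence (different models, layers, tasks).

def transfer_matrix(reps, train_frac=0.8):
    """Held-out R^2 of a linear map from representation i to representation j."""
    n = len(reps)
    split = int(train_frac * reps[0].shape[0])
    M = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            reg = Ridge(alpha=1.0).fit(reps[i][:split], reps[j][:split])
            M[i, j] = reg.score(reps[i][split:], reps[j][split:])
    return M

def effective_dim(M, var_threshold=0.95):
    """How many principal components explain most of the transfer matrix?
    If the space of spaces is low-dimensional, this number should be small."""
    pca = PCA().fit(M)
    cum = np.cumsum(pca.explained_variance_ratio_)
    return int(np.searchsorted(cum, var_threshold) + 1)
```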