Building in AI + security | Stanford PhD in AI & Cambridge physics | ex-GDM and Anthropic | alignment + progress + growth | 🇺🇸🇨🇿
Feb 15 • 11 tweets • 4 min read
We discovered a surprising, training-free way to generate images: no GANs or diffusion models, but a ✨secret third thing✨! Standard models like CLIP can already create images directly, with zero training. We just needed the right key to unlock this ability: DAS.
1/11
The key insight:
Previous attempts to make CLIP generate images produced noisy adversarial patterns 🌫️.
We found a way to get interpretable generations by decomposing the optimization across multiple scales (1×1 up to 224×224), all on top of a frozen discriminative model. 2/11
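A minimal sketch of the idea for anyone who wants to poke at it: optimize learnable canvases at several resolutions against a frozen CLIP. This is illustrative only, not our released DAS code; the scale list, prompt, optimizer settings, and step count below are placeholder assumptions.

```python
import torch
import torch.nn.functional as F
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model = model.float().eval()  # frozen discriminative model; no weights are trained

# CLIP's standard input normalization constants.
MEAN = torch.tensor([0.48145466, 0.4578275, 0.40821073], device=device).view(1, 3, 1, 1)
STD = torch.tensor([0.26862954, 0.26130258, 0.27577711], device=device).view(1, 3, 1, 1)

# Hypothetical target prompt.
text = clip.tokenize(["a watercolor painting of a fox"]).to(device)
with torch.no_grad():
    text_feat = F.normalize(model.encode_text(text), dim=-1)

# The image is a sum of learnable canvases at several scales, 1x1 up to 224x224.
scales = [1, 2, 4, 8, 16, 32, 64, 112, 224]
canvases = [torch.zeros(1, 3, s, s, device=device, requires_grad=True) for s in scales]
opt = torch.optim.Adam(canvases, lr=0.05)

for step in range(300):
    # Upsample every canvas to full resolution and add them together.
    img = sum(F.interpolate(c, size=224, mode="bilinear", align_corners=False)
              for c in canvases)
    img = torch.sigmoid(img)                      # keep pixels in [0, 1]
    img_feat = F.normalize(model.encode_image((img - MEAN) / STD), dim=-1)
    loss = -(img_feat * text_feat).sum()          # maximize CLIP similarity to the prompt
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Optimizing only the coarse scales gives smooth, global structure, and the finer canvases add detail on top; that split is what keeps the result from collapsing into a noisy adversarial pattern.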
Sep 16, 2022 • 6 tweets • 4 min read
I found the Git Re-Basin paper (arxiv.org/abs/2209.04836) by @SamuelAinsworth, J. Hayase & @siddhss5 *really* intriguing. So I made a replication in Colab reusing bits of their code, but unfortunately couldn't reproduce the key conclusion 🚨😱
In neither case did the permuted Network 2 end up in the same linearly connected low-loss convex basin as Network 1 (= the key result) 🧩
2/5
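A minimal sketch of the connectivity check I ran: interpolate linearly between Network 1 and the permuted Network 2 and look for a loss barrier along the path. This is not the exact Colab code; `make_model`, `eval_loss`, and `loader` are hypothetical stand-ins for the real setup.

```python
import torch

def interpolate_state_dicts(sd_a, sd_b, alpha):
    """Element-wise linear interpolation (1 - alpha) * sd_a + alpha * sd_b."""
    return {k: (1 - alpha) * sd_a[k] + alpha * sd_b[k] for k in sd_a}

@torch.no_grad()
def loss_along_path(make_model, eval_loss, sd_a, sd_b_permuted, loader, n_points=25):
    """Evaluate the loss at evenly spaced points between the two weight vectors."""
    losses = []
    for alpha in torch.linspace(0, 1, n_points):
        model = make_model()
        model.load_state_dict(interpolate_state_dicts(sd_a, sd_b_permuted, float(alpha)))
        losses.append(eval_loss(model, loader))
    return losses

# The "same basin" claim predicts an (approximately) barrier-free curve:
# max(losses) - max(losses[0], losses[-1]) should be ~0 if the permuted Network 2
# sits in the same linearly connected low-loss region as Network 1.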
Oct 30, 2020 • 7 tweets • 6 min read
Excited to share our new #neurips2020 paper /Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel/ (arxiv.org/abs/2010.15110) with @KDziugaite, Mansheej, @SKharaghani, @roydanroy, @SuryaGanguli 1/6
We Taylor-expand deep neural network logits with respect to their weights at different stages of training & study how well the linearized network trains depending on the epoch at which it was expanded. Expansions at initialization train poorly, but expansions from even slightly into training do very well! 2/6
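A minimal sketch of the linearization step itself (not the paper's training code), assuming PyTorch's `torch.func`; `params_k` and `delta` are illustrative names. The linearized logits are f(x; θ_k) plus the Jacobian-vector product with the weight displacement θ − θ_k:

```python
import torch
from torch.func import functional_call, jvp

def linearized_logits(model, params_k, delta, x):
    """f_lin(x; theta) = f(x; theta_k) + J_theta f(x; theta_k) @ (theta - theta_k)."""
    def f(params):
        return functional_call(model, params, (x,))
    logits_k, jvp_out = jvp(f, (params_k,), (delta,))
    return logits_k + jvp_out

# params_k = {name: p.detach().clone() for name, p in model.named_parameters()}
# saved at expansion epoch k; delta starts at zeros with matching shapes and is
# the only thing optimized when training the linearized network.
```

Expanding at later epochs just means taking `params_k` from a later checkpoint; everything else stays the same.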