Cofounded & running @ml_collective.
Host of Deep Learning Classics & Trends.
Research at Google DeepMind.
DEI/DIA Chair of ICLR & NeurIPS.
Writing https://t.co/IbycyGfnDR
Oct 9, 2024 • 10 tweets • 3 min read
Last day of the very special @COLM_conf !! Surprise, surprise, I am actually here to present a poster, rather than just tweet 😆
Stop by poster #3 this afternoon if you want to learn about training LMs entirely, and from scratch, on knowledge graphs! Why, How and What we learned.
First, why? As old-fashioned ML researchers, we were just very frustrated that **it's never clear what knowledge content is consumed in LM training**. The training data of regular LMs is huge, messy, inaccessible, ambiguous, & does not have a clear boundary of knowledge separation.
Jan 24, 2023 • 7 tweets • 6 min read
Now that we can write Tiny Papers @iclr_conf, what should we write about?
I'd like to invite all established researchers to contribute Tiny Ideas as inspirations, seeds for discussions & future collaborations! #TinyIdeasForTinyPapers
I'll start. Note: bad ideas == good starts.
1. Calibrate-before-train: before training every model with *data*, train them with noise to calibrate: loss function is to make sure they output "chance probability" — calibrate a model to be as neutral as possible before training starts. Does it help? Why or why not?
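One way to read the calibrate-before-train idea, as a rough sketch only: feed the model pure noise and minimize cross-entropy against a uniform target, so its outputs drift toward chance probability before any real data is seen. Everything below is an assumption made for illustration (a tiny linear softmax classifier, Gaussian noise inputs, plain SGD, and the hypothetical names `calibrate_on_noise` and `mean_gap_from_chance`); the thread does not specify a model, noise type, or optimizer.

```python
import math
import random


def softmax(logits):
    # Subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, b, x):
    # Linear softmax classifier: one weight row and one bias per class.
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)


def calibrate_on_noise(W, b, steps=2000, lr=0.1, seed=0):
    # Assumed calibration phase: SGD on Gaussian noise inputs, with
    # cross-entropy against the uniform ("chance") distribution as the loss.
    rng = random.Random(seed)
    k, d = len(W), len(W[0])
    chance = 1.0 / k
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        p = predict(W, b, x)
        for c in range(k):
            # Gradient of cross-entropy w.r.t. logit c is (p_c - target_c).
            g = p[c] - chance
            b[c] -= lr * g
            for j in range(d):
                W[c][j] -= lr * g * x[j]
    return W, b


def mean_gap_from_chance(W, b, n=200, seed=1):
    # Average worst-case distance between predicted probabilities and 1/k
    # on fresh noise inputs; smaller means "more neutral".
    rng = random.Random(seed)
    k, d = len(W), len(W[0])
    total = 0.0
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        p = predict(W, b, x)
        total += max(abs(pc - 1.0 / k) for pc in p)
    return total / n
```

Comparing `mean_gap_from_chance` before and after `calibrate_on_noise` on a randomly initialized model shows whether the noise phase moved it toward neutrality. Whether that helps the later training on real data is exactly the open question the idea poses.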
Jun 25, 2022 • 6 tweets • 3 min read
A quick thread on "How DALL-E 2, Imagen and Parti Architectures Differ" with breakdown into comparable modules, annotated with size 🧵 #dalle2 #imagen #parti
* figures taken from corresponding papers with slight modification
* parts used for training only are greyed out
By now we know that
- DALL-E & Imagen = diffusion; Parti = autoregressive
- Imagen & Parti use generic text encoders; DALL-E uses CLIP enc
But in fact, one version of Imagen also used CLIP, one version of DALL-E also had AR prior. So there are more connections than it seemed.
Dec 13, 2020 • 20 tweets • 10 min read
Favorite #NeurIPS2020 presentations and posters this year
PS: heavily biased by what I happened to catch and whom I happened to talk to
PPS: still catching up on talks, so the list is rather incomplete and I hope to grow it
PPPS: with contributions from @ml_collective members
[Talk] No. 1 has to go to -- keynote talk by @isbellHF @mlittmancs et al simply brilliant 🎉🎉 slideslive.com/38935825/you-c…