Cofounded & running @ml_collective.
Host of Deep Learning Classics & Trends.
Research at Google DeepMind.
DEI/DIA Chair of ICLR & NeurIPS.
Writing https://t.co/IbycyGfnDR
Oct 9 • 10 tweets • 3 min read
Last day of the very special @COLM_conf !! Surprise, surprise, I am actually here to present a poster, rather than just tweet 😆
Stop by poster #3 this afternoon if you want to learn about training LMs entirely from scratch on knowledge graphs! Why, how, and what we learned.
@COLM_conf First, why? As old-fashioned ML researchers we were just very frustrated that **it's never clear what knowledge content is consumed in LM training**. The training data of regular LMs is huge, messy, inaccessible, ambiguous, & has no clear boundary of knowledge separation.
Jan 24, 2023 • 7 tweets • 6 min read
Now that we can write Tiny Papers @iclr_conf, what should we write about?
I'd like to invite all established researchers to contribute Tiny Ideas as inspirations, seeds for discussions & future collaborations! #TinyIdeasForTinyPapers
I'll start. Note: bad ideas == good starts.
1. Calibrate-before-train: before training a model on *data*, train it on noise to calibrate: the loss function pushes it to output "chance probability", i.e. calibrate the model to be as neutral as possible before real training starts. Does it help? Why or why not?
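To make the idea concrete, here is a minimal sketch of what calibrate-before-train could look like. Everything here is my own assumption, not a prescribed recipe: a PyTorch classifier, Gaussian-noise inputs, and KL-to-uniform as the calibration loss.

```python
# Hypothetical sketch of "calibrate-before-train" for a classifier.
# Assumptions (mine, not from the tweet): PyTorch model, Gaussian
# noise inputs, KL divergence to the uniform distribution as the loss.
import torch
import torch.nn.functional as F

def calibrate_to_chance(model, num_classes, steps=500, batch_size=64,
                        input_shape=(3, 32, 32), lr=1e-3):
    """Pre-train `model` on pure noise so it outputs ~uniform probabilities."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    uniform = torch.full((batch_size, num_classes), 1.0 / num_classes)
    for _ in range(steps):
        noise = torch.randn(batch_size, *input_shape)   # no real data yet
        log_probs = F.log_softmax(model(noise), dim=-1)
        # Push predictions toward chance probability on noise inputs.
        loss = F.kl_div(log_probs, uniform, reduction="batchmean")
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```

After this warm-up, training on real data proceeds as usual; the open question posed above is whether starting from this neutral point helps at all.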
Jun 25, 2022 • 6 tweets • 3 min read
A quick thread on "How DALL-E 2, Imagen and Parti Architectures Differ" with a breakdown into comparable modules, annotated with sizes 🧵 #dalle2 #imagen #parti
* figures taken from corresponding papers with slight modification
* parts used for training only are greyed out
By now we know that
- DALL-E & Imagen = diffusion; Parti = autoregressive
- Imagen & Parti use generic text encoders; DALL-E uses the CLIP encoder
But in fact, one version of Imagen also used CLIP, and one version of DALL-E also had an AR prior. So there are more connections than there first seemed.
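For readers newer to these models, a highly simplified sketch of the two generation paradigms the thread contrasts. The shapes and helper callables (`text_encoder`, `denoiser`, `transformer`, `detokenizer`) are hypothetical placeholders of mine, not the papers' actual code or APIs.

```python
# Schematic contrast of diffusion vs. autoregressive text-to-image
# generation; placeholder callables, not DALL-E 2 / Imagen / Parti code.
import torch

def diffusion_generate(text, text_encoder, denoiser, num_steps=50):
    """DALL-E 2 / Imagen style: start from noise, iteratively denoise."""
    cond = text_encoder(text)           # CLIP or generic LM text embedding
    x = torch.randn(1, 3, 64, 64)       # pure-noise image at base resolution
    for t in reversed(range(num_steps)):
        x = denoiser(x, t, cond)        # predict & remove a bit of noise
    return x                            # super-res stages would upsample this

def autoregressive_generate(text, text_encoder, transformer, detokenizer,
                            num_tokens=1024):
    """Parti style: predict discrete image tokens one at a time, then decode."""
    cond = text_encoder(text)
    tokens = []
    for _ in range(num_tokens):
        tokens.append(transformer(cond, tokens))  # next image token
    return detokenizer(tokens)          # e.g. a ViT-VQGAN-style decoder
```

The structural difference is the loop: diffusion refines a whole image over denoising steps, while the autoregressive model grows a token sequence one position at a time.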
Dec 13, 2020 • 20 tweets • 10 min read
Favorite #NeurIPS2020 presentations and posters this year
PS: heavily biased by what I happened to catch and whom I happened to talk to
PPS: still catching up on talks, so the list is rather incomplete and I hope to grow it
PPPS: with contributions from @ml_collective members
[Talk] No. 1 has to go to the keynote talk by @isbellHFh @mlittmancs et al, simply brilliant 🎉🎉 slideslive.com/38935825/you-c…