Sara Hooker
I lead @CohereForAI. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, @trustworthy_ml. Changing spaces where breakthroughs happen.
Oct 4, 2024 11 tweets 3 min read
One of the biggest open questions is what is the limit of synthetic data.

Does training on synthetic data lead to mode collapse?

Or is there a path forward that could outperform current models?

What is missing from this conversation is that the success of synthetic data hinges on how you optimize in the data space.

A few recent papers highlight this tension well. On the dangers of synthetic data, there is an excellent paper released in Nature.

📜nature.com/articles/s4158…
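The mode-collapse worry can be illustrated with a toy simulation (a minimal sketch for intuition, not the Nature paper's setup): repeatedly refit a Gaussian to samples drawn from the previous generation's fit. Because each refit sees only a finite sample, the estimated spread tends to shrink over generations and the tails disappear first.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" data distribution: a standard normal.
mu, sigma = 0.0, 1.0
n = 50            # samples per generation (finite-sample effects drive collapse)
generations = 1000

sigmas = [sigma]
for _ in range(generations):
    # Each generation is trained only on samples from the previous model.
    samples = rng.normal(mu, sigma, size=n)
    mu, sigma = samples.mean(), samples.std()  # MLE refit (std is biased low)
    sigmas.append(sigma)

print(f"initial std: {sigmas[0]:.3f}, final std: {sigmas[-1]:.4f}")
```

The estimated standard deviation drifts downward generation after generation: a crude analogue of mode collapse when models train recursively on their own outputs.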
Jul 23, 2021 4 tweets 4 min read
How do you distinguish between sources of uncertainty?

This is important because the downstream remedies for atypical and noisy examples are very different.

Two of our workshop papers explore this from different perspectives. In subset ML network tomorrow, Neil Hu and Xinyu Hu explore where simply prioritizing challenging examples fails -- motivating a more nuanced distinction between sources of uncertainty.

w @jasonyo, @savvyRL

Workshop: bit.ly/3wXnrNT

Paper 📜: bit.ly/36ZIhlj
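One common way to operationalize this distinction (a sketch of the standard ensemble-based decomposition, not necessarily the workshop papers' method) is to split predictive entropy into an expected-entropy term, which tracks label noise, and a disagreement term, which tracks atypicality. The toy task and the noise injection below are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

def entropy(p, eps=1e-12):
    return -np.sum(p * np.log(p + eps), axis=-1)

# Hypothetical toy setup: a synthetic binary task with some labels flipped
# to inject label noise.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
y = y.copy()
y[:100] = 1 - y[:100]  # first 100 examples get noisy labels

# Train a small ensemble on bootstrap resamples of the (noisy) data.
rng = np.random.default_rng(0)
probs = []
for seed in range(10):
    idx = rng.integers(0, len(X), size=len(X))
    clf = RandomForestClassifier(n_estimators=20, random_state=seed)
    clf.fit(X[idx], y[idx])
    probs.append(clf.predict_proba(X))
probs = np.stack(probs)                  # (members, examples, classes)

total = entropy(probs.mean(axis=0))      # predictive entropy of the ensemble
aleatoric = entropy(probs).mean(axis=0)  # expected per-member entropy (noise)
epistemic = total - aleatoric            # member disagreement (atypicality)
```

High `aleatoric` with low `epistemic` suggests irreducible noise; high `epistemic` suggests a rare or atypical example the ensemble disagrees on -- and the two call for different remedies (relabeling vs. collecting more data).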
Feb 15, 2021 7 tweets 3 min read
Yesterday, I ended up in a debate where the position was "algorithmic bias is a data problem".

I thought this had already been well refuted within our research community but clearly not.

So, to say it yet again -- it is not just the data. The model matters.

1/n
We show this in our work on compression.

Pruning and quantizing deep neural networks amplify algorithmic bias.

arxiv.org/abs/2010.03058 and arxiv.org/abs/1911.05248
Nov 21, 2019 8 tweets 2 min read
What does a pruned deep neural network "forget"?

Very excited to share our recent work w Aaron Courville, Yann Dauphin and @DreFrome

weightpruningdamage.github.io

At face value, deep neural network pruning appears to promise you can (almost) have it all -- remove the majority of weights with minimal degradation to top-1 accuracy. In this work, we explore this trade-off by asking whether certain classes are disproportionately impacted.
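The per-class question can be probed with a small-scale sketch (an illustrative toy on scikit-learn's digits dataset, not the paper's actual experiments; `per_class_acc` is a hypothetical helper): magnitude-prune an MLP and compare per-class accuracy before and after. Top-line accuracy can hide the fact that the damage concentrates on a few classes.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
clf.fit(X_tr, y_tr)

def per_class_acc(model, X, y):
    """Accuracy computed separately for each class label."""
    pred = model.predict(X)
    return np.array([(pred[y == c] == c).mean() for c in np.unique(y)])

before = per_class_acc(clf, X_te, y_te)

# Global magnitude pruning: zero out the smallest 90% of weights in place.
all_w = np.concatenate([w.ravel() for w in clf.coefs_])
thresh = np.quantile(np.abs(all_w), 0.90)
for w in clf.coefs_:
    w[np.abs(w) < thresh] = 0.0

after = per_class_acc(clf, X_te, y_te)
drop = before - after
print("mean accuracy drop:", drop.mean())
print("hardest-hit class:", drop.argmax(), "drop:", drop.max())
```

Comparing `drop.max()` against `drop.mean()` is the toy version of the paper's question: a small average degradation can coexist with a large drop on a handful of classes.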