I lead @Cohere_Labs. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, ML reliability. Changing spaces where breakthroughs happen.
Apr 30 • 13 tweets • 5 min read
It is critical for scientific integrity that we trust our measure of progress.
The @lmarena_ai has become the go-to evaluation for AI progress.
Our release today demonstrates the difficulty in maintaining fair evaluations on @lmarena_ai, despite best intentions.
We spent 5 months analyzing 2.8M battles on the Arena, covering 238 models across 43 providers.
We show that preferential policies benefiting a handful of providers lead to overfitting to Arena-specific metrics rather than genuine AI progress.
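For context on what "Arena-specific metrics" means here: the leaderboard is derived from pairwise human-preference battles, typically aggregated with a Bradley-Terry-style rating. The sketch below is a minimal, illustrative fit of such ratings from battle outcomes; the model names and the simple gradient-ascent loop are placeholders, not the Arena's actual pipeline.

```python
import numpy as np

def fit_bradley_terry(battles, models, iters=1000, lr=0.1):
    """Fit Bradley-Terry strengths from pairwise battle outcomes.

    battles: list of (winner, loser) model-name pairs.
    Returns a dict mapping model name -> latent strength (log-scale).
    """
    idx = {m: i for i, m in enumerate(models)}
    theta = np.zeros(len(models))
    for _ in range(iters):
        grad = np.zeros_like(theta)
        for winner, loser in battles:
            w, l = idx[winner], idx[loser]
            # P(winner beats loser) under the current strengths
            p = 1.0 / (1.0 + np.exp(theta[l] - theta[w]))
            grad[w] += 1.0 - p
            grad[l] -= 1.0 - p
        theta += lr * grad / max(len(battles), 1)
        theta -= theta.mean()  # strengths are identifiable only up to a shift
    return {m: theta[i] for m, i in idx.items()}

# Toy example with placeholder model names
battles = [("model_a", "model_b"), ("model_a", "model_c"), ("model_b", "model_c")]
print(fit_bradley_terry(battles, ["model_a", "model_b", "model_c"]))
```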
Oct 4, 2024 • 11 tweets • 3 min read
One of the biggest open questions is: what is the limit of synthetic data?
Does training on synthetic data lead to mode collapse?
Or is there a path forward that could outperform current models?
What is missing from this conversation is that the success of synthetic data hinges on how you optimize in the data space.
A few recent papers highlight this tension well. On the dangers of synthetic data, there is an excellent paper released in Nature.
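To make "optimizing in the data space" concrete, here is one illustrative pattern: score synthetic generations with some quality signal and mix only the best ones with real data, rather than training on raw model output. The scoring function, ratios, and helper names below are placeholders for illustration, not a prescription from the papers discussed.

```python
import random

def curate_synthetic(synthetic_samples, score_fn, keep_fraction=0.25):
    """Keep only the top-scoring fraction of synthetic samples.

    score_fn is a stand-in for any quality signal (a reward model,
    a verifier, agreement between models, etc.).
    """
    ranked = sorted(synthetic_samples, key=score_fn, reverse=True)
    cutoff = max(1, int(len(ranked) * keep_fraction))
    return ranked[:cutoff]

def build_training_mix(real_data, synthetic_samples, score_fn, synthetic_ratio=0.5):
    """Mix real data with curated synthetic data instead of raw generations."""
    curated = curate_synthetic(synthetic_samples, score_fn)
    n_synth = int(len(real_data) * synthetic_ratio)
    return real_data + random.sample(curated, min(n_synth, len(curated)))

# Toy usage with a trivial stand-in scoring function
real = ["real_1", "real_2", "real_3", "real_4"]
synthetic = [f"synth_{i}" for i in range(20)]
mix = build_training_mix(real, synthetic, score_fn=len, synthetic_ratio=0.5)
```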
How do you distinguish between sources of uncertainty?
This is important because the downstream remedies for atypical and noisy examples are very different.
Two of our workshop papers explore this from different perspectives.
At the Subset ML workshop tomorrow, Neil Hu and Xinyu Hu explore where simply prioritizing challenging examples fails -- motivating a more nuanced distinction between sources of uncertainty.
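As a rough illustration of how one might separate the two sources in practice, the sketch below uses per-example training dynamics (mean confidence in the true label and its variability across epochs). The thresholds and the heuristic itself are assumptions for illustration, not the method from either workshop paper.

```python
import numpy as np

def training_dynamics_stats(confidences):
    """confidences: array of shape (n_epochs, n_examples) holding the model's
    confidence in the true label at each epoch of training."""
    mean_conf = confidences.mean(axis=0)    # how easily the example is fit
    variability = confidences.std(axis=0)   # how unstable learning is across epochs
    return mean_conf, variability

def split_by_uncertainty_source(mean_conf, variability,
                                conf_thresh=0.3, var_thresh=0.15):
    """Crude heuristic split: examples the model never fits (low mean confidence,
    low variability) are candidates for label noise; examples it fits
    inconsistently (higher variability) look more like atypical-but-learnable
    cases. The two groups call for different remedies (relabeling vs. upweighting)."""
    likely_noisy = (mean_conf < conf_thresh) & (variability < var_thresh)
    likely_atypical = (variability >= var_thresh) & (mean_conf < 1 - conf_thresh)
    return likely_noisy, likely_atypical

# Placeholder for logged per-epoch confidences of 100 examples over 10 epochs
conf = np.random.rand(10, 100)
noisy, atypical = split_by_uncertainty_source(*training_dynamics_stats(conf))
```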
Very excited to share our recent work with Aaron Courville, Yann Dauphin and @DreFrome
weightpruningdamage.github.io
At face value, deep neural network pruning appears to promise you can (almost) have it all — remove the majority of weights with minimal degradation to top-1 accuracy. In this work, we explore this trade-off by asking whether certain classes are disproportionately impacted.
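A minimal way to probe this trade-off, sketched below with PyTorch's pruning utilities: apply global magnitude pruning, then compare per-class accuracy of the dense and pruned models so that aggregate top-1 cannot hide uneven damage. This is an illustrative setup, not the exact protocol from the paper; the model, data loader, and sparsity level are placeholders.

```python
import torch
import torch.nn.utils.prune as prune

def global_magnitude_prune(model, amount=0.9):
    """Remove the smallest-magnitude weights globally across conv/linear layers."""
    params = [(m, "weight") for m in model.modules()
              if isinstance(m, (torch.nn.Linear, torch.nn.Conv2d))]
    prune.global_unstructured(params, pruning_method=prune.L1Unstructured, amount=amount)
    return model

@torch.no_grad()
def per_class_accuracy(model, loader, num_classes, device="cpu"):
    """Accuracy broken out per class."""
    correct = torch.zeros(num_classes)
    total = torch.zeros(num_classes)
    model.eval()
    for x, y in loader:
        preds = model(x.to(device)).argmax(dim=1).cpu()
        for c in range(num_classes):
            mask = y == c
            correct[c] += (preds[mask] == y[mask]).sum()
            total[c] += mask.sum()
    return correct / total.clamp(min=1)

# Usage sketch (model, test_loader, num_classes are placeholders):
# acc_dense = per_class_accuracy(model, test_loader, num_classes)
# acc_pruned = per_class_accuracy(global_magnitude_prune(model), test_loader, num_classes)
# impact = acc_dense - acc_pruned   # large per-class gaps flag disproportionate damage
```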