Rohan Taori
phd student @StanfordAILab🌲| proud @Cal alum 🐻 | prev taught w @BerkeleyML
Sep 15, 2022 6 tweets 3 min read
🎉 The last few weeks have seen the release of #StableDiffusion, #OPT, and other large models.

⚠️ But should we be concerned about an irreversible influx of AI content on the internet?

⚙️ Will this make it harder to collect clean training data for future AI models?

🧵👇 1/6 (thread based on recent work arxiv.org/pdf/2209.03942…)

Q: So what’s the root issue?

A: Biases in AI models are reflected in their outputs, which (if we’re not careful) become *training data* for future models!

These feedback cycles have the potential to get nasty.

2/6
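The feedback cycle described above can be sketched with a toy simulation (not from the linked paper; the `sharpen` parameter and the majority-class amplification rule are illustrative assumptions): each "generation" trains on the previous model's outputs, and a small bias toward the majority class compounds until the minority class disappears.

```python
import random

def simulate_feedback(p_true=0.6, sharpen=0.05, generations=10, n=10_000):
    """Toy model of a train-on-your-own-outputs loop.

    p_true: true share of the majority class in the original data.
    sharpen: how much each model over-represents the majority class
             (a stand-in for bias in generated content; hypothetical).
    """
    random.seed(0)
    p = p_true
    history = [p]
    for _ in range(generations):
        # "Train": estimate the class frequency from n sampled outputs
        # of the previous generation's model.
        samples = sum(random.random() < p for _ in range(n))
        p_hat = samples / n
        # "Generate": the biased model over-samples whichever class
        # is currently in the majority.
        p = min(1.0, p_hat + sharpen) if p_hat >= 0.5 else max(0.0, p_hat - sharpen)
        history.append(p)
    return history

hist = simulate_feedback()
print(hist[0], hist[-1])  # the majority share drifts from 0.6 toward 1.0
```

Even a tiny per-generation bias compounds geometrically once model outputs dominate the training pool, which is why the thread calls these cycles "nasty."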
Dec 8, 2020 8 tweets 4 min read
Reliability is a key challenge in ML. There are now dozens of robust training methods and datasets - how do they compare?

We ran 200+ ImageNet models on 200+ test sets to find out.
modestyachts.github.io/imagenet-testb…

TL;DR: Distribution shift is *really* hard, but common patterns emerge. To organize the 200 distribution shifts, we divide them into two categories: synthetic shifts and natural shifts.

Synthetic shifts are derived from existing images by perturbing them with noise, etc.

Natural shifts are new, unperturbed images from a different distribution.
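A minimal sketch of a synthetic shift of the kind described above: perturbing existing images with additive Gaussian noise (the function name, `sigma` value, and fake image batch are illustrative assumptions, not details from the testbed).

```python
import numpy as np

def gaussian_noise_shift(images, sigma=0.1, seed=0):
    """Create a synthetic distribution shift by perturbing existing
    images with additive Gaussian noise. Assumes pixel values in [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = images + rng.normal(0.0, sigma, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)

# A batch of 4 fake 32x32 RGB "images" standing in for real test data.
batch = np.random.default_rng(1).random((4, 32, 32, 3))
shifted = gaussian_noise_shift(batch, sigma=0.1)
print(shifted.shape)  # (4, 32, 32, 3)
```

A natural shift, by contrast, cannot be produced by transforming images you already have; it requires collecting new, unperturbed images from a different source distribution.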