Some takeaways from @openai's impressive recent progress, including GPT-3, CLIP, and DALL·E:

[THREAD]

👇1/
1) The raw power of dataset design.

These models aren't radically new in their architectures or training algorithms

Instead, their impressive quality comes largely from carefully training existing models at scale on large, diverse datasets that OpenAI designed and collected.

2/
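To make this concrete: the GPT-3 paper (Appendix A) describes filtering Common Crawl with a classifier trained to distinguish curated reference text from raw web text, then keeping each document stochastically based on its score. Below is a minimal sketch of that idea; the toy corpora and the sklearn stack are my stand-ins, since OpenAI's actual pipeline (Spark-based, at vastly larger scale) isn't public:

```python
# Sketch of GPT-3-style quality filtering (per the paper's Appendix A).
# The toy corpora and sklearn stack are illustrative stand-ins.
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import LogisticRegression

curated = [  # positives: the "high-quality" reference distribution
    "an edited long-form essay on the history of astronomy",
    "a well-sourced encyclopedia entry about photosynthesis",
]
raw_web = [  # negatives: unfiltered web text
    "CLICK HERE!!! free prizes no signup winners daily",
    "accept cookies login register terms of service sitemap",
]

vec = HashingVectorizer(n_features=2**18)
X = vec.transform(curated + raw_web)
y = [1] * len(curated) + [0] * len(raw_web)
clf = LogisticRegression().fit(X, y)

def keep(doc: str, alpha: float = 9.0) -> bool:
    # GPT-3's stochastic keep rule: retain the document if
    # np.random.pareto(alpha) > 1 - quality_score.
    score = clf.predict_proba(vec.transform([doc]))[0, 1]
    return np.random.pareto(alpha) > 1 - score

print(keep("a thoughtful blog post about dataset curation"))
```

The diversity then comes from whatever survives the filter at internet scale, not from any single curated source.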
Why does diverse data matter? Robustness.

Can't generalize out-of-domain? You might be able to make most things in-domain by training on the internet

But this power comes w/ a price: the internet has some extremely dark corners (and these datasets have been kept private)

3/
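One way to see this in action: CLIP (whose model weights and inference code OpenAI did release) does zero-shot classification by comparing an image against any label set you write down, with no task-specific training. A minimal sketch using the openai/CLIP package; the image path is a placeholder:

```python
# Zero-shot classification with CLIP
# (pip install git+https://github.com/openai/CLIP.git)
import torch
import clip
from PIL import Image

model, preprocess = clip.load("ViT-B/32", device="cpu")

image = preprocess(Image.open("photo.jpg")).unsqueeze(0)  # placeholder path
labels = ["a photo of a dog", "a photo of a cat", "a diagram of a circuit"]
text = clip.tokenize(labels)

with torch.no_grad():
    logits_per_image, _ = model(image, text)
    probs = logits_per_image.softmax(dim=-1)

print(dict(zip(labels, probs[0].tolist())))
```

Swap in almost any labels and it still works surprisingly well, because the web-scale training distribution already "covered" them.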
As @sh_reya puts it, the "data-ing" is often more important than the modeling

And @openai put *painstaking* effort into the data-ing for these models.

4/
2) The promise of "impact teams."

Teams that are moderately large, well-resourced, and laser-focused on an ambitious objective can accomplish a lot

5/
The @openai teams are multidisciplinary—different members work on algorithms, data collection, infra, evaluation, policy/fairness, etc

This is hard in academia, not just b/c of resources but also incentives—academia often doesn't assign credit well to members of large teams

6/
3) Soul-searching for academic researchers?

A lot of people around me are asking: what can I do in my PhD that will matter?

@chrmanning has a useful observation—we don't expect AeroAstro PhD students to build the next airliner

7/
I'm also optimistic here: I think there's a lot of opportunity for impact in academia, including advancing:
- efficiency
- equity
- domain-agnosticity
- safety + alignment
- evaluation
- theory
…and many other as-yet-undiscovered directions!

8/
4) We're entering a less open era of research.

Companies don't want to release these models, partly because of genuine safety concerns, but potentially also because of their bottom line

9/
Models are locked behind APIs, datasets are kept internal, and the public may only get to see a polished (but restricted) demo + blog post

10/
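For concreteness, this is roughly what studying GPT-3 looks like from the outside: a prompt in, text out, with the weights and data never leaving OpenAI. (A sketch using the openai Python client's then-current Completion interface; the key is a placeholder, and access itself was invite-gated.)

```python
import openai

openai.api_key = "sk-..."  # placeholder; real keys were invite-only

resp = openai.Completion.create(
    engine="davinci",  # the largest GPT-3 engine exposed via the API
    prompt="Q: Why does diverse training data improve robustness?\nA:",
    max_tokens=64,
    temperature=0.7,
)
print(resp["choices"][0]["text"])
```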
Limiting API access has safety benefits, but it could also give an extra advantage to the well-connected: established researchers or those with large Twitter followings

11/
Even when papers are published, important details are missing (e.g. key details of GPT-3's architecture or the data collection process)

It's becoming increasingly hard to study/improve these methods—just as they're edging closer and closer to widespread productionization.

12/
Ultimately, no one lab can do this alone—

We need smart new frameworks and mutual trust to overcome coordination challenges and ensure positive outcomes for society

🌉13/13
