Alex Tamkin

8 Dec, 13 tweets, 5 min read

DABS: A Domain-Agnostic Benchmark for Self-Supervised Learning

SSL is a promising technology, but current methods are field-specific. Can we find general algorithms that can be applied to any domain?

🌐: dabs.stanford.edu
📄: arxiv.org/abs/2111.12062

🧵👇 #NeurIPS2021

1/

Self-supervised learning (SSL) algorithms can drastically reduce the need for labeling by pretraining on unlabeled data

But designing SSL methods is hard and can require lots of domain-specific intuition and trial and error

2/

We designed DABS to drive progress in domain-agnostic SSL

Our benchmark addresses three core modeling components in SSL algorithms:

(1) architectures
(2) pretraining objectives
(3) transfer methods

3/

1) Architectures:

Most models are designed for particular modalities (e.g. ResNets for images)

But Transformers have recently been applied to many settings, and Perceivers are even more general

What architectures are general, efficient, and learn the best representations?

4/

2) Pretraining objectives:

We currently have domain-specific ways to extract signal from unlabeled data

Language modeling prevails in NLP, while contrastive learning is more common in vision

Can we uncover unifying principles and methods that work well on any domain?

5/

3) Transfer learning:

Full finetuning, linear evaluation, p-tuning, prompt tuning, prefix tuning… there's a whole range of techniques to adapt models to downstream tasks.

Do these work equally well across domains? What are the tradeoffs, and do better methods exist?

6/
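
As a rough illustration of the two ends of that range, the sketch below contrasts full finetuning with linear evaluation purely in terms of which parameters get updated. The `ParamGroup` type, the group names, and the parameter counts are hypothetical and are not taken from DABS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamGroup:
    name: str
    num_params: int
    pretrained: bool  # True if the weights came from SSL pretraining


def trainable_groups(groups, method):
    """Return the parameter groups that a transfer method updates."""
    if method == "full_finetuning":
        # Every weight, pretrained or not, receives gradient updates.
        return list(groups)
    if method == "linear_evaluation":
        # The pretrained encoder stays frozen; only the new head is trained.
        return [g for g in groups if not g.pretrained]
    raise ValueError(f"unknown transfer method: {method}")


# Hypothetical model: a pretrained encoder plus a freshly initialized linear head.
model = [
    ParamGroup("encoder", num_params=1_000_000, pretrained=True),
    ParamGroup("linear_head", num_params=2_570, pretrained=False),
]

for method in ("full_finetuning", "linear_evaluation"):
    count = sum(g.num_params for g in trainable_groups(model, method))
    print(f"{method}: {count} trainable parameters")
```

Prompt-style methods would sit between these two extremes: the encoder stays frozen, and a small set of new parameters is trained alongside it.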

Datasets & Domains

DABS is organized into 7 domains: natural images, speech, English-language text, multilingual text, wearable sensors, chest x-rays, and images w/ text descriptions.

Each domain has an unlabeled dataset for pretraining and downstream datasets for transfer

7/
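
The layout described above (one unlabeled pretraining set and several downstream tasks per domain) can be sketched as a simple evaluation loop. This is only an assumed shape, not the DABS codebase: `evaluate_across_domains`, the `pretrain` and `transfer` callables, the toy task names, and the choice to average scores within a domain are all hypothetical.

```python
from statistics import mean

# The seven domains listed in the thread.
DOMAINS = [
    "natural images",
    "speech",
    "English-language text",
    "multilingual text",
    "wearable sensors",
    "chest x-rays",
    "images with text descriptions",
]


def evaluate_across_domains(pretrain, transfer, downstream_tasks):
    """Run one algorithm, unchanged, on every domain.

    pretrain(domain) -> model        (would use the domain's unlabeled data)
    transfer(model, task) -> float   (score on one downstream task)
    downstream_tasks: dict mapping each domain to a list of task names
    """
    report = {}
    for domain in DOMAINS:
        model = pretrain(domain)
        scores = [transfer(model, task) for task in downstream_tasks[domain]]
        # Assumption: summarize a domain by its mean downstream score.
        report[domain] = mean(scores)
    return report


# Toy stand-ins so the sketch runs end to end (not real datasets or results).
tasks = {d: [f"{d} task {i}" for i in range(2)] for d in DOMAINS}
report = evaluate_across_domains(
    pretrain=lambda domain: {"domain": domain},
    transfer=lambda model, task: 0.5,
    downstream_tasks=tasks,
)
print(len(report), "domains evaluated")
```

The point of the structure is that `pretrain` and `transfer` receive no domain-specific configuration beyond the data itself.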

The goal is to find a *single* SSL algorithm that performs well across all of these domains

We kick off the challenge with two new baselines using transformers, where the pretraining objectives are based on the input embeddings. There's a lot of headroom left!

8/
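
The thread does not spell out the two baseline objectives, so the following is only one conceivable way an objective could operate on input embeddings rather than on raw inputs: move a random subset of embeddings to new positions and ask a model to detect which positions were changed. The function `shuffle_subset`, its parameters, and the toy vectors are assumptions for illustration, not the authors' method.

```python
import random


def shuffle_subset(embeddings, fraction, rng):
    """Corrupt a sequence by moving a random subset of its embeddings.

    Returns (corrupted, labels), where labels[i] is 1 if position i now
    holds an embedding moved from another position, else 0.
    """
    n = len(embeddings)
    k = max(2, round(fraction * n))  # at least two positions are needed to move anything
    if k > n:
        raise ValueError("sequence too short to shuffle")
    chosen = sorted(rng.sample(range(n), k))
    corrupted = list(embeddings)
    # Rotate the chosen positions by one so every chosen slot changes occupant.
    for src, dst in zip(chosen, chosen[1:] + chosen[:1]):
        corrupted[dst] = embeddings[src]
    labels = [1 if i in chosen else 0 for i in range(n)]
    return corrupted, labels


rng = random.Random(0)
sequence = [[float(i), float(i) + 0.5] for i in range(8)]  # 8 toy 2-d embeddings
corrupted, labels = shuffle_subset(sequence, fraction=0.25, rng=rng)
print(labels)
```

Because the corruption acts on embedding vectors, the same procedure applies whether the vectors came from image patches, audio frames, or text tokens; a model would then be trained to predict `labels` from `corrupted`.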

To assess real-world generalization, DABS is a *living benchmark*:

We'll be adding more domains focused on scientific and other real-world applications

Proposed algorithms will be tested on these new domains to see how well they hold up

9/

We hope DABS helps yield new insights about why / when SSL works, and helps make it a more mature technology that can be used off-the-shelf in scientific, medical, and other high-impact fields

10/

Also, if you're a domain expert interested in adding a domain for your field (unlabeled dataset + labeled downstream tasks), please reach out!

11/

This is joint work w/ Vincent Liu, Rongfei Lu, Daniel Fein, Colin Schultz, and Noah Goodman! @StanfordAILab @stanfordnlp

Stop by our #NeurIPS2021 poster on Friday, 8:30–10am PST 👋
neurips.cc/virtual/2021/p…

12/

Website: dabs.stanford.edu
Paper: arxiv.org/abs/2111.12062
Code: github.com/alextamkin/dabs

🌅13/13

More from @AlexTamkin

Alex Tamkin

@AlexTamkin

7 Dec

Love the "data science maturity levels" in @Patterns_CP

Interesting way to contextualize research at a glance (reminds me a bit of @justsaysinmice)

Full list in thread:

1) Concept

Basic principles of a new data science output observed and reported (e.g., statement of principles, dataset, new algorithm, new theoretical concept, theoretical system infrastructure)

2) Proof-of-concept

Data science output has been formulated, implemented, and tested for one domain/problem (e.g., dataset with rich domain-specific metadata, algorithm coded up as software, principles with expanded guidance on how to implement them)

Alex Tamkin

@AlexTamkin

11 Jan

Some takeaways from @openai's impressive recent progress, including GPT-3, CLIP, and DALL·E:

[THREAD]

👇1/

1) The raw power of dataset design.

These models aren't radically new in their architecture or training algorithm

Instead, their impressive quality is largely due to careful training at scale of existing models on large, diverse datasets that OpenAI designed and collected.

2/

Why does diverse data matter? Robustness.

Can't generalize out-of-domain? You might be able to make most things in-domain by training on the internet

But this power comes w/ a price: the internet has some extremely dark corners (and these datasets have been kept private)

3/
