How could you even begin to find theoretical guarantees for zero-shot learning? Our team takes a step in this direction, providing the first non-trivial theoretical guarantees for zero-shot learning with attributes.
Alessio, @CriMenghini, and the team show how to quantify the quality of the information in attribute-based descriptions of unseen classes. They analyze such descriptions and derive non-trivial lower bounds on error that *no* algorithm can guarantee to beat. [2/3]
P.S. @CriMenghini is on the job market this year! She's looking for industry research positions, and she has top-notch skills in both computational social science *AND* machine learning!
Recently, there has been a lot of interest in compositionality in large pre-trained models. We’re excited to share work led by Nihal Nayak and Peilin Yu on making learned prompts more compositional: arxiv.org/abs/2204.03574
A 🧵👇
We focus on compositional zero-shot learning. The task is to label classes composed of primitive concepts representing objects and attributes, e.g., "old cat" vs. "young cat" vs. "young dog". Perhaps unsurprisingly, CLIP doesn't do a great job on this task out of the box.
(2/8)
To that end, we propose compositional soft prompting (CSP), a new prompt-tuning method to represent primitive concepts in a composable way.
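The core idea of representing primitives composably can be illustrated with a toy sketch: keep one learnable soft-token vector per attribute and per object, and build the prompt for any (attribute, object) pair by recombining them. Everything below (the tiny embedding width, the mean-pooled cosine score, the random vectors standing in for trained soft tokens and a frozen image encoder) is a hypothetical stand-in, not the paper's actual CLIP-based implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 16  # toy embedding width; real CLIP text embeddings are much wider

# One soft-token vector per primitive concept (randomly initialized here;
# in the real method these would be tuned on seen attribute-object pairs).
attributes = {a: rng.normal(size=DIM) for a in ["old", "young"]}
objects = {o: rng.normal(size=DIM) for o in ["cat", "dog"]}
prefix = rng.normal(size=(3, DIM))  # stands in for a prompt prefix like "a photo of"

def compose_prompt(attr, obj):
    """Build the prompt for an (attribute, object) pair by stacking the
    shared prefix with the two primitive soft tokens."""
    return np.vstack([prefix, attributes[attr][None, :], objects[obj][None, :]])

def score(image_vec, prompt):
    """Cosine similarity between an image embedding and the mean-pooled prompt."""
    text_vec = prompt.mean(axis=0)
    return float(image_vec @ text_vec /
                 (np.linalg.norm(image_vec) * np.linalg.norm(text_vec)))

# At test time, unseen pairs are scored by recombining the learned primitives:
image = rng.normal(size=DIM)  # stand-in for a frozen image encoder's output
pairs = [(a, o) for a in attributes for o in objects]
best = max(pairs, key=lambda p: score(image, compose_prompt(*p)))
```

The point of the sketch is the composition step: because each primitive has its own vector, pairs never seen during training (say, "old dog") still get a prompt for free by recombination.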
If you’re at #ICLR2022, we hope you’ll check out our spotlight poster: “Multitask Prompted Training Enables Zero-Shot Task Generalization.” arxiv.org/abs/2110.08207
Poster session 5, Tue 1:30-3:30 ET
This work was a big undertaking from many at the @BigscienceW Workshop, particularly @SanhEstPasMoi, @albertwebson, @colinraffel, and @srush_nlp. It’s been awesome to see all the people already using and building on the T0 family of models for zero-shot learning.
There’s rightly been a lot of excitement around the zero-shot performance of T0 and similar, concurrent approaches like FLAN (ai.googleblog.com/2021/10/introd…).
I also want to highlight the data-centric side of the T0 work.