If you’re at #ICLR2022, hope you’ll check out our spotlighted poster: “Multitask Prompted Training Enables Zero-Shot Task Generalization.” arxiv.org/abs/2110.08207
Poster session 5, Tue 1:30-3:30 ET
This work was a big undertaking from many at the @BigscienceW Workshop, particularly @SanhEstPasMoi, @albertwebson, @colinraffel, and @srush_nlp. It’s been awesome to see all the people already using and building on the T0 family of models for zero-shot learning.
There’s rightly been a lot of excitement around the zero-shot performance of T0 and similar, concurrent approaches like FLAN (ai.googleblog.com/2021/10/introd…).
I also want to highlight the data-centric side of the T0 work.
The hypothesis behind T0 was that we could induce better zero-shot behavior by fine-tuning language models on examples of prompted tasks. But what prompts should we use? Our answer to this question was to iteratively build and refine a collection of over 2000 prompt templates.
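The idea of a prompt template can be sketched in a few lines. This is a toy illustration only (the real PromptSource templates are Jinja-based and richer); the function name and wording here are made up:

```python
# Toy sketch of a prompt template: a function that turns a raw
# task example into a natural-language prompt. (Hypothetical;
# PromptSource's actual templates are Jinja-based.)
def nli_template(premise: str, hypothesis: str) -> str:
    # Recast an NLI example as a question a language model can answer.
    return (f'Suppose "{premise}" Can we infer that '
            f'"{hypothesis}"? Yes, no, or maybe?')

prompt = nli_template("The cat sat on the mat.",
                      "An animal is on the mat.")
print(prompt)
```

Fine-tuning on many such prompted examples, across many tasks, is what induces the zero-shot behavior.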
We focused on encouraging robustness to prompt formulation. Our team developed multiple prompts for 170+ datasets. Applied to the corresponding datasets, these templates yield 100 million+ prompted examples, a subset of which we used to train T0.
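The multiplication works roughly like this — a toy sketch with made-up templates and data, just to show how a handful of templates per dataset scales into a large prompted training set:

```python
# Hypothetical sketch: each raw example crossed with several templates
# yields (num examples) x (num templates) prompted training examples.
templates = [
    lambda ex: f'Premise: {ex["premise"]}\nHypothesis: {ex["hypothesis"]}\nEntailment?',
    lambda ex: f'Does "{ex["premise"]}" imply "{ex["hypothesis"]}"?',
]

dataset = [
    {"premise": "The cat sat on the mat.",
     "hypothesis": "An animal is on the mat."},
    {"premise": "It rained all day.",
     "hypothesis": "The ground is dry."},
]

# Cross every example with every template.
prompted = [t(ex) for ex in dataset for t in templates]
print(len(prompted))  # 2 examples x 2 templates = 4 prompted examples
```

With thousands of templates over 170+ datasets, the same cross product reaches into the hundreds of millions of examples.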
Experiments in the paper show that training on multiple prompt formats for the same dataset does indeed increase robustness.
The takeaway: engineering the training data was a key part of making T0 work. Pilot experiments surfaced issues with our preliminary prompts, and along the way we developed guidelines for curating a diverse set of high-quality prompts.
T0 and PromptSource are just pieces of this emerging trend. For example, check out the awesome work on Natural Instructions led by @Swarooprm7, @yizhongwyz, and @DanielKhashabi. v1 was a precursor to PromptSource, and their new v2 is a massive set of detailed task instructions.