Thread Reader
Shayne Longpre (@ShayneRedford)
PhD @MIT @medialab. @Google Brain. Ex: @apple ML, @stanfordnlp CS. 🇨🇦 Interests: AI/ML/NLP, Social Science, Online Governance.
Feb 1 • 11 tweets • 7 min read
✨New Paper✨ What’s the best completely public competitor to #ChatGPT?
Flan-T5 beats all public models we tested:
Flan-T5 3B ▶️ T0++ 3B ▶️ OPT-IML 175B ▶️ GLM-130B ▶️ Flan 2021 3B ▶️ NIv2 3B
We release the @GoogleAI 🌟Flan Collection🌟 data + methods for Instruction Tuning!
1/
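A minimal sketch of querying the 3B model yourself, assuming the Hugging Face transformers library and the public google/flan-t5-xl checkpoint; the prompt is just an illustrative example:

```python
# Minimal sketch: query the public Flan-T5 3B checkpoint zero-shot.
# Assumes Hugging Face `transformers` is installed; the prompt is illustrative.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")   # 3B Flan-T5
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xl")

prompt = "Answer the following question: What is the capital of Canada?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```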
The 🌟Flan Collection🌟 (1st used in Flan-PaLM: bit.ly/3Zu7bU2):
➕ Merges the Flan 2021, P3, NIv2, and CoT instruction datasets into a collection of 1800+ datasets
➕ Data augmentations and mixing strategies
➕ 100s of new templates (a toy templating + mixing sketch follows below)
2/
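A toy sketch of what "templates + mixing" can look like in practice; the template text, task names, and helper functions below are hypothetical illustrations, not the released Flan Collection code:

```python
# Toy illustration of instruction templating + dataset mixing.
# Templates, task names, and helpers are hypothetical.
import random

TEMPLATES = {
    "nli": [
        "Premise: {premise}\nHypothesis: {hypothesis}\nDoes the premise entail the hypothesis? Options: yes, no.",
        "{premise}\nBased on the paragraph above, can we conclude that \"{hypothesis}\"? Options: yes, no.",
    ],
    "summarization": [
        "Summarize the following article:\n{text}",
        "{text}\nWrite a one-sentence summary of the text above.",
    ],
}

def apply_template(task, example):
    """Render one raw example with a randomly chosen instruction template."""
    return random.choice(TEMPLATES[task]).format(**example)

def mix(datasets, weights, n):
    """Yield n templated examples, sampling tasks proportional to `weights`."""
    tasks = list(datasets)
    probs = [weights[t] for t in tasks]
    for _ in range(n):
        task = random.choices(tasks, weights=probs, k=1)[0]
        yield apply_template(task, random.choice(datasets[task]))

# Example usage with two tiny in-memory "datasets":
data = {
    "nli": [{"premise": "A dog runs.", "hypothesis": "An animal moves."}],
    "summarization": [{"text": "The FLAN paper studies instruction tuning."}],
}
for prompt in mix(data, {"nli": 0.5, "summarization": 0.5}, 3):
    print(prompt, "\n---")
```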
Oct 6, 2022 • 13 tweets • 7 min read
📢 A 🧵 on the Trends in NLP Datasets.
What’s changed since SQuAD was all the rage in 2016? A: A LOT. 🔭
1. Generic ➡️ Niche Tasks
2. Task-specific Training+Eval ➡️ Eval Only
3. Dataset ➡️ Benchmark ➡️ Massive Collections
4. Datasets ➡️ Diagnostics
1/
What started as a trickle became an explosion of NLP datasets over the last few years.
Sebastian Ruder used to track all NLP datasets on his website: nlpprogress.com. It’s no longer possible to keep it up-to-date.
2/
Jun 14, 2022 • 16 tweets • 9 min read
📢 A 🧵 on the future of NLP model inputs.
What are the options and where are we going? 🔭
1. Task-specific finetuning (FT)
2. Zero-shot prompting
3. Few-shot prompting
4. Chain of thought (CoT)
5. Parameter-efficient finetuning (PEFT)
6. Dialog
(toy prompt examples for 2-4 below)
[1/]
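Toy prompt strings for options 2-4 above (zero-shot, few-shot, chain of thought); the task and wording are invented for illustration:

```python
# Toy examples of input styles 2-4 from the list above, written as raw
# prompt strings. The task and wording are made up for illustration.

zero_shot = "Q: Is 17 a prime number?\nA:"

few_shot = (
    "Q: Is 9 a prime number?\nA: no\n\n"
    "Q: Is 13 a prime number?\nA: yes\n\n"
    "Q: Is 17 a prime number?\nA:"
)

# Chain of thought: the demonstration includes a step-by-step rationale,
# nudging the model to reason before answering the final question.
chain_of_thought = (
    "Q: Is 15 a prime number?\n"
    "A: 15 = 3 x 5, so it has divisors other than 1 and itself. The answer is no.\n\n"
    "Q: Is 17 a prime number?\nA:"
)
```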
🌟Task-specific finetuning🌟
The traditional way to prepare NLP models for deployment. It usually obtains the best performance for a specific task, but:
(a) it requires many training examples
(b) it (often) specializes a model for ONE task and ONE data input format ONLY
(a minimal fine-tuning sketch follows below)
[2/]
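The fine-tuning sketch referenced above, assuming the Hugging Face transformers and datasets libraries; the dataset choice (SST-2) and hyperparameters are illustrative only:

```python
# Minimal sketch of option 1, task-specific fine-tuning, on a single task.
# Assumes Hugging Face `transformers` and `datasets`; settings are illustrative.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("glue", "sst2")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # One fixed input format: a single sentence, padded to max length.
    return tokenizer(batch["sentence"], truncation=True, padding="max_length")

dataset = dataset.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sst2-bert", num_train_epochs=3),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()  # the resulting model handles this ONE task, in this ONE format
```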
May 28, 2022 • 6 tweets • 6 min read
Sharing my *rough* slides from a @CCCatMIT February reading group.
Covers "NLP Training Trends for Large Language Models" (LLMs) and a survey of 4 new interesting papers: FLAN, T0, ExT5, MetaICL!
📚: bit.ly/3a3SxOj
[1/6]
The 1st paper we discuss is multi-task fine-tuning in FLAN, by @_jasonwei, @MaartenBosma, et al.
TLDR: Multi-task instruction tuning a 137B model on dozens of tasks vastly improves zero/few-shot learning
📜: arxiv.org/abs/2109.01652
[2/6]
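A hypothetical sketch of the held-out-cluster setup the TLDR describes: instruction-tune on many task clusters, then measure zero-shot performance on a cluster excluded from training; the cluster and task names below are illustrative, not the paper's exact mixture:

```python
# Hypothetical sketch of FLAN-style zero-shot evaluation on held-out clusters.
# Cluster and task names are illustrative.
TASK_CLUSTERS = {
    "nli": ["anli", "rte", "cb"],
    "sentiment": ["sst2", "imdb", "yelp"],
    "summarization": ["xsum", "cnn_dailymail"],
    "translation": ["wmt16_en_de", "wmt14_en_fr"],
}

def split_for_zero_shot(held_out):
    """Train on every cluster except `held_out`; evaluate zero-shot on it."""
    train = [t for c, tasks in TASK_CLUSTERS.items() if c != held_out for t in tasks]
    return train, TASK_CLUSTERS[held_out]

train_tasks, eval_tasks = split_for_zero_shot("nli")
# Instruction-tune the model on `train_tasks`, then prompt it on `eval_tasks`
# without it having seen any NLI data during tuning.
print(train_tasks, eval_tasks)
```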