
Shayne Longpre

@ShayneRedford

May 24 • 10 tweets • 10 min read


This semester my @CCCatMIT co-instructors and I taught #MIT's first post-#ChatGPT Generative AI course, covering:

➡️Uses and new abilities
➡️LM Evaluation
➡️AI-mediated communication
➡️Societal challenges

📜 Syllabus + reading list 📚: ai4comm.media.mit.edu

1/

It was a 🎢wild journey to teach in the midst of GPT-4 + Bard launches, moratorium letters, and raging online controversies every d*mn day.

We're excited to release our (and our students') learnings, slides, and the talks from our guest speakers.

Stay tuned!

2/

Over the next few days we'll post talks/talk summaries from:

➡️ @RishiBommasani guest lecture on Holistic Evaluation of Language Models

📜: crfm.stanford.edu/helm/latest/

3/

➡️ @_jasonwei on LLM Emergent Abilities as well as a general intro to LLMs

📜: ai.googleblog.com/2022/11/charac…

4/

➡️ @bakkermichiel on "Fine-tuning language models to find agreement among humans with diverse preferences"

📜: arxiv.org/pdf/2211.15006…

5/

➡️ @MinaLee__ on "Designing and Evaluating Language Models for Human Interaction"

📜: arxiv.org/abs/2212.09746 and arxiv.org/abs/2201.06796

6/

➡️ @informor on "My AI must have been broken": Understanding our Future of AI-Mediated Communication

📜: arxiv.org/abs/2206.07271 and dl.acm.org/doi/10.1145/32…

7/

➡️ @johnjhorton on "Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?"

📜: arxiv.org/abs/2301.07543

8/

As well as a panel on a variety of topics (organized by @Schropes) with several speakers: @_ziv_e @mattgroh @bcsaldias @trudypainter and our own instructor @hjian42!

9/

This course was designed and taught with my awesome fellow student co-instructors @Schropes @jad_kabbara @hjian42 @suyashfulay @dougb

🧵/


More from @ShayneRedford

Shayne Longpre

@ShayneRedford

May 22

#NewPaperAlert When and where does pretraining (PT) data matter?

We conduct the largest published PT data study, varying:
1⃣ Corpus age
2⃣ Quality/toxicity filters
3⃣ Domain composition

We have several recs for model creators…
📜: bit.ly/3WxsxyY

1/ 🧵

First, PT data selection is mired in mysticism.

1⃣ Documentation Debt: #PALM2 & #GPT4 don't document their data
2⃣ PT is expensive ➡️ experiments are sparse
3⃣ So public data choices are largely guided by ⚡️intuition, rumors, and partial info⚡️

2/

PT is the foundation of data-centric and modern LMs. This research was expensive but important to shed light on open questions in training data design.

Here are our main findings:

3/

Read 17 tweets

Shayne Longpre

@ShayneRedford

Mar 28

What dates📅 can @OpenAI, @AnthropicAI, @CohereAI models reliably answer questions for?🔭

I binary-search through "future" Wiki events to find out. Results ❌🟰❌documentation:

#GPT4 ➡️~Dec 19 ('21)
#ChatGPT ➡️~Oct 24
Claude v1.2➡️~Oct 10
Cohere XL Nightly➡️~Apr 24 ('22)

1/🧵

GPT4 says it is trained up to Sept 2021.

I found it correctly answers unknowable events in Oct, Nov, and even Dec 11th & 19th.

In late Dec it begins to abstain.

2/

Interestingly, GPT 3.5 "Default" answers correctly only until ~Oct 24, 2021, but GPT 3.5 "Legacy" answers correctly until ~Oct 31, 2021 then begins hallucinating false answers or abstaining in Nov.

Perhaps this is due to finetuning rather than pretraining data?

3/
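
The binary search over dated events described in this thread could be sketched roughly as follows. This is a minimal illustration, not the author's actual procedure: the function name `last_answerable_index`, the `answers_correctly` callback, and the mock cutoff are all hypothetical, and the sketch assumes the model answers every event up to some date and none after it. Real models are noisier than that (the thread itself notes hallucination and abstention near the boundary), so in practice each probe would likely need several questions.

```python
from datetime import date
from typing import Callable, Sequence


def last_answerable_index(
    event_dates: Sequence[date],
    answers_correctly: Callable[[date], bool],
) -> int:
    """Binary-search chronologically sorted events for the last one answered.

    Assumes (hypothetically) that knowledge is monotonic in time: the model
    answers all events up to a cutoff and none after. Returns -1 if the
    model answers none of the events.
    """
    lo, hi = 0, len(event_dates) - 1
    last_ok = -1
    while lo <= hi:
        mid = (lo + hi) // 2
        if answers_correctly(event_dates[mid]):
            last_ok = mid      # known event: the cutoff is here or later
            lo = mid + 1
        else:
            hi = mid - 1       # unknown event: the cutoff is earlier
    return last_ok


# Mock stand-in for querying a model about one event per month.
events = [date(2021, m, 15) for m in range(6, 13)] + [date(2022, 1, 15)]
mock_cutoff = date(2021, 10, 31)  # illustrative value only
idx = last_answerable_index(events, lambda d: d <= mock_cutoff)
print(events[idx])
```

Each probe halves the remaining date range, so locating the boundary among N candidate events takes about log2(N) model queries rather than N.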

Read 7 tweets

Shayne Longpre

@ShayneRedford

Feb 27

🔭 A 🧵 on @OpenAI LLM "Alignment" (e.g. #ChatGPT)

Q: How does this differ from publicly available "Instruction Tuning" (IT)?

A: Proprietary Alignment is actually 3 separate components:

1⃣ Instruction tuning
2⃣ ➕ Open-ended generation/creative prompts
3⃣ ➕ Human feedback

1/

Component 1⃣:

Instruction Tuning, in its simplest form, teaches the model to follow/answer instructions, instead of generating plausible continuations.

E.g. see @GoogleAI's Flan Collection: arxiv.org/abs/2301.13688

2/

Instruction Tuning public collections are made of 95%+:
➡️ academic,
➡️ short-answer,
➡️ traditional,
NLP tasks. This is a limitation.

3/

Read 17 tweets

Shayne Longpre

@ShayneRedford

Feb 1

✨New Paper✨What’s the best completely public competitor to #ChatGPT?

Flan-T5 beats all public models we tested:
Flan-T5 3B ▶️ T0++ 3B ▶️ OPT-IML 175B ▶️ GLM-130B ▶️ Flan 2021 3B ▶️ NIv2 3B

We release the @GoogleAI 🌟Flan Collection🌟data + methods for Instruction Tuning!

1/

The 🌟Flan Collection🌟 (1st used in Flan-PaLM bit.ly/3Zu7bU2):

➕ Merges Flan 2021, P3, NIv2, CoT instruction-datasets into 1800+ dataset collection
➕ Data augmentations and mixing strategies
➕ 100s new templates

2/

This yields the best performing instruction tuning collection that has been compiled and released into one repo.

See our survey Figure of the prior works we built on to produce this compilation.

3/

Read 11 tweets

Shayne Longpre

@ShayneRedford

Oct 6, 2022

📢 A 🧵 on the Trends in NLP Datasets.

What’s changed since SQuAD was all the rage in 2016? A: A LOT. 🔭

1. Generic ➡️ Niche Tasks
2. Task-specific Training+Eval ➡️ Eval Only
3. Dataset ➡️ Benchmark ➡️ Massive Collections
4. Datasets ➡️ Diagnostics

1/

What started as a trickle became an explosion of NLP datasets over the last few years.

Sebastian Ruder used to track all NLP datasets on his website: nlpprogress.com. It’s no longer possible to keep it up to date.

2/

🌟 Trend 1 🌟 Generic datasets are replaced with more niche datasets.

⏳ Before: datasets released for general tasks.

⌛️ Now: We see tasks targeting hyper-specific abilities.

Exs:

3/

Read 13 tweets

Shayne Longpre

@ShayneRedford

Jun 14, 2022

📢 A 🧵on the future of NLP model inputs.

What are the options and where are we going? 🔭

1. Task-specific finetuning (FT)
2. Zero-shot prompting
3. Few-shot prompting
4. Chain of thought (CoT)
5. Parameter-efficient finetuning (PEFT)
6. Dialog

[1/]

🌟Task-specific finetuning 🌟

This is the traditional way to prepare NLP models for deployment. It usually obtains the best performance for a specific task, but:

(a) it requires many training examples
(b) it (often) specializes a model for ONE task and ONE data input format ONLY

[2/]

Because large language models (LLMs):

(a) can be v expensive to train, and
(b) have emergent capabilities to interpret a NEW task from only an instruction

researchers are experimenting with new strategies to get model predictions…

[3/]
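
To make the difference between options 2 to 4 in the list above concrete, here is a minimal sketch of how zero-shot, few-shot, and chain-of-thought prompts might be assembled as plain strings. The function names and the "Input:/Output:" layout are illustrative assumptions, not a format prescribed by the thread, and the "Let's think step by step." suffix is just one commonly used zero-shot CoT trigger.

```python
from typing import Sequence, Tuple


def zero_shot(instruction: str, query: str) -> str:
    """Instruction plus the query, with no worked examples."""
    return f"{instruction}\n\nInput: {query}\nOutput:"


def few_shot(
    instruction: str, examples: Sequence[Tuple[str, str]], query: str
) -> str:
    """Instruction, a handful of solved examples, then the query."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{instruction}\n\n{shots}\n\nInput: {query}\nOutput:"


def chain_of_thought(instruction: str, query: str) -> str:
    """Zero-shot prompt that nudges the model to reason before answering."""
    return zero_shot(instruction, query) + " Let's think step by step."


instruction = "Classify the sentiment as positive or negative."
examples = [("I loved it.", "positive"), ("Waste of time.", "negative")]
print(few_shot(instruction, examples, "Surprisingly good."))
```

None of these change the model's weights; they only change the text the model is conditioned on, which is what separates them from task-specific finetuning and PEFT.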

Read 16 tweets
