Samuel Albanie
Researcher @GoogleDeepMind
May 15, 2023 · 25 tweets
Another week, another full bucket of AI news.

Some highlights...

🧵1/25

Language models can explain neurons in language models

- Aims to scale up interpretability to large language models

- Exploits the ability of GPT-4 to simulate neurons (a rough sketch of the explain / simulate / score loop follows below)

by S. Bills, @nickcammarata, @mildseasoning, @HenkTillman, @nabla_theta, @WuTheFWasThat, @janleike

2/25
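Rough shape of the approach: GPT-4 writes an explanation of a neuron from (token, activation) examples, then simulates activations from that explanation alone, and the explanation is scored by how well the simulation matches the real activations. A minimal sketch under those assumptions; the prompts and the `call_gpt4` helper are placeholders, not the authors' code:

```python
import numpy as np

def call_gpt4(prompt: str) -> str:
    """Placeholder for a GPT-4 call; swap in a real API client here (assumed helper)."""
    raise NotImplementedError

def explain_neuron(excerpts, activations):
    # Step 1: show GPT-4 (token, activation) pairs and ask for a one-sentence explanation.
    examples = "\n".join(
        f"{tok}\t{act:.2f}"
        for toks, acts in zip(excerpts, activations)
        for tok, act in zip(toks, acts)
    )
    return call_gpt4(
        "Here are tokens and a neuron's activations on them:\n"
        f"{examples}\n"
        "In one sentence, what pattern does this neuron respond to?"
    )

def simulate_neuron(explanation, tokens):
    # Step 2: given only the explanation, ask GPT-4 to guess the activation on each token.
    reply = call_gpt4(
        f"A neuron is described as: {explanation}\n"
        f"For each token in: {' '.join(tokens)}\n"
        "predict its activation on a 0-10 scale, one number per token, space-separated."
    )
    return np.array([float(x) for x in reply.split()])

def score_explanation(explanation, held_out_excerpts, true_activations):
    # Step 3: score the explanation by correlating simulated and real activations.
    simulated = np.concatenate([simulate_neuron(explanation, t) for t in held_out_excerpts])
    real = np.concatenate([np.asarray(a, dtype=float) for a in true_activations])
    return float(np.corrcoef(simulated, real)[0, 1])
```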
Mar 31, 2023 · 5 tweets
1/ 🚀🔬 Introducing our groundbreaking research paper: "Large Language Models are Few-shot Publication Scoopers"

We've discovered the secret to achieving personal glory and a lifetime supply of Cheerios
Joint work with
@LiliMomeni and J. F. Henriques

Appears @sigbovik today

2/ 🏃💨 Tired of racing to publish your next high-impact research?

Our revolutionary pip-to-the-post algo. ensures adulatory Wikipedia pages without risking your career on conventional research strategies

Scoop with the insouciance of a seasoned researcher at a dessert buffet🍨
Jan 24, 2023 · 21 tweets
BLOOM.

A large language model trained by researchers from around the world by @BigscienceW.

How did they do it?

Why did they do it?

Let's dive in.

1/21 🧵

Large Language Models (LLMs) now play a key role in NLP.

But few orgs can afford to train them.

Also:
- most LLMs focus on English
- many are not public

Goals for BLOOM:
- release a strong multilingual LLM
- document the development process

2/21
Nov 7, 2022 · 17 tweets
Multitask prompted finetuning (aka instruction finetuning) can boost language model performance.

But how can we make progress beyond English (esp. on languages with limited finetuning data)?

Work by @Muennighoff & others in @BigscienceW studies this in detail.

1/17 🧵

For this study, datasets spanning 46 languages were gathered (collectively referred to as "xP3").

xP3 aims to mimic the distribution of languages found in ROOTS (the dataset used to pretrain BLOOM).

2/17
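One concrete way to read "mimic the distribution": allocate prompted finetuning examples per language roughly in proportion to that language's share of the pretraining corpus. A toy sketch under that reading; the shares and helper below are invented for illustration, not the actual ROOTS numbers:

```python
# Hypothetical language shares of the pretraining corpus (illustrative, not the real ROOTS figures).
pretraining_share = {"en": 0.30, "zh": 0.16, "fr": 0.13, "es": 0.11, "ar": 0.05, "code": 0.11}

def allocate_examples(n_total: int, shares: dict) -> dict:
    """Allocate finetuning examples per language in proportion to pretraining shares."""
    total = sum(shares.values())
    return {lang: round(n_total * share / total) for lang, share in shares.items()}

print(allocate_examples(100_000, pretraining_share))
```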
Oct 28, 2022 · 12 tweets
Finetuning language models on instructions increasingly looks like a compute-efficient way to improve performance.

Recent work from @hwchung27, @_jasonwei, @JeffDean, @quocleix & others scales this up to new regimes.

TLDR: Even for big models (540B params), gains are substantial.

1/12

For those who prefer a narrated version:

2/12
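For context, instruction finetuning boils down to rewriting existing supervised tasks as natural-language instructions and finetuning on the resulting (input, target) pairs. A toy illustration of that reformatting; the templates are made up, not the ones used in the paper:

```python
# Toy instruction templates (illustrative only, not the paper's templates).
TEMPLATES = {
    "nli": (
        "Premise: {premise}\nHypothesis: {hypothesis}\n"
        "Does the premise entail the hypothesis? Answer yes, no, or maybe."
    ),
    "summarization": "Summarize the following article in one sentence:\n{article}",
}

def to_instruction_example(task: str, fields: dict, target: str) -> dict:
    """Turn a raw supervised example into an (input, target) pair for finetuning."""
    return {"input": TEMPLATES[task].format(**fields), "target": target}

example = to_instruction_example(
    "nli",
    {"premise": "A man is playing a guitar.", "hypothesis": "Someone is making music."},
    "yes",
)
print(example["input"])
print(example["target"])
```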
Oct 28, 2022 · 6 tweets
How can we reduce the computational cost of training neural networks?

Bo Zhao, Hakan Bilen and collaborators have produced a creative body of work developing a technique known as "dataset condensation".

1/7

Key idea: compress a large dataset into a small set of synthetic images that can train networks to the same accuracy as the original dataset.

Was a pleasure to examine Bo's thesis on this topic with @driainmurray.

2/7
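One popular instantiation of this idea is gradient matching: learn the synthetic images so that the gradients a network computes on them track the gradients it computes on real data. A minimal PyTorch sketch under that framing; the network, shapes and the plain squared-error matching loss are illustrative simplifications, not necessarily the exact objective used in the papers:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def grad_match_step(net, real_x, real_y, syn_x, syn_y, syn_opt):
    """Update the synthetic images so gradients on them match gradients on real data."""
    params = [p for p in net.parameters() if p.requires_grad]

    # Gradients of the task loss on a real batch (treated as the target).
    g_real = torch.autograd.grad(F.cross_entropy(net(real_x), real_y), params)
    g_real = [g.detach() for g in g_real]

    # Gradients on the synthetic batch; keep the graph so we can backprop into syn_x.
    g_syn = torch.autograd.grad(
        F.cross_entropy(net(syn_x), syn_y), params, create_graph=True
    )

    # Matching loss: squared distance between the two gradient sets, layer by layer.
    loss = sum(((gs - gr) ** 2).sum() for gs, gr in zip(g_syn, g_real))

    syn_opt.zero_grad()
    loss.backward()  # gradients flow into the learnable synthetic images
    syn_opt.step()
    return loss.item()

# Toy usage: condense a 10-class image dataset into 10 synthetic images per class (shapes assumed).
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU(), nn.Linear(128, 10))
syn_x = torch.randn(100, 3, 32, 32, requires_grad=True)  # learnable synthetic images
syn_y = torch.arange(10).repeat_interleave(10)           # fixed, balanced labels
syn_opt = torch.optim.SGD([syn_x], lr=0.1)

real_x, real_y = torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,))
print(grad_match_step(net, real_x, real_y, syn_x, syn_y, syn_opt))
```

In the full method this step is repeated over many network initialisations and training stages, so the condensed set works for freshly initialised networks rather than a single fixed one.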