Samuel Albanie
@SamuelAlbanie
Researcher @GoogleDeepMind
May 15, 2023 • 25 tweets • 18 min read
Another week, another full bucket of AI news.
Some highlights...
🧵1/25
Language models can explain neurons in language models
- Aims to scale up interpretability to large language models
- Exploits ability of GPT-4 to simulate neurons
by S. Bills, @nickcammarata, @mildseasoning, @HenkTillman, @nabla_theta, @WuTheFWasThat, @janleike
2/25
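For a concrete sense of the method, here is a minimal sketch of the explain-and-simulate loop: GPT-4 is shown (token, activation) excerpts for a neuron and asked for an explanation; given only that explanation, it then predicts activations, which are scored against the real ones (a simple correlation score here, in the spirit of the paper's correlation-based scoring). The `query_gpt4` helper and the prompt wording are hypothetical placeholders, not the paper's actual prompts.

```python
import numpy as np

def query_gpt4(prompt: str) -> str:
    """Hypothetical helper standing in for a real GPT-4 API call."""
    raise NotImplementedError

def explain_neuron(records: list[tuple[list[str], list[float]]]) -> str:
    """Ask GPT-4 to explain a neuron from (tokens, activations) excerpts
    where it fires strongly."""
    excerpts = "\n".join(
        " ".join(f"{tok}({act:.1f})" for tok, act in zip(toks, acts))
        for toks, acts in records
    )
    prompt = (
        "These excerpts show per-token activations of one neuron:\n"
        f"{excerpts}\n"
        "In one sentence, what does this neuron respond to?"
    )
    return query_gpt4(prompt)

def simulate_activations(explanation: str, tokens: list[str]) -> list[float]:
    """Given only the explanation, ask GPT-4 to predict an activation
    (0-10) for each token; parse a comma-separated reply."""
    prompt = (
        f"A neuron is described as: {explanation}\n"
        f"Predict its activation (0-10) on each of these tokens: {tokens}\n"
        "Reply with comma-separated numbers only."
    )
    return [float(x) for x in query_gpt4(prompt).split(",")]

def score_explanation(real: list[float], simulated: list[float]) -> float:
    """Score the explanation by how well simulated activations track real ones."""
    return float(np.corrcoef(real, simulated)[0, 1])
```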
Mar 31, 2023 • 5 tweets • 3 min read
1/
🚀🔬 Introducing our groundbreaking research paper: "Large Language Models are Few-shot Publication Scoopers"
We've discovered the secret to achieving personal glory and a lifetime supply of Cheerios
Joint work with @LiliMomeni and J. F. Henriques. Appears at @sigbovik today.
2/
🏃💨 Tired of racing to publish your next high-impact research?
Our revolutionary pip-to-the-post algo. ensures adulatory Wikipedia pages without risking your career on conventional research strategies
Scoop with the insouciance of a seasoned researcher at a dessert buffet🍨
Jan 24, 2023 • 21 tweets • 8 min read
BLOOM.
A large language model trained by researchers from around the world through @BigscienceW.
How did they do it?
Why did they do it?
Let's dive in.
1/21 🧵
Large Language Models (LLMs) now play a key role in NLP.
But few orgs can afford to train them.
Also:
- most LLMs focus on English
- many are not public
Goals for BLOOM
- release a strong multilingual LLM
- document the development process
2/21
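Not from the thread, but as a usage sketch: released BLOOM checkpoints can be loaded through the Hugging Face `transformers` API. The snippet below assumes the small `bigscience/bloom-560m` variant so it fits on modest hardware; the full 176B checkpoint (`bigscience/bloom`) uses the same interface but needs far more memory.

```python
# A minimal sketch of loading a released BLOOM checkpoint via Hugging Face transformers.
# "bigscience/bloom-560m" is the small public variant; the full model is "bigscience/bloom".
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bigscience/bloom-560m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# BLOOM is multilingual, so non-English prompts are fair game.
inputs = tokenizer("Le modèle BLOOM a été entraîné pour", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```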
Nov 7, 2022 • 17 tweets • 10 min read
Multitask prompted finetuning (aka instruction finetuning) can boost language model performance.
But how can we make progress beyond English (esp. on languages with limited finetuning data)?
Work by @Muennighoff & others in @BigscienceW studies this in detail.
1/17 🧵
For this study, datasets spanning 46 languages were gathered (collectively referred to as "xP3").
xP3 aims to mimic the distribution of languages found in ROOTS (the dataset used to pretrain BLOOM).
2/17
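As a toy illustration of what "multitask prompted finetuning" data looks like, the sketch below applies an English prompt template to task examples in other languages, yielding (input, target) text pairs in the spirit of xP3. The template, label names, and examples are invented for illustration and are not the actual xP3 prompts.

```python
# Toy sketch: turn raw task data (any language) into prompted (input, target)
# pairs for multitask finetuning, xP3-style (English prompts, multilingual data).

NLI_TEMPLATE = (
    'Premise: "{premise}"\nHypothesis: "{hypothesis}"\n'
    "Does the premise entail the hypothesis? Yes, no, or maybe?"
)
LABELS = {0: "Yes", 1: "Maybe", 2: "No"}

def to_prompted_example(example: dict) -> dict:
    """Turn one raw NLI example into an (input, target) text pair."""
    return {
        "input": NLI_TEMPLATE.format(
            premise=example["premise"], hypothesis=example["hypothesis"]
        ),
        "target": LABELS[example["label"]],
    }

raw_examples = [
    {"premise": "El gato duerme en el sofá.", "hypothesis": "El gato está despierto.", "label": 2},
    {"premise": "La voiture est rouge.", "hypothesis": "La voiture a une couleur.", "label": 0},
]

for ex in (to_prompted_example(e) for e in raw_examples):
    print(ex["input"], "->", ex["target"])
```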
Oct 28, 2022 • 12 tweets • 6 min read
Finetuning language models on instructions increasingly seems a compute-efficient way to gain performance.
Recent work from @hwchung27, @_jasonwei, @JeffDean, @quocleix & others scales this up to new regimes.
TLDR: Even for big models (540B params), gains are substantial.
1/12
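For intuition about what a single instruction-finetuning step involves, here is a minimal sketch on a small causal LM. The `gpt2` checkpoint and the single toy (instruction, response) pair are stand-ins; the Flan-style recipe studied in the paper mixes thousands of tasks at vastly larger scale and compute.

```python
# One instruction-finetuning step: compute loss only on the response tokens.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

instruction = "Translate to French: The weather is nice today.\nAnswer:"
response = " Il fait beau aujourd'hui."

prompt_ids = tokenizer(instruction, return_tensors="pt").input_ids
full_ids = tokenizer(instruction + response, return_tensors="pt").input_ids

# Mask out the instruction tokens with -100 so the loss covers the response only.
labels = full_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100

loss = model(input_ids=full_ids, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"loss: {loss.item():.3f}")
```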
For those who prefer a narrated version:
2/12
Oct 28, 2022 • 6 tweets • 5 min read
How can we reduce the computational cost of training neural networks?
Bo Zhao, Hakan Bilen and collaborators have produced a creative body of work developing a technique known as "dataset condensation".
1/7
Key idea: compress a large dataset into a small set of synthetic images that can train networks to the same accuracy as the original dataset.
Was a pleasure to examine Bo's thesis on this topic with @driainmurray.
2/7
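The thread refers to Zhao & Bilen's gradient-matching formulation of dataset condensation; below is a toy, self-contained sketch of that idea. The tiny MLP, random stand-in "real" data, short loop, and plain L2 gradient distance are simplifications for illustration; the actual papers use convolutional nets, real datasets, and a more careful matching objective.

```python
# Toy dataset condensation by gradient matching: learn a few synthetic images
# whose training gradients mimic those produced by the real data.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in "real" data: 10-class, 32x32 grayscale images (random here).
real_x = torch.randn(256, 1, 32, 32)
real_y = torch.randint(0, 10, (256,))

# Learnable synthetic set: one image per class.
syn_x = torch.randn(10, 1, 32, 32, requires_grad=True)
syn_y = torch.arange(10)
syn_opt = torch.optim.SGD([syn_x], lr=0.1)

def make_net() -> nn.Module:
    return nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 128), nn.ReLU(), nn.Linear(128, 10))

def grads(net: nn.Module, x: torch.Tensor, y: torch.Tensor):
    loss = F.cross_entropy(net(x), y)
    return torch.autograd.grad(loss, net.parameters(), create_graph=True)

for step in range(100):
    net = make_net()  # fresh random initialization each outer step
    g_real = [g.detach() for g in grads(net, real_x, real_y)]
    g_syn = grads(net, syn_x, syn_y)
    # Match gradients from the synthetic and real batches (simple L2 distance here).
    match = sum(((a - b) ** 2).sum() for a, b in zip(g_syn, g_real))
    syn_opt.zero_grad()
    match.backward()
    syn_opt.step()
```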