We’re at a point where these models are capable enough to perform many tasks. Optimization now becomes just as important as scaling up further.
Techniques like Mixture of Experts, PPLM, distillation, and random feature attention are all being actively researched.
These will reduce both costs and compute needs, while giving developers more control over large language models.
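To make one of these techniques concrete, here is a minimal sketch of the core of knowledge distillation: the student is trained to match the teacher's temperature-softened output distribution. The function names and the two-class logits below are illustrative, not from any specific library.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution,
    optionally softened by a temperature > 1."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the
    student's, scaled by T^2 (the standard distillation scaling)."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return kl * temperature ** 2
```

Minimizing this loss pushes a small, cheap student model toward the behavior of a large teacher, which is one way the cost reductions above get realized in practice.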
The largest models (GPT-3, Turing-NLG, etc.) already have lots of knowledge and capabilities. The question is, how do we more effectively, reliably, and systematically retrieve that knowledge?
As answers to this question become clearer, language models will become more useful.
“We argue that algorithmic progress has an aspect that is both straightforward to measure and interesting: reductions over time in the compute needed to reach past capabilities.”
We’re seeing algorithmic efficiency doubling every 16 months.
By the end of 2021, training a GPT-3-sized model should cost around half of what it cost in early 2020.
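The cost claim follows directly from the doubling rate. A rough sketch of the math, assuming algorithmic efficiency doubles (so cost halves) every 16 months:

```python
def cost_multiplier(months_elapsed, doubling_period=16):
    """Relative training cost after `months_elapsed` months, assuming
    the compute needed for a fixed capability halves every
    `doubling_period` months."""
    return 0.5 ** (months_elapsed / doubling_period)

# After one doubling period (16 months), cost is half.
# After two periods (32 months), cost is a quarter.
```

This is an extrapolation of a measured trend, not a guarantee, but it shows why training costs for a fixed model size fall so quickly.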
Hundreds of products are being built on top of language models, including hyperwrite.ai, @OthersideAI’s AI writing companion.
@OpenAI’s customers are generating billions of words each day with GPT-3.
@Microsoft is even integrating GPT-3 into its Power Apps platform.
Massive amounts of capital are being invested in this space.
@OpenAI just announced a $100M fund for startups using their API.
@AnthropicAI announced a $124M raise to fund research into large models.
This is just the start. Language is powerful on its own, but when you begin to combine language with other modalities, you get even more powerful and capable models.
Imagine a model that is trained on both text and video. This is coming, and soon.
Multi-modal models.
If you are interested in following along as these models progress, here are some accounts to follow:
Working with GPT-3 is just a game of figuring out how to structure text to get the results you want.
Here are some methods that work well.
Some of these methods can be used together. There’s an art to figuring out which methods are best for obtaining the results you want.
You can use zero-shot, one-shot, or few-shot methods, depending on the task. Your goal should typically be to zero-shot or one-shot, as latency and costs will be lower.
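The difference between these methods is just how the prompt text is assembled. A minimal sketch (the helper function and the sentiment task are illustrative, not part of any API):

```python
def build_prompt(instruction, examples=None, query=""):
    """Assemble a completion-style prompt: an instruction, optional
    worked examples (few-shot), then the new input to complete."""
    parts = [instruction.strip()]
    for inp, out in (examples or []):
        parts.append(f"Input: {inp}\nOutput: {out}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

# Zero-shot: rely on the instruction alone (lowest latency and cost,
# since the prompt contains no examples).
zero_shot = build_prompt(
    "Classify the sentiment as Positive or Negative.",
    query="I loved this movie!",
)

# Few-shot: prepend demonstrations when the task needs more guidance.
few_shot = build_prompt(
    "Classify the sentiment as Positive or Negative.",
    examples=[("The plot dragged on forever.", "Negative"),
              ("A delightful surprise!", "Positive")],
    query="I loved this movie!",
)
```

Every example you add lengthens the prompt, which is why zero-shot or one-shot is the cheaper, faster default when it works.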