Why does ChatGPT work so well? Is it “just scaling up GPT-3” under the hood? In this 🧵, let’s discuss the “Instruct” paradigm, its deep technical insights, and a big implication: “prompt engineering” as we know it will likely disappear soon:👇
The original GPT-3 was trained with a minimalist objective: predict the next word on a massive text corpus. Many abilities emerge as if by magic, such as reasoning, coding, and translation. You can even do “few-shot learning”: define new tasks by providing I/O examples in context. 1/
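Here’s a minimal sketch of what a few-shot prompt looks like (the translation task is illustrative, in the spirit of the GPT-3 paper; no weights change, the task lives entirely in the prompt):

```python
# A few-shot prompt: the task is defined purely by in-context I/O examples.
prompt = """Translate English to French.

sea otter => loutre de mer
peppermint => menthe poivrée
cheese =>"""
# Fed to GPT-3, this is likely to be completed with "fromage".
```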
It’s not at all obvious why simply predicting the next word can give us such abilities. One intuitive explanation is to imagine a detective story. Suppose the model needs to fill in the last blank: “the murderer is ___”, then it has to do deep reasoning to answer correctly. 2/
But this is not enough. In practice, we have to coax GPT-3 to autocomplete what we desire by carefully curating the examples, wording, and structure. This is exactly “prompt engineering”, where users have to practice the awkward and sometimes nonsensical vernacular of LLMs. 3/
Prompt engineering is a BUG🐞, not a feature! It’s caused by the fundamental *misalignment* between the next-word objective and the actual user intent in real applications. Example: you want GPT-3 to “Explain the moon landing to a 6yo”. It replies like a drunk parrot🦜: 4/
Prompt engineering is even worse in DALLE2 and Stable Diffusion. Just go to lexica.art and see how insane some prompts are. My favorite is the “parentheses trick” - adding (((…))) sometimes gives you better images 😅. It’s both hilarious and embarrassing. 5/
ChatGPT and its sibling model InstructGPT address this plague in an elegant way. The key observation is that alignment is very hard to capture from in-the-wild data. Humans must be in the loop to help tutor GPT, and GPT will ask better questions as it improves. 6/
There are 3 steps. The first is very straightforward: just collect a dataset of human-written answers to prompts that users submit, and finetune GPT with supervised learning. It’s the simplest step but also the most costly: writing long responses is slow and painful for humans. 7/
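A rough sketch of step 1, assuming a HuggingFace-style causal LM; `model`, `tokenizer`, and `sft_dataset` are illustrative placeholders, not the actual pipeline:

```python
import torch

# Step 1 sketch: supervised fine-tuning on human demonstrations.
# `sft_dataset` yields (prompt, human_written_answer) string pairs.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

for prompt, answer in sft_dataset:
    tokens = tokenizer(prompt + answer, return_tensors="pt")
    # Same next-token objective as pre-training, but now on curated
    # demonstrations instead of raw web text.
    loss = model(**tokens, labels=tokens["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```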
Step 2 is much more interesting. GPT is asked to *propose* a few different answers, and all a human annotator needs to do is *rank* the responses from most desirable to least. Using these labels, we can train a reward model that captures human *preferences*. 8/
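The InstructGPT recipe turns each ranking into pairwise comparisons and trains the reward model so the preferred answer scores higher. A minimal sketch, where `reward_model` is a hypothetical scorer mapping a prompt + answer to a scalar:

```python
import torch.nn.functional as F

# Step 2 sketch: pairwise preference loss for the reward model.
def preference_loss(reward_model, prompt, chosen, rejected):
    r_chosen = reward_model(prompt, chosen)      # scalar score
    r_rejected = reward_model(prompt, rejected)  # scalar score
    # -log sigmoid(r_chosen - r_rejected): push the preferred answer's
    # score above the rejected one's.
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```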
In reinforcement learning (RL), the reward function is typically hardcoded, such as the game score in Atari games. ChatGPT’s data-driven reward model is a powerful idea. Another example is our recent MineDojo work that learns reward from tons of Minecraft YouTube videos: 9/
Step 3: treat GPT as a policy and optimize it with RL against the learned reward. PPO is chosen as a simple and effective training algorithm. Now that GPT is better aligned, we can rinse and repeat steps 2-3 to improve it continuously. It’s like CI for LLMs! 10/
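In RLHF-style training, the PPO reward is typically the RM score minus a KL penalty that keeps the policy close to the supervised model, so it can’t drift into reward-hacking gibberish. A rough sketch (`beta` and all inputs are illustrative):

```python
# Step 3 sketch: per-response reward signal for PPO.
# policy_logprobs / sft_logprobs: log-probs of the sampled tokens under the
# current policy and the frozen supervised (Step 1) reference model.
def rlhf_reward(rm_score, policy_logprobs, sft_logprobs, beta=0.02):
    kl_penalty = (policy_logprobs - sft_logprobs).sum()
    return rm_score - beta * kl_penalty
# PPO then maximizes this scalar, treating GPT as the policy.
```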
This is the “Instruct” paradigm - a super effective way to do alignment, as evident in ChatGPT’s mindblowing demos. The RL part also reminds me of the famous P=NP (or ≠) problem: it tends to be much easier to verify a solution than actually solving the problem from scratch. 11/
Similarly, humans can quickly assess the quality of GPT’s output, but it’s much harder and cognitively taxing to write out a full solution. InstructGPT exploits this fact to lower the manual labeling cost significantly, making it practical to scale up the model CI pipeline. 12/
Another interesting connection: the Instruct training loop looks a lot like a GAN. ChatGPT is the generator and the reward model (RM) is the discriminator. ChatGPT tries to fool RM, while RM learns (with human help) to spot answers that feel off. The game converges when RM can no longer tell the difference. 13/
Model alignment with user intent is also making its way to image generation! There are some preliminary works, such as arxiv.org/abs/2211.09800. Given the explosive AI progress, how long will it take to have an Instruct- or Chat-DALLE that feels like talking to a real artist? 14/
So folks, enjoy prompt engineering while it lasts! It’s an unfortunate historical artifact - a bit like alchemy🧪, neither art nor science. Soon it will just be “prompt writing” - my grandma can get it right on her first try. No more magic incantations to coerce the model. 15/
Of course, ChatGPT isn’t yet good enough to eliminate prompt engineering entirely, but the trend is an unstoppable force. Meanwhile, the model has other serious issues: hallucination & habitual BS. I covered this in another thread: 16/
Further reading: the reward model has scaling laws too: arxiv.org/abs/2210.10760! Also, the RM is only an imperfect proxy (unlike an Atari score), so it’s a bad idea to over-optimize against it. This paper is from @johnschulman2, inventor of PPO. Super interesting work that flew under the radar. 18/
There are also other artifacts caused by the misalignment problem, such as prompt hacking or “injection”. I actually like this one because it allows us to bypass OpenAI’s prompt prefix and fully unleash the model 😆. See @goodside’s cool findings: 19/
ChatGPT is the closest model we have that looks like AGI, but still suffers from hallucination and tends to write plausible-looking bs with uncanny confidence. It could be misleading and even dangerous. Can we fix it? There is a simple & intuitive remedy:🧵
Let’s first identify the root cause. Hallucinating facts is an unfortunate side effect of trying to *memorize* everything in the 100B+ model parameters of ChatGPT. Even humans have unreliable and made-up memories, so of course ChatGPT is no exception. 2/
We are no strangers to false facts ourselves, which surged during the pandemic and elections. Hiring “fact checkers” was the solution we came up with to reduce human-generated bs. It’s actually similar to the “retrieval” mechanism in LLM terminology. 3/
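A toy sketch of the retrieval idea: fetch supporting documents first, then let the model answer grounded in them rather than in its parametric memory. `search_index` and `llm` are hypothetical placeholders, not a specific library:

```python
# Retrieval-augmented answering, schematically.
def answer_with_retrieval(question, search_index, llm, k=3):
    docs = search_index.top_k(question, k)  # e.g. keyword or embedding search
    context = "\n\n".join(docs)
    prompt = f"Answer using only the sources below.\n\n{context}\n\nQ: {question}\nA:"
    return llm.generate(prompt)
```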
Excited to go to NeurIPS conference tomorrow! It's an annual gala for AI. Many revolutionary ideas debuted here, like AlexNet, Transformer, & GPT-3.
I read *all 15* Outstanding Papers and can’t wait to share my thoughts with you all. Here's your front row seat to the festival:🧵
For each paper, I’ll give a TLDR and a note on why I think it’s significant. I may also link interesting blogs and websites that dive into greater depth. Original authors are welcome to chime in and expand the discussion or correct any mistakes! Tweet index is by paper.
Training Compute-Optimal Large Language Models. Hoffmann et al, @DeepMind. TLDR: introduces a new 70B LM called "Chinchilla”🐭 that outperforms much bigger LMs (GPT-3, Gopher). To be compute-optimal, model size and training data must be scaled up in equal proportion. 1.1/
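Back-of-the-envelope version of the rule of thumb people took from the paper: roughly ~20 training tokens per parameter, with training compute ≈ 6·N·D FLOPs. Both are approximations for illustration, not exact paper numbers:

```python
# Rough compute-optimal scaling estimate (rule-of-thumb approximation).
def chinchilla_estimate(n_params):
    tokens = 20 * n_params          # ~20 tokens per parameter
    flops = 6 * n_params * tokens   # training compute ≈ 6 * N * D
    return tokens, flops

tokens, flops = chinchilla_estimate(70e9)   # a Chinchilla-scale 70B model
print(f"{tokens:.1e} tokens, {flops:.1e} FLOPs")  # ~1.4e+12 tokens, ~5.9e+23 FLOPs
```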
GPT-3 is powerful but blind. The future of Foundation Models will be embodied agents that proactively take actions, endlessly explore the world, and continuously self-improve. What does it take? In our NeurIPS Outstanding Paper “MineDojo”, we provide a blueprint for this future:🧵
We argue that there are 3 main ingredients for generalist agents to emerge. First, an open-ended environment that allows an unlimited variety of tasks and goals. Earth is one example, as it is rich enough to forge an ever-expanding tree of life forms and behaviors. What else? 2/
Second, a large-scale knowledge base that teaches an AI not only *how* to do things, but also *what* are the useful things to do. GPT-3 learns from web text alone, but can we give our agent much richer data, such as video walkthroughs, multimedia tutorials, and free-form wiki? 3/
Today a 120B model called “Galactica” is open-sourced by @paperswithcode. It’s capable of writing math notation, citations, code, chemical formulas, DNA sequences, etc. Here’s why I think Galactica is a huge milestone in open foundation models, scientific automation, and responsible AI: 🧵
Large language models have personalities. They are shaped not by the architecture, but by the training data. Models like GPT-3 and OPT are trained on text scraped from the internet at large, which unfortunately contains lots of irrelevant, misinformed, or toxic content. 2/🧵
In contrast, scientific texts like academic papers are mostly immune from these data plagues. They contain analytical text with a neutral tone, knowledge backed by evidence, and are written by people who wish to inform rather than inflame. A dataset born in the ivory tower. 3/🧵
We trained a transformer called VIMA that ingests *multimodal* prompts and outputs controls for a robot arm. A single agent is able to solve visual goal reaching, one-shot imitation from video, novel concept grounding, visual constraint satisfaction, etc. Strong scaling with model capacity and data!🧵
We envision that a generalist robot agent should have an intuitive and expressive interface for task specification, but text alone is not enough. We introduce a novel multimodal prompting framework that converts a wide spectrum of robotic tasks into one sequence modeling problem.
Our VIMA model (reads “v-eye-ma”) consists of a pre-trained T5 to encode multimodal prompts, and a transformer decoder to predict robot arm commands autoregressively. The decoder has alternating self- and cross-attention layers conditioned on the prompt.
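A schematic sketch of the decoder structure described above (not the actual VIMA code; dimensions and layer layout are illustrative). Self-attention runs over the agent’s own trajectory tokens, while cross-attention conditions each step on the T5-encoded multimodal prompt; in the real model these layer types alternate across the stack:

```python
import torch.nn as nn

# Illustrative decoder block: causal self-attention over trajectory tokens,
# plus cross-attention into the encoded multimodal prompt.
class PromptConditionedBlock(nn.Module):
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))

    def forward(self, x, prompt_tokens, causal_mask=None):
        # Causal self-attention: each action token sees only the past.
        x = x + self.self_attn(x, x, x, attn_mask=causal_mask)[0]
        # Cross-attention: condition on the T5-encoded multimodal prompt.
        x = x + self.cross_attn(x, prompt_tokens, prompt_tokens)[0]
        return x + self.ffn(x)
```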