RETRO models (DeepMind's Retrieval-Enhanced Transformers) are a giant capability unlock for LLM tech, and they're shockingly under the radar.
The first ones should come out this year. They might be even more significant than GPT-4.
You can't "teach" current LLMs the way you'd teach an employee. If they do something bad, there's no good way to tell them "don't do that."
You can include a reminder in every prompt, but that eats up precious context space.
You can fine-tune, but that takes hundreds of examples and a new training run.
That's where RETRO comes in.
RETRO pairs a language model with a gigantic external database of text. At inference time it retrieves the chunks most relevant to the current context (via nearest-neighbor lookup on embeddings) and conditions the model on them.
You can update that database without retraining your model.
deepmind.com/publications/i…
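To be precise, RETRO doesn't paste retrieved text into the prompt; it folds retrieved chunks into the model via cross-attention. But the retrieve-then-condition idea is the same, and here's a minimal sketch of it, with a toy hash-based embedding standing in for RETRO's frozen BERT encoder. Every name here (embed, facts, retrieve) is my own illustration, not from the paper:

```python
import numpy as np

# Toy embedding: hashed bag-of-words. A real system (RETRO uses frozen
# BERT embeddings) would use a learned sentence encoder instead.
DIM = 256

def embed(text: str) -> np.ndarray:
    vec = np.zeros(DIM)
    for token in text.lower().split():
        vec[hash(token) % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# The "fact set": editable at any time, no retraining required.
facts = [
    "Our refund window is 30 days.",
    "The Pro plan includes priority support.",
    "We do not offer phone support.",
]
fact_matrix = np.stack([embed(f) for f in facts])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k facts most similar to the query (cosine similarity)."""
    sims = fact_matrix @ embed(query)
    return [facts[i] for i in np.argsort(sims)[::-1][:k]]

query = "Can I get my money back?"
prompt = "Relevant notes:\n" + "\n".join(retrieve(query)) + f"\n\nUser: {query}"
print(prompt)
```

Editing the facts list is all it takes to "update" the system, which is the whole point: no gradient updates, no new checkpoint.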
That's a huge capability for any "agent" use case. You can train your agent the way you'd train an employee!
E.g. (sketched in code after this list):
- "When a customer brings up X, remember to mention Y"
- "Remember I think Z, when advocating on my behalf"
- "Don't ever say W"
This might be the most important lever for a lot of practical applications.
Once GPT-4-era models come out, LLMs are going to be damn good at answering whatever is in a prompt.
Applications will then be constrained by how complete they can make those prompts...