A list of predictions for 2023 for the field of LLMs🧵
There will be an open-source Chinchilla-style LLM released this year at the level of text-davinci-*. Maybe not from the labs we expect 🤔 This will obliterate ChatGPT usage and enable various types of fine-tuning / soft-prompting, as well as cost/speed improvements.
(as per above) ChatGPT will remain free until it dies out, replaced by a paid-for equivalent based on (a) better model(s). The balance of the Force will be restored at this stage. So it's not such a silly strategy to build a product for free on ChatGPT 🤔... still a dangerous game.
AI assistants will pop up in every existing product silo. The defensibility of existing AI-first companies will be battle-tested, big time.
AGI does not pop up yet (even if one of the big labs has an internal LLM AlphaZero moment in 2023). Investor craziness dies out, and short-context LLM assistants (as per #1 and #3 above) are heavily commoditized. Still, 1-2 massive companies start to emerge in the B2B space #product
As users, we stop "experiencing" the magic of scale as larger and larger models are released (after all it's a power law).
Context size becomes the clear limiting factor for many use cases. OpenAI releases a massive-context model in 2023 and creates a moat for 6 months, enough time for the open-source models to catch up and even prevail, likely with a memorizing-transformer-style approach.
"Memorizing transformer style" or similar approach become predominant as they are always up-to-date and infinite in size. They enable a new wave of products that can operate on entire companies data or entire codebases, or entire chat history.
By the end of 2023 at least one state actor starts a program to compete with OpenAI, Meta AI and Google/DeepMind.
Oh, and we finally get good papers on the "science of fine-tuning", i.e., scaling laws for fine-tuning as a function of pre-training / fine-tuning compute and pre-training / fine-tuning data size.
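For context, "scaling laws" here means fitting loss as a power law of compute or data, typically by least squares in log-log space. A minimal sketch with made-up numbers (not from any paper):

```python
import numpy as np

def fit_power_law(x, y):
    """Fit y ≈ a * x**b by least squares in log-log space,
    the functional form scaling-law papers typically report.
    (Illustrative sketch only.)"""
    b, log_a = np.polyfit(np.log(x), np.log(y), 1)
    return np.exp(log_a), b

# E.g., fine-tuning loss vs. fine-tuning dataset size (synthetic data):
sizes = np.array([1e3, 1e4, 1e5, 1e6])
loss = 3.0 * sizes ** -0.1
a, b = fit_power_law(sizes, loss)
```

A "science of fine-tuning" would pin down how the exponent `b` trades off between pre-training and fine-tuning data/compute.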
• • •
For the past couple of weeks I've been experimenting with a new way to interact with LLMs: a GPT-based assistant that has access to my browser tabs' content.
It lets you submit queries to a `text-davinci-003`-prompted assistant and, as you do so, search and select some of your tabs to inject their content into the assistant's context.
You can create a bio from a LinkedIn profile, reply to an email based on knowledge-base content, summarize a Slack thread, ...
Probably a lot more use cases. You can also interact with the assistant to fix / iterate on the generated content.
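Mechanically, "injecting tab content into the context" amounts to prepending the selected tabs' text to the prompt sent to the completion model. A hypothetical sketch (the field names, formatting, and truncation strategy are my assumptions, not the tool's actual code):

```python
def build_prompt(tabs, query, max_context_chars=6000):
    """Assemble a completion prompt from selected browser tabs
    plus the user's query. Naive character-level truncation stands
    in for real token budgeting."""
    sections = [f"### {t['title']}\n{t['content']}" for t in tabs]
    context = "\n\n".join(sections)[:max_context_chars]
    return (
        "You are an assistant with access to the user's browser tabs.\n\n"
        f"{context}\n\n"
        f"User: {query}\nAssistant:"
    )
```

The returned string would then be sent to the completions endpoint; the interesting product work is in which tabs to select and how to fit them in the context window.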
I started by crafting few-shot examples to teach the model to generate few-shot examples for a new tool "instruction". Here's the dataset: dust.tt/spolu/a/b39f8e…
Interestingly, `text-davinci-002` was not too happy with my approach, failing to follow the structure, but `code-davinci-002` just breezed through it.
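The bootstrapping trick above can be sketched as a meta few-shot prompt: show the model a few tools that already have worked examples, then leave the examples slot open for a new instruction. The field names and INPUT/OUTPUT format here are illustrative assumptions, not the actual dataset's schema:

```python
def meta_fewshot_prompt(solved_tools, new_instruction):
    """Few-shot prompt that teaches the model to *generate*
    few-shot examples for a new tool instruction."""
    blocks = []
    for tool in solved_tools:
        shots = "\n".join(f"INPUT: {i}\nOUTPUT: {o}" for i, o in tool["shots"])
        blocks.append(f"INSTRUCTION: {tool['instruction']}\nEXAMPLES:\n{shots}")
    # Leave the examples open-ended for the model to complete.
    blocks.append(f"INSTRUCTION: {new_instruction}\nEXAMPLES:")
    return "\n\n".join(blocks)
```

The model's completion after the final `EXAMPLES:` becomes the few-shot set for the new tool.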
"A game platform where you train agents and make them compete"
Tamagotchi meets Fantasy Football meets competitive gaming. I checked: it's possible to train a decent Pong agent (in the browser) in under 30 games, and it's fun to watch them compete.
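Training a simple Pong agent that quickly is plausible with tabular methods. A generic Q-learning update, as a sketch of the kind of lightweight training loop such a platform could run in-browser (not the platform's actual code):

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q[s][a] toward the
    reward plus the discounted value of the best next action."""
    best_next = max(Q[s_next].values(), default=0.0)
    Q[s][a] = Q[s][a] + alpha * (r + gamma * best_next - Q[s][a])

# Q maps state -> action -> value, defaulting to 0.
Q = defaultdict(lambda: defaultdict(float))
q_update(Q, s="ball_left", a="up", r=1.0, s_next="ball_center")
```

With a coarse discretization of ball/paddle positions, a few dozen games can be enough episodes for a playable policy.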
There are a lot of approaches in recent AI research (CNNs obviously, HER, VAEs, or more recently world models, ...) that definitely feel like "hypothetically something close to how our brains work".
So much so that it raises the question: instead of solely attempting to replicate our brain's functions, should we maybe try to interface with it?
Or maybe we already have an interface for that: our computers (screen, keyboard, camera, ...). An interface with which we already have an almost symbiotic, or at least extremely intuitive, relationship; hello vim / alfred / ...