Typical RNNs are like a for loop that can't be vectorized: each step depends on the previous one, which hurts parallelization during training.
RWKV cleverly resolves this with a layer that works like an RNN cell when it's run step by step, but can be computed all at once like Transformer attention.
And unlike many other alternatives to Transformers, it matches Transformer language modeling performance up to the largest scales tested: 14B params, 300B tokens.
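To make that dual formulation concrete, here's a toy linear recurrence with decay -- not RWKV's actual WKV operator, just a minimal sketch of the same trick -- computed both ways: step by step like an RNN cell, and all at once across the sequence. The decay value and function names are illustrative.

```python
import numpy as np

# Toy linear recurrence with decay: h_t = a * h_{t-1} + x_t.

# RNN mode: one step at a time (how you'd run it at inference).
def recurrent(x, a):
    h, out = 0.0, []
    for x_t in x:
        h = a * h + x_t
        out.append(h)
    return np.array(out)

# Parallel mode: unrolling gives h_t = sum_{s<=t} a^(t-s) * x_s,
# which is just a rescaled cumulative sum -- the whole sequence
# can be computed at once, so training parallelizes.
def parallel(x, a):
    t = np.arange(len(x))
    return a**t * np.cumsum(x / a**t)

x = np.random.randn(16)
assert np.allclose(recurrent(x, 0.9), parallel(x, 0.9))
```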
In the "agent" pattern, LLMs are given memory, access to tools, and goals.
@hwchase17, founder of the most popular LLM framework @LangChainAI, shares exciting recent research results and the gnarly challenges facing agents in production.
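In sketch form, the agent pattern is just a loop: the LLM looks at its memory and goal, picks an action, a tool executes it, and the observation goes back into memory. Everything below -- the `call_llm` stub, the `search` tool, the `FINAL:`/`OBSERVATION:` convention -- is a hypothetical stand-in, not any framework's actual API.

```python
# Minimal agent loop: LLM picks an action, a tool runs it,
# and the result is appended to the agent's memory.
tools = {"search": lambda q: f"(top search result for {q!r})"}

def call_llm(memory: list[str]) -> str:
    # Stand-in for a real model call: a real agent would prompt an
    # LLM with the memory and parse its reply into a tool call or
    # a final answer.
    if any("OBSERVATION" in m for m in memory):
        return "FINAL: done"
    return "search: weather in SF"

memory = ["GOAL: find the weather in SF"]
while True:
    action = call_llm(memory)
    if action.startswith("FINAL:"):
        print(action)
        break
    tool, _, arg = action.partition(": ")
    memory.append(f"OBSERVATION: {tools[tool](arg)}")
```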
tl;dr: LLMs unlock new user interaction design patterns based on language user interfaces (LUIs). But the same principles of user-centered design still apply!
Since the inception of computing, programmers & designers have dreamed of interfacing with computers via language as naturally as we interface with each other.
Proofs of concept for such language user interfaces date back to the 1960s and have resurfaced repeatedly since.
LLMs make LUIs possible.
A paradigm shift in user interfaces makes for a great time to build ambitious applications!
But because language models (and ML in general) come from the math-ier side of engineering, lots of folks are less familiar with the principles that guide user interaction design.
tl;dr Effective prompting requires some intuition about language models, but there's an emerging playbook of general techniques.
First off: What is a "prompt"? What is "prompt engineering"?
The prompt is the text that goes into your language model.
Prompt engineering is the design of that text: how is it formatted, what information is in it, and what "magic words" are included.
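For a concrete (hypothetical) example, here's a few-shot prompt assembled in Python. The label format, the choice of examples, and the instruction line are each prompt engineering decisions.

```python
# A few-shot sentiment prompt: the format (field names, separators),
# the information (examples), and the instruction are all design choices.
examples = [
    ("I loved this movie!", "positive"),
    ("Total waste of two hours.", "negative"),
]

prompt = "Label each review as positive or negative.\n\n"
for review, label in examples:
    prompt += f"Review: {review}\nLabel: {label}\n\n"
prompt += "Review: The plot dragged, but the acting was great.\nLabel:"

print(prompt)  # send this string to your language model of choice
```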
So, what are some high-level intuitions for prompting?
First of all, the idea that LMs are "just statistical models of text," while literally true, leads to bad intuition that underestimates what they can do.
Over the last two weeks, we tweeted out twelve papers we love in the world of language modeling, from agent simulation and browser automation to BERTology and artificial cognitive science.
Here they are, collected in a single 🧵 for your convenience.
1/12 - Reynolds and McDonell, 2021. "Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm"
The OG Prompt Engineering paper -- formatting tricks, agent sim, and chain-of-thought, before they were cool
Whatever our thoughts on chat _bots_, we enjoyed our chat with @hwchase17 of @LangChainAI on the most recent FSDL Tool Talk!
@charles_irl started us off with an overview of why we need LLM frameworks; then, after a demo of using LangChain to do Q&A over the LangChain docs, we did some live Q&A -- humans only.
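For flavor, here's roughly what docs Q&A looked like in the 0.0.x-era LangChain API -- a minimal sketch, not the talk's actual demo code. Module paths have since been reorganized, and the URL and chunk sizes below are illustrative.

```python
# Q&A over docs with classic (0.0.x-era) LangChain.
# Assumes OPENAI_API_KEY is set and faiss-cpu is installed.
from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA

# Load a docs page, split it into chunks, and index the chunks.
docs = WebBaseLoader("https://python.langchain.com/docs/").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
index = FAISS.from_documents(splitter.split_documents(docs), OpenAIEmbeddings())

# Retrieve relevant chunks and have the LLM answer over them.
qa = RetrievalQA.from_chain_type(llm=OpenAI(), retriever=index.as_retriever())
print(qa.run("What is a chain in LangChain?"))
```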