ChatGPT Plugins lets you interact with your apps (Search, Compile, Calculator, WolframAlpha, Calendar, Zapier) via text, but how does it work?
I'm going to explain 3 techniques — ReAct, Toolformer, LaMDA — and what GPT-4 does.
🧵
1/10
ReAct
Uses some clever few-shot prompt engineering to guide the LLM into dissecting your prompt / question into 3 parts:
—Thought: Introspect on what I need to do.
—Action: Communicate with a tool
—Observation: Reason about the response
It repeats this loop until it's finished, as in the sketch below.
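Here's a minimal sketch of that loop in Python; the llm callable and the Calculator tool are placeholders, not any specific library's API:

```python
import re

def calculator(expression: str) -> str:
    return str(eval(expression))  # placeholder tool; eval is demo-only

TOOLS = {"Calculator": calculator}

PROMPT = """Answer the question by repeating this loop:
Thought: reason about what to do next
Action: ToolName[input]
Observation: (filled in by the harness)
End with: Final Answer: <answer>

Question: {question}
"""

def react(llm, question, max_steps=5):
    transcript = PROMPT.format(question=question)
    for _ in range(max_steps):
        # Stop before the model writes its own Observation.
        step = llm(transcript, stop=["Observation:"])
        transcript += step
        if "Final Answer:" in step:
            return step.split("Final Answer:")[-1].strip()
        action = re.search(r"Action: (\w+)\[(.*)\]", step)
        if action:
            tool, arg = action.groups()
            transcript += f"Observation: {TOOLS[tool](arg)}\n"
    return transcript  # step budget exhausted
```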
2/10
ReAct
Facts
—Available today in LangChain! (snippet after this list)
—Requires more powerful models (100B+)
—Finetuning > prompting, at least for ~50B param models
—API must be text-based (Zapier NLA, Calculator, Search)
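For reference, this is roughly what the LangChain version looked like with its early-2023 agent API (the API has changed since, so treat this as a period sketch):

```python
from langchain.agents import initialize_agent, load_tools
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
# serpapi = Search, llm-math = Calculator
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm,
                         agent="zero-shot-react-description",
                         verbose=True)
agent.run("What is the square root of the year GPT-3 was released?")
```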
3/10
Toolformer
Fine-tunes an LLM on a dataset that annotates text with API calls. The dataset is generated by an LLM itself, discarding sampled calls whose results don't help predict the following tokens.
Your question is fed into the fine-tuned LLM, and the API calls it annotates inline get executed.
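A toy sketch of that execution step; the bracket format loosely follows the paper, and the Calculator tool is a placeholder:

```python
import re

def calculator(expr: str) -> str:
    return str(round(eval(expr), 2))  # demo-only

APIS = {"Calculator": calculator}

def execute_annotations(generation: str) -> str:
    # Rewrite each "[Tool(args)]" the model emitted as
    # "[Tool(args) -> result]" so the result lands in-context.
    def run(m):
        name, arg = m.group(1), m.group(2)
        return f"[{name}({arg}) -> {APIS[name](arg)}]"
    return re.sub(r"\[(\w+)\((.*?)\)\]", run, generation)

print(execute_annotations(
    "Out of 1400 participants, 400 [Calculator(400 / 1400)] passed."
))
```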
4/10
Toolformer
Facts
—1k-25k examples per API
—Sensitive to wording
—Can work on weaker models (GPT-J 6B)
—API must be text-based
5/10
LaMDA
A separate "research" LLM is fine-tuned to generate an API call from the "base" LLM output, iteratively until the research model thinks the answer is ready.
6/10
LaMDA
Facts
—Seems quite similar to Toolformer
—Launched in Bard! (that's why it's good at math and factual accuracy)
—API must be text-based
—Model size requirements not clear
7/10
GPT-4's Plugin approach
—Likely prompt-based
—Can work on APIs that aren't just text->text
—Relies on a) the OpenAPI spec's natural-language description of params and b) enabling only specific APIs at a time (sketch after this list)
—BingChat is a reduced version of plugins with only 1 tool: Bing
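A rough sketch of how a plugin's OpenAPI descriptions could be flattened into prompt text (hypothetical spec and helper, not OpenAI's actual pipeline):

```python
spec = {  # hypothetical OpenAPI fragment
    "paths": {
        "/search": {
            "get": {
                "description": "Search the product catalog",
                "parameters": [
                    {"name": "q", "description": "Free-text query"},
                    {"name": "max_price", "description": "Price cap in USD"},
                ],
            }
        }
    }
}

def spec_to_prompt(spec: dict) -> str:
    # Flatten each operation's natural-language descriptions into
    # a tool listing the model can condition on.
    lines = []
    for path, ops in spec["paths"].items():
        for method, op in ops.items():
            params = ", ".join(f"{p['name']}: {p['description']}"
                               for p in op.get("parameters", []))
            lines.append(f"{method.upper()} {path} - {op['description']} ({params})")
    return "\n".join(lines)

print(spec_to_prompt(spec))
```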
8/10
Speculating a little:
—Can you train a small model per-plugin once to convert an API spec into a text -> text interface for increased accuracy?
—If multiple plugins can resolve a query, how do you choose?
—With N plugins turned on, how does performance change as N grows?
9/10
ChatGPT w/ Plugins is becoming everything all the voice assistant products hoped to be. I wonder if in the future it can expand beyond APIs to perform actions on apps purely based on pixels!
"Make up an excuse to cancel all the people I have a meetings with today!"
It's going to be fascinating to see how the unit economics of Ads in language models will unfold and affect search advertising.
1/3
Given
—the click-through rates / conversions on these ads aren't comparable to search
—LLM-powered chat is eating into search as an information retrieval medium
it could radically reduce the size of the $100B+ search advertising industry, unless…
2/3
—As early demos of ChatGPT Plugins show, you can unify any internet purchase flow into text and greatly reduce the barrier to buy
3/3
I asked a text AI to write prompts for a drawing AI, and here are the BEST results!
GPT4 x MidJourneyV5
Pixar Narendra Modi —
1/8
Bertrand Russell, erudite, analytical, illuminated, by the golden hues of a radiant intellect, within a serene temple of wisdom, surrounded by towering stacks of ancient tomes, amidst a symphony of fluttering pages, under the watchful eyes of the great philosophers
2/8
Ravi Shankar, transcendent, melodic, enigmatic, playing sitar atop a crystalline mountain peak, orchestrating a celestial symphony, surrounded by clouds swirling in rhythm
EB-2 final action date for Indians has retrogressed to Jan 1 2011 in the latest Visa Bulletin: HUGE setback!
If you
—came to the US in '04, finished an MS in '06
—got an H-1B in '08
—applied for EB-2 in '10
Congrats! You can get your green card after being here for ~20 yrs, at age 41.
1/4
Framed differently, a high-skilled Indian immigrant who arrived in the early '00s will likely naturalize in the US at age 50 if they followed the near-perfect legal path.
If you came to the US later than that, it could take up to 150 yrs: you'll never get it.
2/4
Imagine paying over $1M in taxes over 20+ yrs and contributing tremendous amounts of economic value entrepreneurially and technologically, for nothing in return!
The US stays ahead of China predominantly because of its technical innovation, built on the back of immigrants!
3/4
Meritocratic critics of affirmative action, legacy, and SAT/ACT tests at US undergrad colleges are missing one key detail.
TRULY meritocratic admissions wouldn't just increase # of Asians, it'd obliterate the 10-15% quota for internationals.
They'd be ~30% of the class!
1/6
MIT is one of the few unis that report intl admission stats.
For US citizens, there were 24,165 applicants and 1,201 admits—a ~5% acceptance rate.
For internationals, there were 9,602 applicants and 136 admits—a ~1.4% acceptance rate.
It's 3.5x harder!
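Quick arithmetic check on those rates:

```python
us_rate = 1201 / 24165                # ~0.050 -> ~5%
intl_rate = 136 / 9602                # ~0.014 -> ~1.4%
print(round(us_rate / intl_rate, 1))  # 3.5x harder
```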
2/6
Granted,
- MIT disproportionately attracts intl students.
- Some argue that the avg intl student quality is worse, so the lower rate is justified, but anecdotal evidence suggests the contrary. At MIT, it's known that intl students are better on avg, since they're mostly Olympiad medalists.
3/6
Today, Google launched Bard, its own large language model chatbot, which is based on its LaMDA model!
LaMDA [Feb 2022] is a 137B param model trained on 2.81T tokens on 1024 v3 TPUs for 57.7 days. At $3.22/hr (v4 price), that costs ~$4.5M.
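Back-of-the-envelope on that training cost:

```python
tpu_hours = 1024 * 57.7 * 24   # chips x days x 24 h/day
cost = tpu_hours * 3.22        # $ per chip-hour (v4 list price)
print(f"${cost:,.0f}")         # $4,566,073 -> ~$4.5M
```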
🧵 Thread
1/5
Safety
LaMDA's approach to fine-tuning is different from GPT's RLHF. It samples 16 candidate responses and picks the top-ranked one.
It trains its own model to predict scores for candidates, based on labeled data for these attributes (sketch after the list):
—Sensibleness
—Specificity
—Interestingness
—Safety
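A minimal sketch of that sample-and-rank step; llm and scorer are placeholders:

```python
def best_response(llm, scorer, context, n=16):
    # Sample n candidates, score each with the fine-tuned
    # discriminator, keep the top-ranked one.
    candidates = [llm(context) for _ in range(n)]
    # scorer(context, c) might combine predicted sensibleness,
    # specificity, interestingness, and safety into one number.
    return max(candidates, key=lambda c: scorer(context, c))
```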
2/5
Accuracy
Uses both retrieval-augmented generation and Toolformer-style techniques. It maintains a toolset (Search, Calculator...) and performs a "Research" task that predicts a tool and a query given the draft response.
It loops over this response-and-research phase up to 4 times, as in the sketch below.
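A sketch of that loop; the model and tool interfaces are placeholders:

```python
def respond_with_research(base_llm, research_llm, tools, context, max_loops=4):
    draft = base_llm(context)
    for _ in range(max_loops):
        # The research model emits e.g. "Search: <query>", or
        # "User" when the draft is ready to show the user.
        decision = research_llm(context + draft)
        if decision.strip() == "User":
            break
        tool, query = decision.split(": ", 1)
        evidence = tools[tool](query)
        draft = base_llm(f"{context}{draft}\n{tool} result: {evidence}\nRevised:")
    return draft
```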
3/5
Huge hurricane in San Francisco in the last hour!!
- A huge truck overturned on the Bay Bridge
- Broken glass falling out of the Salesforce Tower
- Fallen trees obstructing traffic
- Power outages across the city
- Huge waves overflowing Embarcadero