OK everyone's asking me for my take on the OpenAI stuff, so here it is. I have a strong feeling about what's going on, but no internal info so this is just me talking.
The first point to make is that the Dev Day was (IMO) an absolute embarrassment.
I could barely watch the keynote. It was just another bland corp-speak bunch of product updates.
For those researchers I know who were involved from the beginning, this must have felt nausea-inducing.
The plan was AGI, lifting society to a new level. We got Laundry Buddy.
When OAI was founded I felt like it was gonna be a rough ride. It was created by a bunch of brilliant researchers that I knew and respected, plus some huge names from outside the field: Elon, GDB, and sama, none of whom I'd ever come across at any AI/ML conference or meetup.
Everything I'd heard about those 3 was that they were brilliant operators and that they did amazing work. But it felt likely to be a huge culture shock on all sides.
But the company absolutely blossomed nonetheless.
With the release of GPT-3, however, we had the first culture clash that was beyond saving: those who really believed in the safety mission were horrified that OAI was releasing a powerful LLM that they weren't 100% sure was safe. The company split, and Anthropic was born.
Now OAI accelerated in its new direction. It wasn't open any more, and it decided to pursue profits to fund its non-profit goals.
Nonetheless, the company remained controlled by the non-profit, and therefore by its board.
Suddenly sama, the CEO, was everywhere. Giving keynotes, talking to world leaders, and raising billions of dollars. He's widely regarded as one of the most ambitious and effective operators in the world.
I wondered how his ambition could gel with the legally binding mission.
My guess is that watching the keynote would have made the mismatch between OpenAI's mission and the reality of its current focus impossible to ignore. I'm sure I wasn't the only one that cringed during it.
I think the mismatch between mission and reality was impossible to fix.
Overall, I expect that the OAI board's move will turn out to be a critical enabler of OAI's ability to deliver on its mission.
In the future, ambitious people looking for power and profits will *not* be drawn to the company, and instead it'll hire and retain true believers.
I'm gonna take back my "ngmi" from the day before the sama move.
If you're like me and find it easier to read *code* than *math*, and you have access to @OpenAI GPT 4V (or use @bing or @google Bard), try pasting an image of an equation you wanna understand in there.
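If you'd rather script it than use the chat UI, here's a rough sketch of the same trick via the API. This assumes the openai v1.x Python SDK and access to the vision-preview model; "equation.png" is just an example filename.

```python
# Rough sketch: ask GPT-4V to explain an equation from a screenshot.
# Assumes the openai v1.x Python SDK; "equation.png" is an example file.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Encode a local screenshot of the equation as a data URL
with open("equation.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Explain this equation step by step, then show it as Python code."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```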
I just uploaded a 90-minute tutorial, which is designed to be the one place I point coders at when they ask "hey, tell me everything I need to know about LLMs!"
It starts at the basics: the 3-step pre-training / fine-tuning / classifier ULMFiT approach used in all modern LLMs.
It goes all the way through to fine-tuning your own LLM that converts questions about data into SQL statements to answer the question, using @PyTorch, @huggingface Transformers, and @MetaAI Llama 2.
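To give a flavour of that final project, here's a minimal sketch of the fine-tuning setup, not the exact code from the video. The dataset name and prompt format are illustrative assumptions; any dataset of question/SQL pairs would work the same way.

```python
# Minimal sketch of the question -> SQL fine-tune (illustrative).
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

model_id = "meta-llama/Llama-2-7b-hf"  # needs an accepted licence on the Hub
tok = AutoTokenizer.from_pretrained(model_id)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

ds = load_dataset("knowrohit07/know_sql")  # assumed dataset of question/SQL pairs

def fmt(ex):
    # Concatenate question and answer into one causal-LM training string
    text = f"Question: {ex['question']}\nSQL: {ex['answer']}{tok.eos_token}"
    return tok(text, truncation=True, max_length=512)

train = ds["train"].map(fmt, remove_columns=ds["train"].column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments("sql-llama", per_device_train_batch_size=2,
                           num_train_epochs=1, bf16=True, logging_steps=10),
    train_dataset=train,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```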
But before we build our own stuff, I show how to take advantage of @OpenAI's ChatGPT GPT 4 and Advanced Data Analysis, including how I created this useful chart of API prices automatically from the text of OpenAI's web page.
It looks like @johnowhitaker & I may have found something crazy: LLMs can nearly perfectly memorise from just 1-2 examples!
We've written up a post explaining what we've seen, and why we think rapid memorization fits the pattern. Summary 🧵 follows. fast.ai/posts/2023-09-…
Johno & I are teaming up on the @Kaggle LLM Science Exam competition, which “challenges participants to answer difficult science-based questions written by a Large Language Model”.
After 3 epochs of fine-tuning an LLM for this problem, we saw this most unusual training loss curve.
We’ve seen similar loss curves before, and they’ve always been due to a bug. For instance, it’s easy to accidentally have the model continue to learn on the validation set.
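To make that concrete, here's an illustrative sketch (not our actual code) of how that bug usually looks in a PyTorch train/eval loop: one missing guard and your "validation" loss is measured on data the model just trained on.

```python
# Illustrative sketch of the classic version of this bug: if the optimizer
# step isn't guarded by `train`, "validation" batches update the model too,
# and it quietly memorises the validation set.
import torch
import torch.nn.functional as F

def run_epoch(model, loader, optimizer, train: bool):
    model.train(train)
    total = 0.0
    for xb, yb in loader:
        with torch.set_grad_enabled(train):
            loss = F.cross_entropy(model(xb), yb)
        if train:  # without this guard, every "validation" pass is a training pass
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        total += loss.item() * len(xb)
    return total / len(loader.dataset)
```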
First you need conda installed (e.g. via anaconda, miniconda, or miniforge). If you don't have it already, just run this script: github.com/fastai/fastset…
Now find out what CUDA version PyTorch expects by going to their website and seeing what the latest "compute platform" version is. At time of writing, it's 12.1 pytorch.org/get-started/lo…
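Once it's installed, here's a quick sanity check that PyTorch sees your GPU and was built against the CUDA version you picked. The conda command in the comment is what pytorch.org shows at time of writing; double-check the current one for your setup.

```python
# Quick sanity check after installing, e.g. with the command from pytorch.org:
#   conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
import torch
print(torch.__version__)          # installed PyTorch version
print(torch.version.cuda)         # CUDA version it was built against (expect "12.1")
print(torch.cuda.is_available())  # True if the GPU and driver are visible
```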