will depue
Sep 12, 4 tweets, 3 min read
Some reflections on what today's reasoning launch really means:

New Paradigm
I really hope people understand that this is a new paradigm: don't expect the same pace, schedule, or dynamics as the pre-training era.
I believe the rate of improvement on evals with our reasoning models has been the fastest in OpenAI history.
It's going to be a wild year.

Generalization across Domains
o1 isn't just a strong math, coding, problem solving, etc. model but also the best model I've ever used for answering nuanced questions, teaching me new things, giving medical advice, or solving esoteric problems.
This shouldn't be taken for granted!

Safety by Reasoning
The fact that our reasoning models also improve on safety behavior and safety reasoning is very much non-trivial.
For years (a decade?) the boogeyman of the AI world was reinforcement learning agents which were incredibly adept at game playing but completely incapable of reasoning or understanding human values!
This is a strong point of evidence against that fear.

Scaling inference-time compute can compete with scaling training compute!
The fact that o1-mini is better than o1 on some evals is very very remarkable. The implications of this I'll leave as an exercise for the reader.

Multimodal Reasoning
It's kind of crazy that reasoning improves on multimodal evals as well! See MMMU and MathVista: these aren't small improvements.
To be clear I'm not one of the contributors to the o1 project: this has been the absolutely incredible work of the reasoning & related teams.
The rate of progress has just been faster than anything I've ever seen: it's absurd how fast the team has climbed the scaling OOMs just after discovering this paradigm.
Less seriously now:
I do want to also give a word of caution to the schizos, the hypemen, the fans and the haters:
This is a new paradigm. As with all nascent projects, there will be holes, bugs, and issues to fix. Don't expect everything to be perfect instantly!
But you should take seriously the rate of progress, the fact that we're solving problems that seemed miles away in the pretraining scaling laws, and the fact that we now have visibility into solving many of the things which people have said LLMs could never do.
There are lots of quirks and benefits of the pretraining paradigm that might not exist in the reasoning paradigm, and vice versa. As a random example, I do believe there will be more examples of inverse scaling here than in the pre-training world (in which there were surprisingly few).
Onwards!
Something to remember: this is not gpt-o1, it is o1, a new thing.

More from @willdepue

May 13
i think people are misunderstanding gpt-4o. it isn't a text model with a voice or image attachment. it's a natively multimodal token in, multimodal token out model.
you want it to talk fast? just prompt it to. need to translate into whale noises? just use few shot examples.
every trick in the book that you've been using for text also works for audio in, audio out, image perception, video perception, and image generation.
for example, you can do character consistent image generation just by conditioning on previous images. (see the blog post for more)

Starting from this image prompt:

This is Sally, a mail delivery person: Sally is standing facing the camera with a smile on her face.

Now Sally is being chased by a dog. Sally is running down the sidewalk and as a golden retriever is chasing her.

Uh oh, Sally has tripped!
Sally has tripped over a branch that was blocking the sidewalk, and she is trying to stand up. The dog is still chasing her in the background.
Mar 25
announcing... starlinkmap dot org
real-time map of every starlink satellite. tracks upcoming launches, other constellations, orbital updates, etc.
finally launching this after a while! more details below.
starlink is, imo, one of the most exciting technologies of our generation.
today, only 65% of the world has access to the internet at all (and far fewer have high-speed internet).
with direct-to-cell coming, soon every device, anywhere on Earth, will be connected together.
there's lots of stats on the website. here are some of the best:
- over 5,600 starlinks orbiting right now. just under 6,000 ever launched.
- as of march: ~2.6 million starlink customers worldwide
- in the last year, there's been a starlink launch on average every 5.2 days!
Sep 23, 2023
I ask DALLE-3 to generate a Pepe but each time I tell it to make it "more rare."
"make it more rare"
"even rarer"
Sep 20, 2023
DALLE-3 is the best product I've seen since GPT-4, super easy to just get sucked in for hours generating images. No need for prompting since GPT-4 does it for you.
Let me know if you have requests for prompts below. Here are some examples of what it can do:


It's shockingly good at styles that require consistent patterning like Pixel Art, mosaics, or dot matrices.

It's quite good at people... and hands (at last).


Jun 26, 2023
FIGMA-OS: The first Turing-complete Figma file.
SPECS: 8-bit architecture, 512 bits of RAM, 16 bytes of Program Memory, MISC instruction set of 16 OPCODES, 10 Hz clock speed, 4 fast access registers, binary-tree RAM/ROM memory.
MOTIVATIONS: For the meme.
HOW: Explained below.
FIGMA-OS has every feature that any modern, enterprising technologist could possibly need:
► A stunning and detailed user manual.
► Useful pre-installed programs like: Fibonacci Numbers.
► An award-winning graphical user interface.



FIGMA-OS has been generously open-sourced to serve all your computing needs, live on the Figma Community today.
▼ Try our demo ▼
▼ Duplicate FIGMA-OS and see it for yourself ▼
figma.com/community/file…
Jun 22, 2023
do you have any hobbies?
yeah making computers out of things that shouldn't be computers. watch me be the first to bring turing completeness to figma
(edit going to build this tonight so scroll for my live tweeting of a computer) https://t.co/a07l9Ib0Qn twitter.com/i/web/status/1…
ok simple clock working seems promising. add/sub/mult/div already implemented for numbers by figma, seems like there might be more ops for other types which is great
ok time to test limits and max out these variables. numbers represented as signed 32 bit ints, and will overflow to min int. interesting
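The overflow behavior described above can be sketched in Python. This is only an illustration, assuming ordinary two's-complement wraparound. Python ints do not overflow on their own, so the hypothetical helper `wrap_int32` simulates it. Nothing here is taken from Figma itself beyond what the tweet reports.

```python
# Bounds of a signed 32-bit integer.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Wrap an arbitrary Python int into the signed 32-bit range.

    Hypothetical helper: assumes two's-complement wraparound, which is
    an assumption about how the variables in the tweet behave.
    """
    return (value - INT32_MIN) % 2**32 + INT32_MIN


# Going one past the maximum lands on the minimum, as the tweet observes.
print(wrap_int32(INT32_MAX + 1))  # -2147483648
```

Values already inside the range pass through unchanged, so only results that step past either bound are affected.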