ceo at @scale_ai. rational in the fullness of time
Dec 12, 2024 • 8 tweets • 2 min read
Since ChatGPT dropped in 2022, AI progress has been dramatic.
But it's also been predictable—new models, bigger chip clusters, more chatbots.
Not in 2025.
Here are the three big changes to watch for over the next 12 months 🧵
1/8
#1 Geopolitical Swing States.
The conversation is going to expand from "Who is leading: the US or China?" to "Which country's AI is most exportable worldwide?"
AI-curious countries around the world, the "geopolitical swing states," are going to decide which side they go with.
2/8
Nov 5, 2024 • 7 tweets • 3 min read
Scale AI is proud to announce Defense Llama 🇺🇸: the LLM purpose-built for American national security.
This is the product of collaboration between @Meta, Scale, and defense experts, and is available now for integration into US defense systems.
Read more below👇
With the National Security Memorandum coming out of the White House recently, it is clear we need to move fast on AI in national security.
From the NSM:
"If the United States Government does not act with responsible speed and in partnership with industry, civil society, and academia to make use of AI capabilities in service of the national security mission — and to ensure the safety, security, and trustworthiness of American AI innovation writ large — it risks losing ground to strategic competitors."
Sep 16, 2024 • 4 tweets • 2 min read
As LLMs get smarter, evals need to get harder.
OpenAI’s o1 has already maxed out most major benchmarks.
Scale is partnering with CAIS to launch Humanity’s Last Exam: the toughest open-source benchmark for LLMs.
We're putting up $500K in prizes for the best questions.
(read on)
We need tough questions from human experts to push AI models to their limits. If you submit one of the best questions, we’ll give you co-authorship and a share of the prize pot.
The top 50 questions will earn $5,000 each, and the next 500 will earn $500 each. All selected questions grant optional co-authorship on the resulting paper.
We're seeking questions that go beyond undergraduate level and aren't easily answerable via quick online searches.
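The prize breakdown above can be sanity-checked with quick arithmetic (amounts taken directly from the thread):

```python
top_awards = 50 * 5_000       # top 50 questions at $5,000 each
runner_up_awards = 500 * 500  # next 500 questions at $500 each
total = top_awards + runner_up_awards
print(total)  # 500000, matching the $500K prize pot
```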
Aug 1, 2024 • 8 tweets • 2 min read
1/Gemini 1.5 Pro 0801 is the new best model (tops LMSYS, SEAL evals incoming)
Key considerations
1—OpenAI, Google, Anthropic, & Meta all right ON the frontier
2—Google has a long-term compute edge w/TPUs
3—Data & post-training becoming key competitive drivers in performance
🧵
2/We've seen 7 major models from top labs in the last 3mo:
May:
- GPT 4o
- Gemini 1.5 Pro
June:
- Claude 3.5 Sonnet
July:
- Llama 3.1
- Mistral Large 2
- GPT-4o Mini
August:
- Gemini 1.5 0801
Each of these models has been incredibly competitive—each world-class in some way.
Jul 25, 2024 • 8 tweets • 3 min read
1/ New paper in Nature shows model collapse as successive model generations are recursively trained on synthetic data.
This is an important result. While many researchers today view synthetic data as the philosopher's stone of AI, there is no free lunch.
Read more 👇
Training on pure synthetic data provides no information gain, so there is little reason the model *should* improve.
Oftentimes when evals go up from "self-distillation," it may reflect an invisible tradeoff, e.g. mode collapse in exchange for improvement on individual evals.
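The dynamic can be illustrated with a toy sketch (not the Nature paper's setup): repeatedly fit a Gaussian "model" to samples drawn from the previous generation's fit. The estimated spread shrinks toward zero over generations, a minimal analogue of mode collapse, where the tails of the distribution are lost first.

```python
import random
import statistics

def fit(samples):
    # "Train" a toy generative model: estimate a Gaussian's mean and stddev.
    return statistics.fmean(samples), statistics.pstdev(samples)

def generate(mu, sigma, n, rng):
    # Draw "synthetic data" from the current model.
    return [rng.gauss(mu, sigma) for _ in range(n)]

rng = random.Random(0)
# Generation 0: fit on real data drawn from the true distribution N(0, 1).
mu, sigma = fit(generate(0.0, 1.0, 20, rng))
sigmas = [sigma]
# Each later generation trains only on the previous generation's samples.
for _ in range(200):
    mu, sigma = fit(generate(mu, sigma, 20, rng))
    sigmas.append(sigma)

print(f"stddev: gen 0 = {sigmas[0]:.3f}, gen 200 = {sigmas[-1]:.3f}")
```

With only 20 samples per generation, estimation noise compounds and the fitted stddev decays toward 0: the model forgets the distribution it was meant to capture, even though each individual fit looks reasonable.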
Jun 9, 2024 • 9 tweets • 3 min read
1/ one of the biggest questions in AI today is:
since GPT-4 was trained in fall 2022, we've collectively spent ~$100B on NVIDIA GPUs
will the next generation of AI models' capabilities live up to that aggregate investment level?
[Chart: NVIDIA quarterly datacenter revenue, by @Thomas_Woodside]
2/ there are 2 schools of thought:
1) compute is the only real bottleneck to AI progress. the more we spend, the closer we get to AGI
2) we are hitting a data wall which will slow progress regardless of how much compute we have
2/ Evaluations are a critical component of the AI ecosystem.
Evals are incentives for researchers, and our evaluations set the goals for how we aim to improve our models.
Trusted 3rd party evals are a missing part of the whole ecosystem, which is why @scale_AI built these.
May 28, 2024 • 9 tweets • 2 min read
1/ Today is the 4th anniversary of the original GPT-3 paper—"Language Models are Few-Shot Learners"
Some reflections on how the last 4 years have played out, and thoughts about the next 4 years
2/ GPT-3 was when it first became clear what the potential of scaling language models was.
The efficacy of GPT-3 took the AI community by surprise for the most part—the capabilities were staggering compared to everything that came before in NLP.
May 16, 2024 • 10 tweets • 2 min read
1/ Some thoughts on the recent OpenAI and Google announcements, and what it indicates about what's next in AI.
Hint: post-training is REALLY important...
THREAD
2/ In many ways, Gemini 1.5 Flash was the gem of Google's announcements. A 1M-context small model with Flash performance is incredible.
OpenAI now has the best large model with GPT-4o, and Google has the best small model with Gemini 1.5 Flash.
The competition is on.
Jan 1, 2024 • 5 tweets • 1 min read
I'm posting some of my learnings from 2023, AI's biggest year yet.
🧵 for some highlights and link to post
LEARNING 1: The conceit of an expert is a trap. Strive for a beginner’s mind and the energy of a novice.
Experience can often be a curse—the past is only mildly predictive of the future, and every scenario requires new techniques and insight. In novel situations, the novice tends to be at an advantage—their vitality and beginner’s mind lend themselves to faster adaptation.
Jul 18, 2023 • 5 tweets • 3 min read
With @MetaAI's launch of Llama 2, @scale_ai will also be:
🌎 open-sourcing scale-llm-engine, our library for hosting and fine-tuning open-source LLMs
⚡️ releasing the fastest way to fine-tune Llama 2
💼 launching Scale Custom LLMs for enterprises
Read more in 🧵
We are open-sourcing scale-llm-engine, our library for hosting and fine-tuning open-source LLMs.
This can run on your own infra, as well as on Scale's cloud infrastructure.
🧵 thread of some of my favorite AI-generated product images from @scale_AI Forge
AI-generated advertising only gets better as we keep improving our underlying models
It works really well in conveying the feeling of cosmetic products.
Dec 25, 2022 • 5 tweets • 1 min read
Heard someone say “I don’t want to waste brain space on learning Chinese”
PSA—that’s not how it works at all.
Consistently *retrieving* information both deepens connections with the rest of your knowledge and frees up resources & working memory for more abstract thought.
🧵
Memorizing actually allows for new conceptual understanding, it’s not just rote BS.
And while there is some "wetware" limit based on the number of synapses, that limit is roughly the memory size of a movie of your entire life. It's why some people can have photographic memory.
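The "movie of your entire life" comparison can be sanity-checked with a back-of-envelope estimate. Every number below is a loose assumption (synapse count, bits per synapse, video bitrate, lifespan), chosen only for an order-of-magnitude comparison:

```python
# Rough storage capacity of the brain (speculative assumptions).
synapses = 1e14           # ~100 trillion synapses (common rough estimate)
bits_per_synapse = 5      # a few bits of storable state per synapse (assumed)
brain_bits = synapses * bits_per_synapse

# Rough size of a compressed video of an 80-year life.
seconds = 80 * 365 * 24 * 3600
video_bitrate = 1e6       # ~1 Mbps compressed video (assumed)
life_movie_bits = seconds * video_bitrate

print(f"brain ~{brain_bits:.0e} bits, lifetime movie ~{life_movie_bits:.0e} bits")
```

Under these assumptions the two quantities land within an order of magnitude of each other, which is the spirit of the claim.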
Nov 27, 2022 • 14 tweets • 4 min read
I'm publishing a call to action: The AI War and How to Win It.
AI for national security will define the future of our world. Either the USA wins, or our authoritarian adversaries do.
I walk through The AI War, The China Threat, and How to Win It.
1/ We are launching a product we previewed 2 weeks ago—Scale Forge ⚒
We're enabling marketers to AI-generate UNLIMITED and INFINITELY CREATIVE product imagery for:
- brand campaigns
- ad creatives
- social media
- product images
See the product in the video!
Thread 🧵
2/ Scale Forge ⚒ is an AI-powered design studio that enables customers to create new product images that allow for high-fidelity brand preservation.
You can use one of our default products, or upload your own!
Oct 27, 2022 • 6 tweets • 5 min read
I wanted to preview one of the coolest products from the @scale_AI labs.
We're enabling marketers to AI-generate UNLIMITED and INFINITELY CREATIVE images of their products for:
- ad creatives
- brand campaigns
- social media
Every image in this thread is AI generated🧵
The most creative ads are the ones we remember the best—they're striking, memorable, and cool.
With the new breakthroughs in AI, we can enable brands to unlock their imagination, and grow their customer base.
What are the most inspirational settings for your product?
Oct 24, 2022 • 13 tweets • 5 min read
.@scale_AI had our TransformX conference last week.
As part of that we announced a number of ⚡️new products⚡️ to unlock and operationalize AI for everyone—from startups to researchers and Fortune 500 companies to the US government.
Thread🧵
We announced the ✨Scale Applied AI✨ Suite.
Scale is at the forefront in advancing foundation models, especially applying them to specific tasks + industries.
These are real examples of how Scale is partnering with customers across industries.
May 30, 2022 • 5 tweets • 3 min read
Posting a memo I sent to the @scale_AI team back in 2019.
The core idea is that most organizations fall prey to a slow death of optimism, causing a slow, excruciating halt.
When we say things will take a long time, they will take a long time.
When we say things will take a short amount of time, they will take less time.
Dec 6, 2021 • 9 tweets • 2 min read
The nerds vs the cool kids, a short thread 🧵
The nerds are jealous of the cool kids for being, well, cool.
The cool kids are jealous of the nerds for the ability to build.
1/n
It’s not totally true the cool kids don’t build. The nerds build things and the cool kids build relationships.
Nerds are jealous of the connectivity, cool kids are jealous of the substance
2/n
Nov 26, 2020 • 6 tweets • 2 min read
Breaking down my recent post:
1/ At @scale_AI, I interviewed everyone we gave an offer to for a long time. I wrote a memo to the company about what I look for. I wanted to share it with the community because I don't think people do this enough.
Thread👇 alexw.substack.com/p/hire
2/ I mainly screen for one key thing: giving a shit. To be more specific, there are actually two things to screen for:
1. they give a shit about Scale, and 2. they give a shit about their work in general.