TLDR: AI winter is here. Zuck is a realist, and believes progress will be incremental from here on. No AGI for you in 2025.
1) Zuck is essentially a real-world growth pessimist. He thinks the energy bottlenecks start appearing soon and will take decades to resolve. AI growth will thus be gated on real-world constraints.
> "I actually think before we hit that, you're going to run into energy constraints. I don't think anyone's built a gigawatt single training cluster yet. You run into these things that just end up being slower in the world."
> "I just think that there's all these physical constraints that make that unlikely to happen. I just don't really see that playing out. I think we'll have time to acclimate a bit."
2) Zuck would stop open sourcing if the model is the product
> "Maybe the model ends up being more of the product itself. I think it's a trickier economic calculation then, whether you open source that."
3) Believes they will be able to move from Nvidia GPUs to custom silicon soon.
> "When we were able to move that to our own silicon, we're now able to use the more expensive NVIDIA GPUs only for training. At some point we will hopefully have silicon ourselves that we can be using for at first training some of the simpler things, then eventually training these really large models."
Opinion
Overall, I was surprised by how negative the interview was.
A) Energy - Zuck is pessimistic about the real-world growth necessary to support the increase in compute. Meanwhile, raw compute per unit energy has doubled every 2 years for the last decade. Jensen is also aware of this, and it beggars belief that he isn't thinking about paths forward to continue this ramp.
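(Back-of-envelope on that trend: a doubling every 2 years sustained over a decade is five doublings, i.e. 2^5 = 32x more raw compute per watt, before counting any algorithmic efficiency gains.)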
Over at xAI, the launch announcement read "At xAI, we have made maximizing useful compute per watt the key focus of our efforts."
So energy efficiency, algorithmic and otherwise, is an obvious area of focus for these firms. Zuck, meanwhile, is planning to move off Nvidia chips soon, and basically believes that the value of holding a share of the most advanced compute clusters will decline.
B) AGI Negative
Zuck fundamentally
> does not believe the model, the AI itself, will be the product.
> It is the context, the network graph of friendships per user, the moderation, the memory, the infrastructure that is the product.
> This allows him to freely release open source models, because he already has all of the other pieces of user-facing scaffolding in place.
An actual AGI
> where a small model learns and accompanies the user for long periods
> while maintaining its own state
> with a constitution of what it can or cannot do
> rather than frequent updates from a central server
> would be detrimental to Meta’s business,
> would cause a re-evaluation of what they are doing
C) Summary
Zuck
> essentially settled into the trap of believing in incrementalism
> Advised by the smartest people in the world.
> Technically competent
> But he does not believe in states of the world where a 100x improvement over GPT-4 is possible, or that AGI is possible within a short timeframe.
> But he's also… he’s not raising capital.
The three people who are raising capital, Sam Altman, Elon Musk, and Dario Amodei, are all on record expecting dramatic increases in capability. They could be hyping because they need higher valuations. I don’t know.
And so again we wait for GPT-5. If it is 10-100x as good as GPT-4, the current benchmarks won’t even work (how does one measure 100x as good on an MMLU scale of 1-100?).
If the models deliver value, value exceeding many times over the capital deployed to develop them, then progress will continue. If not, not.
For me, the most exciting part of all of this is… 🍿🍿🍿 drama. You get to see who is bluffing, with billions of dollars on the line, on a fairly short timeframe. And I, for the most part, am in it for the sheer entertainment value of spectating potentially the greatest game mankind has ever played.
Happy Friday 🥂
I’d like to congratulate Dwarkesh for an excellent interview… and for the not-so-great ad reads as well 😜
The above was excerpted from my daily AI newsletter; the full edition goes out Monday.
> Inflection AI: too much capital and talent needed for the next generation
> no real way to exit
> Reid Hoffman looking to engineer an acquihire
> MSFT unwilling to bite at $4 bil
> so they engineered an earn-out deal
> allowing founders and research team to leave
> investors get back some capital over time
> surprising Reid managed to persuade everyone to get this done
> TikTok team trains Depth Anything
> LiDAR-quality depth estimation from a single photo frame
> Using a teacher model-student model system (sketched in code below)
> 1st author Lihe Yang did this while on his internship (!) with the company
Out-of-training-set estimation successes: parking, home automation, gaming, driving, office, architecture
Notably:
> Goal was to build a foundation model for depth estimation from a single image
> Did not use the classical method of training the model on accurate, measured ground-truth depth maps
> Instead obtained a large (62M image) unlabelled dataset, which would form the training basis of the “student” model
> Then built an annotation model to label this dataset
> The annotation model, the “teacher,” was trained on a labeled 1.5M image dataset
> This worked because of scale! They had many failures along the way
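For intuition, here is a minimal sketch of that teacher→student pseudo-labeling loop, in PyTorch. This is my own simplification, not the Depth Anything code: the model architectures, data loaders, and L1 loss are stand-in assumptions, and the paper's extra machinery (e.g. perturbations on the student branch) is omitted.

```python
# Minimal sketch of the teacher -> student pseudo-labeling recipe described
# above. Assumptions, not the authors' code: model architectures, data
# loaders, and the L1 loss are stand-ins.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader


def train_teacher(teacher: nn.Module, labeled_loader: DataLoader) -> None:
    """Step 1: fit the annotation ("teacher") model on the labeled 1.5M images."""
    opt = torch.optim.AdamW(teacher.parameters(), lr=1e-4)
    loss_fn = nn.L1Loss()  # a plausible loss for dense depth regression
    teacher.train()
    for images, depth_gt in labeled_loader:
        opt.zero_grad()
        loss_fn(teacher(images), depth_gt).backward()
        opt.step()


@torch.no_grad()
def pseudo_label(teacher: nn.Module, images: torch.Tensor) -> torch.Tensor:
    """Step 2: the frozen teacher annotates unlabeled images with depth maps."""
    teacher.eval()
    return teacher(images)


def train_student(student: nn.Module, teacher: nn.Module,
                  unlabeled_loader: DataLoader) -> None:
    """Step 3: train the student on the 62M teacher-annotated images."""
    opt = torch.optim.AdamW(student.parameters(), lr=1e-4)
    loss_fn = nn.L1Loss()
    student.train()
    for images in unlabeled_loader:
        opt.zero_grad()
        loss_fn(student(images), pseudo_label(teacher, images)).backward()
        opt.step()
```

The economic trick is step 2: expensive ground-truth collection stops at 1.5M images, and the teacher's free annotations are what let the student see 62M.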
No one believed Elon when he said Tesla didn't need LiDAR or radar, that images alone would be sufficient, and the TikTok team has proven him correct.
61 median followers => 50% of Twitter users have fewer than this
Below 1000 => you are near invisible to the algo. Posts will almost never be seen.
=> Get Premium
=> Find people you enjoy reading
=> Interact with them
=> for large accounts, reply to them within 2 minutes of their post, otherwise QT them. For small accounts, timing doesn't matter
=> join small spaces where you can have a chance to talk, or host one. Establish genuine connection
1000 => congrats, only 15% of Twitter gets here
1k to 10k
=> posting now makes sense as it gets seen and you have a chance of going viral
=> the only thing that correlates to growth is visibility
=> you need to post/QT (retweeting and replyguying won’t get you there)
=> just post more and experiment, then delete stuff that doesn’t catch the wind. You can tell how far a tweet is going to go from its views in the first 5 minutes after posting
=> what works on Twitter is either a) emotion or b) information on a narrow interest. You can have stellar Twitter game making jokes and putting emojis on dry technical topics
=> Twitter users are fast scrolling infovores. Stuff has gotta be dense
=> easy hacks: watch the timing of your audience. Mine is workday morning Pacific time, with a second bump in the evening
=> manage your tweets. Tweet once, then repost at your audience bump time
=> like all replies (it’s a read receipt)
=> respond to at least the Twitter Blue replies
=> add interesting replies to the main thread. I imagine it as moderating a Reddit board temporarily, so you can keep a popular post on the TL indefinitely by continuing to comment and add to the thread
=> every reply, retweet, repost, and additional thread tweet pushes the content onto the main TL, bringing more readers in
=> you can grind your way up here
10k => only about 100k users are here, roughly 1 in 10k, officially a lower success rate than getting into Stanford at 2 in 100.
10k - 100k => you can go viral every week at this point
=> you start making serious friends on the app
=> the need to like/reply grind fades
=> you can promote content from smaller accounts, by retweeting or quotetweeting them
=> it still looks insightful, because your distribution is so much bigger
100k => I suspect only 1,000 - 10,000 accounts are at this level
=> many are unable to post
=> every interesting post gets highly criticized
=> saying a mean thing to someone gets them destroyed online by your followers
=> posting a lot overwhelms the timeline and gets you blocks
=> almost every tweet has real world consequences