François Chollet
Aug 3, 2019
This wave of AI hype will have dire consequences. Not just for our field. For public safety.

Overselling the progress & capabilities of AI leads governments & companies to adopt shoddy "AI" solutions (which contractors are more than happy to sell) -- and to blindly trust their predictions.
Misusing ML systems that work can be dangerous. Trusting systems that don't work (in some cases, systems that couldn't possibly work) may be an even more pressing issue.
Startups hyping up magical AI/AGI fantasies in order to raise money are one of the main causes of the problem. They shape the public discourse. They shape the perceptions of decision-makers. And the media is more than happy to relay it -- the media loves outrageous hype.

More from @fchollet

Jan 15
I'm joining forces with @mikeknoop to start Ndea (@ndeainc), a new AI lab.

Our focus: deep learning-guided program synthesis. We're betting on a different path to build AI capable of true invention, adaptation, and innovation.
Read about our goals here: ndea.com
We're really excited about our current research direction. We believe we have a small but real chance of achieving a breakthrough -- creating AI that can learn at least as efficiently as people, and that can keep improving over time with no bottlenecks in sight.
Jan 15
People scaled LLMs by ~10,000x from 2019 to 2024, and their scores on ARC stayed near 0 (e.g. GPT-4o at ~5%). Meanwhile, a very crude program search approach could score >20% with hardly any compute.

Then OpenAI started adding test-time CoT search. ARC scores immediately shot up.
It's not about scale. It's about working on the right ideas.

Ideas like deep-learning-guided CoT synthesis or program synthesis -- via search.
10,000x scale-up: still flat at ~0.

Add CoT search at a similar model scale: boom.
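To make "a very crude program search approach" concrete: below is a minimal, hypothetical sketch of brute-force program search over a tiny DSL of grid transformations, checked against an ARC-style task's demonstration pairs. The primitives, the depth limit, and the toy task are all illustrative assumptions -- this is not the actual code behind the >20% score.

```python
# Minimal sketch of brute-force program search for an ARC-style task.
# Hypothetical illustration only -- not the actual >20% approach.
from itertools import product

import numpy as np

# A tiny DSL of grid-to-grid primitives (assumed for illustration).
PRIMITIVES = {
    "identity": lambda g: g,
    "flip_h": lambda g: np.fliplr(g),
    "flip_v": lambda g: np.flipud(g),
    "rot90": lambda g: np.rot90(g),
    "transpose": lambda g: g.T,
}

def search_program(train_pairs, max_depth=3):
    """Enumerate compositions of primitives until one maps every
    demonstration input to its output. Pure search, no learning."""
    names = list(PRIMITIVES)
    for depth in range(1, max_depth + 1):
        for combo in product(names, repeat=depth):
            def program(g, combo=combo):
                for name in combo:
                    g = PRIMITIVES[name](g)
                return g
            if all(np.array_equal(program(x), y) for x, y in train_pairs):
                return combo  # first program consistent with all demos
    return None

# Toy demonstration pair: the hidden transformation is a horizontal flip.
pairs = [(np.array([[1, 2], [3, 4]]), np.array([[2, 1], [4, 3]]))]
print(search_program(pairs))  # -> ('flip_h',)
```

The point of the sketch is the compute asymmetry: enumerating a few thousand candidate programs and checking them exactly costs almost nothing next to a 10,000x model scale-up.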
Dec 20, 2024
Today OpenAI announced o3, its next-gen reasoning model. We've worked with OpenAI to test it on ARC-AGI, and we believe it represents a significant breakthrough in getting AI to adapt to novel tasks.

It scores 75.7% on the semi-private eval in low-compute mode (for $20 per task in compute) and 87.5% in high-compute mode (thousands of dollars per task). It's very expensive, but it's not just brute force -- these capabilities are new territory, and they demand serious scientific attention.
My full statement here: arcprize.org/blog/oai-o3-pu…
So, is this AGI?

While the new model is very impressive and represents a big milestone on the way towards AGI, I don't believe this is AGI -- there are still a fair number of very easy ARC-AGI-1 tasks that o3 can't solve, and we have early indications that ARC-AGI-2 will remain extremely challenging for o3.

This shows that it's still feasible to create unsaturated, interesting benchmarks that are easy for humans, yet impossible for AI -- without involving specialist knowledge. We will have AGI when creating such evals becomes outright impossible.
Nov 9, 2024
When we develop AI systems that can actually reason, they will involve deep learning (as one of two major components, the other one being discrete search), and some people will say that this "proves" that DL can reason.

No, it will have proven the thesis that DL is not enough, and that we need to combine DL with discrete search.
From my DL textbook (1st edition), published in 2017. Seven years later, there is now overwhelming momentum towards this exact approach.
I find it especially obtuse when people point to progress on math benchmarks as evidence of LLMs being AGI, given that all of this progress has been driven by methods that leverage discrete search. The empirical data completely vindicates the thesis that DL in general, and LLMs in particular, can't do math on their own, and that we need discrete search.
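As an illustration of that division of labor (a toy example, not any specific published system): below, a stand-in "learned" value function guides a beam search over discrete candidates, while exact symbolic checking delivers the final answer. Everything here -- the target, the scoring heuristic, the operators -- is an assumption for demonstration purposes.

```python
# Toy sketch of deep-learning-guided discrete search: a learned scorer
# prunes the search tree; exact discrete verification does the reasoning.
# The "model" is a hand-written stand-in for a trained value network.
import heapq

TARGET = 24  # assumed toy goal: reach 24 by combining numbers left to right
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

def model_score(value):
    # Stand-in for a learned value function: prefer states near the target.
    return -abs(TARGET - value)

def guided_search(numbers, beam_width=3):
    """Beam search over arithmetic expressions: the scorer guides,
    the equality check verifies."""
    beam = [(model_score(numbers[0]), numbers[0], str(numbers[0]))]
    for n in numbers[1:]:
        candidates = []
        for _, value, expr in beam:
            for sym, fn in OPS.items():
                v = fn(value, n)
                candidates.append((model_score(v), v, f"({expr} {sym} {n})"))
        beam = heapq.nlargest(beam_width, candidates)  # model prunes the tree
    for _, value, expr in beam:
        if value == TARGET:  # discrete verification, not model trust
            return expr
    return None

print(guided_search([4, 3, 2]))  # -> ((4 * 3) * 2)
```

Swap the heuristic for a trained network and the arithmetic for program synthesis or CoT steps, and you get the general shape of the hybrid approach being argued for.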
Oct 26, 2024
In the last Trump administration, legal, high-skilled immigration was cut by ~30% before Covid, then by 100% after Covid (which was definitely a choice: a number of countries kept issuing residency permits and visas). However, illegal immigration inflows did not go down (they've been stable since the mid-2000s).
If you're a scientist or engineer applying for a green card, you're probably keenly aware that your chances of eventually obtaining it depend heavily on the election. What you may not know is that, if you're a naturalized citizen, your US passport is also at stake.
The last Trump administration launched a "denaturalization task force" aimed at taking away US citizenship from as many naturalized citizens as possible, with an eventual target of 7M (about one third of all naturalized citizens). Thankfully, they ran into a little problem: the courts.
Oct 20, 2024
When we say deep learning models operate via memorization, the claim isn't that they work like literal lookup tables, only being able to make sense of points that are exactly part of their training data. No one has claimed that -- it wouldn't even be true of linear regression.
Of course deep learning models can generalize to unseen data points -- they would be entirely useless if they couldn't. The claim is that they perform *local generalization*: generalization to known unknowns, to degrees of variability for which you can provide a dense sampling at training time.
Take a problem that is known to be solvable by expert humans via pure pattern recognition (say, spotting the top move on a chess board), and that has been known to be solvable via convnets as far back as 2016. Train a model on ~5B chess positions across ~10M games, and find that it can solve the problem at the level of a human expert. That isn't an example of out-of-distribution generalization. It's an example of local generalization -- precisely the thing you expect deep learning to be able to do.
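Local generalization is visible in even the simplest curve fitter. In the sketch below, a polynomial fit stands in for a deep net (purely an illustrative assumption): trained on a densely sampled interval, it handles unseen points inside that interval and fails badly outside it.

```python
# Sketch contrasting local generalization (interpolation within the
# training distribution) with extrapolation beyond it.
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, 200)         # dense sampling of one region
y_train = np.sin(3 * x_train)

coeffs = np.polyfit(x_train, y_train, deg=9)  # stand-in for a trained model

x_in = np.linspace(-1.0, 1.0, 100)            # unseen points, same distribution
x_out = np.linspace(2.0, 3.0, 100)            # unseen points, outside it
err_in = np.abs(np.polyval(coeffs, x_in) - np.sin(3 * x_in)).mean()
err_out = np.abs(np.polyval(coeffs, x_out) - np.sin(3 * x_out)).mean()

print(f"in-distribution error: {err_in:.4f}")       # tiny: local generalization
print(f"out-of-distribution error: {err_out:.1f}")  # enormous: no extrapolation
```

A chess model trained on ~5B positions is in the first regime: the space of plausible positions is densely covered, so expert-level play is interpolation within known variability, not invention.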