But now RL has grown to nearly the size of pretraining, and scale-ups beyond this reveal its inefficiency. I estimate it would take a 1,000,000x scale-up from its current level to add the equivalent of a 1,000x scale-up of inference or a 100x scale-up in pretraining.
Scaling up AI using next-token prediction was the most important trend in modern AI. It stalled out over the last couple of years and has been replaced by RL scaling.
https://twitter.com/balesni/status/1945151391674057154
While @balesni's thread asks developers to:
@arcprize 1. The released version of o3 is much less capable than the preview version that was released to a lot of fanfare 4 months ago, though also much cheaper. People who buy access to it are not getting the general reasoning performance @OpenAI was boasting about in December.
https://twitter.com/tobyordoxford/status/1881661919133859841
Here is the revised ARC-AGI plot. They've increased their cost estimate of the original o3 low from $20 per task to $200 per task. Presumably o3 high has gone from $3,000 to $30,000 per task, which is why it breaks their $10,000 per task limit and is no longer included.
They knew this was illegal and were worried about being arrested if they did it:
You could think of it as a change in strategy: from improving the quality of your employees’ work by giving them more years of training in which to acquire skills, concepts and intuitions, to improving it by giving them more time to complete each task.
The idea is quite simple, drawing on a classic argument from computer science.
I call this a ‘record diagram’.
The latest of these, JADES-GS-z14-0, was discovered at the end of May this year. It is located 34 billion light years away, almost three quarters of the way to the edge of the observable universe.
https://twitter.com/camrobjones/status/1790766472458903926
First, let's be clear on a few things about the Turing Test.

Some more: 
Among the long list of signatories are 2 of the 3 main researchers behind deep learning and all 3 CEOs of the leading AGI labs.
From @marvinvonhagen's conversations with Bing. Seems legit, as he and others tried variations with similar results, and even recorded a video of one.