The trend in deep learning for a lot of applications, for most of the past decade, seems to have been “you get out what you put in” — performance gains are proportional to increases in computing power.
I haven’t found anything I’m confident is an exception to that trend, in the direction of “performance grows faster than compute”. I’d be willing to bet that there aren’t any.
As long as that continues, it seems to me that the main question is how fast the cost of flops drops and how long it will continue to be profitable to keep buying more & better hardware.
I don’t know enough about computer chip manufacturing yet to make prognostications about the cost of compute.
As far as the profitability of continued investment in AI goes — I think it’s pretty clear that GPT3 is a decent copywriter, customer service chatbot, etc, & that means it’s ultimately worth tens of billions of dollars, conservatively.
(Back-of-the-envelope calculation: there are 2 million customer service representatives in the US, each making $25k a year, and that isn’t the only job where the task is “produce language that satisfies the listener” & lack of logic isn’t a dealbreaker.)
google.com/amp/s/ventureb… OpenAI reportedly spent $12M training GPT3. It sounds like the AI industry has “room” to spend at least an order of magnitude more on hardware/compute, maybe more.
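The back-of-the-envelope numbers above can be checked in a few lines. The 2 million reps, $25k salary, and $12M training-cost figures come from the tweets themselves; everything else is simple arithmetic:

```python
# Back-of-the-envelope check of the figures quoted above.
customer_service_reps = 2_000_000   # US customer service reps (figure from the thread)
salary_per_year = 25_000            # dollars per rep per year (figure from the thread)
addressable_wages = customer_service_reps * salary_per_year
print(f"Addressable wages: ${addressable_wages / 1e9:.0f}B/year")  # $50B/year

gpt3_training_cost = 12_000_000     # reported GPT-3 training cost
ratio = addressable_wages / gpt3_training_cost
print(f"Market / training cost ≈ {ratio:,.0f}x")
```

Even if automation captures only a few percent of that $50B/year wage bill, the reported $12M training cost is thousands of times smaller, which is the sense in which there is “room” to spend an order of magnitude (or more) on compute.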
And given the very, very limited use of AI in industry compared to its capabilities, I’d guess we’re not close to diminishing returns. I don’t think investment in AI hardware is going to even start leveling off for years...maybe 2-5 years before the exponential pace slows?
Gwern’s analysis shows very consistent power-law scaling for GPT-3 architectures (a straight line on a log-log plot): put more in, get more out. gwern.net/newsletter/202…
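To make the “put more in, get more out” pattern concrete, here is an illustrative sketch of power-law scaling. The constants `a` and `b` below are made up for illustration, not Gwern’s fitted values; the point is only the shape of the curve — each 10x of compute buys the same constant *factor* of loss reduction, never more:

```python
# Illustrative power-law scaling: loss(C) = a * C**(-b).
# a and b are hypothetical constants chosen for illustration only.
a, b = 10.0, 0.05

def loss(compute):
    """Loss as a power law in compute: straight line on log-log axes."""
    return a * compute ** (-b)

# Every 10x increase in compute shrinks loss by the same fixed factor (10**-b).
for exponent in range(3, 9):
    c = 10.0 ** exponent
    print(f"compute=1e{exponent}: loss={loss(c):.3f}")
```

This is why “performance gains are proportional to increases in computing power” rather than outpacing them: the returns are steady and predictable, but sublinear on any fixed benchmark.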
I didn’t really bother making predictions on machine learning progress in past years; my old post srconstantin.github.io/2017/01/28/per… noticed these scaling trends and I pretty much left it at that.
Back in 2017 I was confident that some “AI risk” proponents were vastly overstating what current ML architectures can do, and I still endorse the claims I made then. srconstantin.github.io/2017/02/21/str…
The kinds of “AI skeptic” arguments made by people like Gary Marcus, Josh Tenenbaum, and Judea Pearl, that humans can do some things that bigger better GPTs can never do, still seem sound to me.
But I am kicking myself a little bit for not moving on from the “vanilla deep learning can’t do everything” debate sooner, and instead investigating what it *can* do and how soon it will do it.
Basically I need to start keeping up with the literature more. Even “boring” power-law performance trends can make new applications practical before it’s intuitively obvious they will be.