Thread by Robert Zubek, 10 tweets, 3 min read
Oh wow. Approaches like Meena always leave me so heartbroken about this direction in #AI

Because on one hand, it's a tour de force. 341GB of source text. 2.6B param DNN model. $1.4M in compute time over a calendar month. This has not been done before!

However... 1/
On the NLP side, I'm heartbroken because this approach does not model the semantics of the conversation, i.e. the system _doesn't know what it's talking about_, despite the 341GB of text poured into it. 2/
And in a sense, this is not unexpected. The lack of semantics was already clear with GPT, and hopefully Google will release this model to the public to play with as well. But even the published info is informative enough. 3/
So the result is an NLP system that, as ZDNet quips, "produces some of the banalest exchanges ever between two interlocutors." zdnet.com/article/google…

Harsh, yes.

But even a non-expert, popsci publication sees that the system is clearly not even doing smalltalk. 4/
And I do *not* mean to dump on this system; it's a super interesting project of unprecedented scale.

But the ML approach is being oversold as a bot that can "chat about anything" - when it can't, because it has no understanding. 5/
And the *other* thing that breaks my heart is: if we spend this much money on #AI, shouldn't we be pursuing approaches that *do* have an explicit semantic representation?

An AI that *could* understand what it's talking about? 6/
It would be much more difficult; historically it has been. And who am I to tell other people how to spend their money? :)

But just imagine if we spent $1.4M on trying different ways to model semantic understanding - how beneficial that could be in the long run. 7/
(And yes, some people insist that any such model "must" contain latent semantics, because it can hold on to topics or keep track of indexicals, etc. But that's not the same thing at all, because memory != semantics or understanding, but that's a longer rant.) 8/
So while such a tour de force is a magnificent sight to behold, I'm hoping that new generations of AI research will focus more on work that brings us closer to semantics, rather than just deep correlations. 9/
(And maybe on techniques that don't require multi-million-dollar budgets to try a single thing, for the sake of grad students, postdocs, and early-career researchers everywhere :) )

Anyway, rant over! :) 10/10