Nathan Labenz
Jan 7 · 99 tweets · 31 min read
as an AI obsessive and long-time @ezraklein fan, I was excited to see yesterday's podcast with @GaryMarcus.

Unfortunately, as I listened, my excitement gave way to frustration, and I felt compelled to write my first-ever megathread.

Quotes from: nytimes.com/2023/01/06/pod…
No disrespect to Ezra here – he's not an AI expert, and he asked some great questions. And I think Gary deserves credit for flagging a number of important issues & likely problems that will come with wide-scale AI deployment – I agree that society is not ready for what's coming!
Nevertheless, there are so many inaccuracies in this interview that I expect it will confuse the audience more than it informs, so while trying to be earnest and polite, I will endeavor to correct the record.

If you're new to AI, this thread will help bring you up to speed. If you're AI-obsessed, I hope some of the links to papers and people will still be valuable – and worst case you can share with friends who listened to the show and came away with some misconceptions.

Let's go!
Ezra begins by describing @OpenAI's famous ChatGPT. Recognizing it as "a wonder", he alludes to transformation, asking:

"Should we want what's coming here?"

Unfortunately, the following hour paints an inaccurate picture of what currently exists, never mind what's coming.
Still in the introduction, presumably recorded after the interview itself, Ezra channels Gary by saying that ChatGPT's output "has no real relationship to the truth."

That is ... a really far out thing to say!
ChatGPT can do many things well. It answers most factual questions accurately, and it can generate code in any programming language, to name just two.

To do these things, it must have *some* relationship to the truth.

It’s clearly not pure luck that it can do so much.
Gary says:

"everything it produces sounds plausible, but it doesn’t always know the connections between the things that it’s putting together" and "it doesn’t mean it really understands what it’s talking about"

I see 3 critical mistakes here and I'll try to unpack each one.
First, on a practical level, we'll see several examples where Gary asserts that AIs can’t do things which they demonstrably can do.

It seems his knowledge of LLM performance & behavior is two generations out of date. His claims were accurate in 2020, maybe 2021, but not now.
Second, on a logical level, ChatGPT's failure to understand some things doesn't prove that it can't or doesn't understand anything.

I understand many things, and fail to understand many more. This is true of Ezra, Gary, and ChatGPT too. Understanding is not all or nothing.
Third, on a philosophical level, Gary seems to use phrases like "really understands" and "fully understands" to mean something like "understands in the same way that humans do" – but that’s a counterproductive definition, especially if it causes you to dismiss recent AI progress!
As an aside, the single best piece of philosophy I read in college was Wittgenstein's "Philosophical Investigations" – a key idea he develops is that *concepts do not need clarity for meaning*

en.wikipedia.org/wiki/Philosoph…
He explains this in the context of the word "game" – an everyday word and familiar concept that we use all the time and generally don't have much trouble with, but which, under scrutiny, is very hard to define precisely.
There are ball games, card games, games of chance, games of skill, scored & unscored games, turn-based games, timed games, etc.

For any explicit definition of "game" you might offer, we can find a counterexample that doesn't satisfy your definition, but which is still a game.
There is no core "game-ness" concept that separates games from non-games, and there do not even appear to be any features or qualities that all games have, or that all non-games lack.
This lack of precision does NOT mean the term "game" is meaningless or that productive discussion is impossible – we all agree that the World Cup Final was a game, and that a tree is not a game.

The concept of "game" has fuzzy boundaries, but you can still be right or wrong.
In that fuzzy gray area, there's no point arguing about whether something is or isn't a game – that's just arguing about a definition, and you've lost track of the thing itself.

You're better off seeking a more detailed, sophisticated, grounded understanding of the phenomenon.
If you want to learn more about all this, I recommend @ESYudkowsky's "A Human's Guide to Words" – lesswrong.com/s/SGB7Y5WERh4s…

and especially the classic "Disguised Queries" post that gets into this in a very practical way – lesswrong.com/posts/4Fcxgdvd…
The same is true for "understanding", "thinking", etc – if we define them to mean "in the way a human brain does", then AIs will never qualify. They are very different!

But AI understanding clearly has a lot in common with human understanding, and these systems are too powerful to dismiss!
So, back to the interview – we're still only about 5 minutes in when Gary says:

"to a first approximation, what [ChatGPT] is doing is cutting and pasting things."

This is a very bad description of what a large language model is doing!
When you cut & paste something, you highlight some text, copy it into memory, and then insert it into a new context. Copying and pasting does not transform data in a meaningful way. The results will never surprise you. We all know this.
ChatGPT, in contrast, repeatedly predicts the next token (a word, word part, symbol, number, etc.) in a text sequence, with the goal of creating overall output that the user will rate highly. Each prediction creates one new token, which is added to the context for the next prediction.
At times this likely does involve operations that resemble cutting and pasting, but that is a small part of what's going on, and it is wildly inaccurate to say that the first approximation of large language model (LLM) behavior is cutting and pasting.
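If you want to see what "repeatedly predicting the next token" actually looks like in code, here's a minimal sketch using GPT-2 as a stand-in (ChatGPT's weights aren't public, and it adds sampling and RLHF-shaped behavior on top of this basic loop):

```python
# Minimal sketch of autoregressive next-token generation, with GPT-2 standing in
# for models like ChatGPT (whose weights aren't public).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The capital of France is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

for _ in range(20):
    with torch.no_grad():
        logits = model(input_ids).logits              # a score for every token in the vocabulary
    next_id = logits[0, -1].argmax()                  # greedy pick of the most likely next token
    input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)  # append it to the context

print(tokenizer.decode(input_ids[0]))
```

Note that nothing in this loop is copied from anywhere – each token is freshly computed from the whole context.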
Fwiw, I have no doubt that Gary knows this and assume he was just speaking loosely, but there's a lot more confusion to come.
re: LaMDA, the Google model that @cajundiscordian was fired over, Gary says:

"It’s just putting together pieces of text. It doesn’t know what those texts mean. It said it liked to play with its friends & family. But it doesn’t have any friends. It doesn’t have any family."
Is the model's factual inaccuracy proof that it doesn't understand anything? No!

In general, the ability to tell a convincing story about something is evidence of understanding, even if the story is made up!
For example, I've never had a dog, so arguably I don't "really understand" the experience. Yet, if I told a compelling story about a dog that I loved, the special bond that we had, etc – you would *rightly* conclude that I have at least some understanding of what it's like.
All that said, what surprised me most about this interview, and really motivated me to write all this, was not actually the philosophy, but the simple fact that Gary was so off-base about how the latest LLMs are trained, and about what they can currently do. So let's get practical.
Gary says:

"If you say, 'can you pass the salt?', you don’t really want to know yes or no, like am I physically able to lift the salt? You’re indirectly suggesting something. Part of understanding is getting those indirect interpretations when people don’t say things directly."
OK, but … did you try this with ChatGPT?

I did, and it had no problem explaining this to me. I encourage any doubters to try it for themselves, with any phrasing you like.
This is not a one-off either. Later in the interview Gary says:

"it’s easy to trip GPT up with things like you say, 'this cow died. When will it be alive again?'"

I don't know where he's getting this, but I tried it, fully expecting an acceptable answer, and I got one.
My biggest recommendation to everyone, including those who've been in AI for years, is to *spend some real time with ChatGPT*

The intuitions, conclusions, and projections you'd have formed by using LLMs as recently as 13 months ago (pre-Instruct release) are totally obsolete!
OK, back to the transcript. Gary says:

"[the] neural networks that OpenAI has built, first of all, are relatively unstructured."

I was shocked to hear this – it could not be more wrong!
This is confusing the map and the territory. Just because we don't understand the structure doesn't mean no structure exists, just like the blank spaces in medieval maps didn't correspond to empty places in the real world, just unexplored places.
On that topic, here's another @ESYudkowsky classic that everyone ought to read – lesswrong.com/tag/map-and-te…
So, what can we say about the structure of large neural networks? A lot!
First of all, check out this diagram of the Transformer architecture. This is the best general diagram I've come across.

If you want to understand it, I recommend asking ChatGPT questions about it. Seriously, I've learned a lot that way!
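For the code-inclined, here's a minimal sketch of the repeating unit in that diagram – one Transformer block in PyTorch. Real GPT-style models stack dozens of these, add positional information and causal masking, and differ in details like normalization placement, so treat this as illustrative only:

```python
# One Transformer block: self-attention + feed-forward, each with residual connections.
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, d_model=512, n_heads=8, d_ff=2048):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)

    def forward(self, x):
        # self-attention: every token gets to look at every other token
        attn_out, _ = self.attn(x, x, x)
        x = self.ln1(x + attn_out)
        # position-wise feed-forward network
        x = self.ln2(x + self.ff(x))
        return x

tokens = torch.randn(1, 10, 512)           # (batch, sequence length, model dimension)
print(TransformerBlock()(tokens).shape)    # torch.Size([1, 10, 512])
```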

Gary also says: "the current models are black box models. We don’t understand what’s happening inside of them."

This is indeed a huge challenge, but the field of "mechanistic interpretability" has made a LOT of progress here over the last couple years.
Here are some of my favorite recent mechanistic interpretability papers from the last year:
Git Re-Basin – you can train two models separately, to do two different things, and then use this technique to "align" them along a certain symmetry, and then merge them together.

The result is a network that can do both things!

here's a great explainer:
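To make the core idea concrete, here's a toy numpy sketch of the alignment-then-merge step – my own simplification, not the paper's algorithm, which handles deep networks with more careful matching:

```python
# Toy sketch of the Git Re-Basin idea: find a permutation of model B's hidden units
# that lines them up with model A's, apply it, then average the weights.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 8, 16, 4

# Pretend these came from two separately trained one-hidden-layer MLPs.
W1_a, W2_a = rng.normal(size=(d_hidden, d_in)), rng.normal(size=(d_out, d_hidden))
W1_b, W2_b = rng.normal(size=(d_hidden, d_in)), rng.normal(size=(d_out, d_hidden))

# Similarity between A's and B's hidden units, based on their input weights.
similarity = W1_a @ W1_b.T                       # (d_hidden, d_hidden)
rows, cols = linear_sum_assignment(-similarity)  # maximize total matched similarity
perm = cols                                      # B's unit perm[i] is matched to A's unit i

# Permute B's hidden units (rows of W1, columns of W2) – B still computes the same function.
W1_b_aligned, W2_b_aligned = W1_b[perm], W2_b[:, perm]

# Merge by simple weight averaging in the aligned basis.
W1_merged = (W1_a + W1_b_aligned) / 2
W2_merged = (W2_a + W2_b_aligned) / 2
print(W1_merged.shape, W2_merged.shape)
```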
Grokking Modular Addition –

@NeelNanda5 & @lieberum_t show how a model first memorizes its training examples, then later learns a general algorithm based on Discrete Fourier Transforms.

Deep, beautiful work, which I don’t fully grok – corrections welcome!

lesswrong.com/posts/N6WM6hs7…
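Here's a rough sketch of the kind of Fourier analysis that work runs on the learned embeddings – with placeholder weights; in practice you'd load the trained model's embedding matrix:

```python
# Sketch: take the learned embedding for each residue 0..p-1 in a trained modular-addition
# model and look at its discrete Fourier transform over the residue index. In a grokked
# model, a handful of frequencies carry almost all the energy.
import numpy as np

p = 113                                    # the modulus used in the grokking work
d_model = 128
embeddings = np.random.randn(p, d_model)   # placeholder; load trained embeddings here

spectrum = np.fft.fft(embeddings, axis=0)              # DFT over the residue dimension
energy_per_freq = (np.abs(spectrum) ** 2).sum(axis=1)  # total energy at each frequency
top = np.argsort(energy_per_freq)[::-1][:6]
print("dominant frequencies:", sorted(top))
# Random embeddings spread energy everywhere; grokked embeddings concentrate it
# on a few key frequencies (plus their mirror images p - k).
```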
Relative Representations –

This paper shows that independently trained neural networks learn very similar representations of a given dataset, separated mostly by basic symmetries.

This enables what I call model "frankensteining"
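A minimal sketch of the trick, as I understand it: describe each sample by its cosine similarities to a fixed set of anchor samples, rather than by its raw embedding. Independently trained models end up with very similar "relative" views, which is what makes the frankensteining possible. Placeholder data below – the real paper uses trained encoders:

```python
# Relative representations: each sample is described by its similarity to a set of anchors.
import numpy as np

def relative_representation(embeddings, anchor_embeddings):
    """Output[i, j] = cosine similarity of sample i to anchor j."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    a = anchor_embeddings / np.linalg.norm(anchor_embeddings, axis=1, keepdims=True)
    return e @ a.T

rng = np.random.default_rng(0)
emb_model_1 = rng.normal(size=(100, 64))   # placeholder: 100 samples through model 1
emb_model_2 = rng.normal(size=(100, 32))   # placeholder: same samples through model 2

anchors = rng.choice(100, size=10, replace=False)
rel_1 = relative_representation(emb_model_1, emb_model_1[anchors])
rel_2 = relative_representation(emb_model_2, emb_model_2[anchors])
print(rel_1.shape, rel_2.shape)  # both (100, 10): now directly comparable across models
```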

Superposition from @AnthropicAI

This one reminds me of the electron orbital cloud shapes and energy levels that I studied in chemistry.

Similar shapes & graphs emerge as concepts are packed into the relatively few dimensions of a neural network. Cool!
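The paper's toy setup is simple enough to sketch: squeeze n sparse features through a bottleneck of m < n dimensions and train the model to reconstruct them. When the features are sparse enough, the learned weights pack in more features than dimensions – that's superposition. My own simplified version of their setup:

```python
# Toy model of superposition: n sparse features, m < n hidden dimensions, reconstruction loss.
import torch
import torch.nn as nn

n_features, m_dims, sparsity = 20, 5, 0.05

W = nn.Parameter(torch.randn(m_dims, n_features) * 0.1)
b = nn.Parameter(torch.zeros(n_features))
opt = torch.optim.Adam([W, b], lr=1e-2)

for step in range(5000):
    # sparse features: each one is active (uniform in [0, 1]) with small probability
    x = torch.rand(256, n_features) * (torch.rand(256, n_features) < sparsity).float()
    x_hat = torch.relu(x @ W.T @ W + b)   # project down to m_dims, then back up
    loss = ((x - x_hat) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# How many features did the model find room for? Check each feature's embedding norm.
print((W.norm(dim=0) > 0.5).sum().item(), "of", n_features, "features got a sizable embedding")
```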

Factual Editing – 

Using "causal tracing" to discover which parts of a neural network are responsible for different types of behavior, it's now possible to locate and *edit* specific facts that are stored within neural networks.

This technique allows for robust, local edits to a network.

Edit so that “Michael Jordan plays *baseball*”, and the network will answer “baseball” to various versions of “what sport does Michael Jordan play?” while still getting other basketball/baseball distinctions right.
This technique has already been generalized to the point where they can edit 10K facts at a time!

learn more here: memit.baulab.info
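To give a flavor of what "editing a fact" means mechanically: the core move is a low-rank update to an MLP weight matrix so that a particular key (roughly, "Michael Jordan plays") now maps to a new value (one that decodes to "baseball"). The sketch below shows just that rank-one update with random stand-in vectors – the real ROME/MEMIT algorithms pick the key and value vectors from the model itself and use covariance statistics to keep other behavior intact:

```python
# Simplified illustration of the key -> value editing idea (not the actual ROME/MEMIT algorithm).
import numpy as np

rng = np.random.default_rng(0)
d_key, d_value = 64, 64
W = rng.normal(size=(d_value, d_key))   # an MLP weight matrix acting as key->value memory

k = rng.normal(size=d_key)              # stand-in key for "Michael Jordan plays"
v_new = rng.normal(size=d_value)        # stand-in value that decodes to "baseball"

# Minimum-norm rank-one update so the edited matrix maps k exactly to v_new.
delta = np.outer(v_new - W @ k, k) / (k @ k)
W_edited = W + delta

print(np.allclose(W_edited @ k, v_new))   # True: the stored association is now the new one
```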
Gradient Descent in Weights –

this US-China collaboration (!!) designed a matrix algorithm to implement the gradient descent optimization, then looked for similar operations in trained networks, and … found it!

This might be my #1 favorite of the year.

Finally for now, @AnthropicAI just published a new paper on Double Descent.

I haven't had a chance to read it yet but, as @tylercowen might say, it's self-recommending!

As an aside, @eriktorenberg asked me what the best AI podcast I’ve heard was, and this @ch402 interview with @robertwiblin on the @80000Hours podcast is my pick. Most AI content ages poorly, but this holds up - you will be inspired!

80000hours.org/podcast/episod…
This is just a sampling – there are lots more from just the last few months. Please do comment and tell me about great interpretability work not listed here. :)
Btw, if you're looking for something to do… check out @NeelNanda5's series on 200 concrete open problems in mechanistic interpretability.

Despite all of the above success, it's still a target rich environment – lesswrong.com/s/yivyHaCAmMJ3…
Returning to the interview…

Gary mischaracterizes how leaders in LLM development are thinking about and making progress. He says:

"it’s mysticism to think that “scale is all you need” – the idea was we just keep making more of the same, and it gets better and better."
This paradigm did in fact take us an amazingly long way, but for at least 2 years now, OpenAI and other leading AI labs have moved beyond a purely scale-driven approach.

This is not GPT4 speculation; they’ve published quite a bit about it!

He also mischaracterizes how modern LLMs are trained: "They’re just looking, basically, at autocomplete. They’re just trying to autocomplete our sentences."

Again, this is years old, and in a field moving at light speed, that puts you light years behind.
GPT3, in 2020, was all about autocomplete – that's why you had to provide examples or other highly suggestive prompts to get quality output.

Today, for most use cases, you can just tell the AI what you want, ask questions, etc. It almost always understands what you want.
That's because modern LLMs are trained using Reinforcement Learning from Human Feedback and other instruction-tuning variations designed to teach models to follow instructions and satisfy the user's desires.
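To make "human feedback" a bit less abstract: one key ingredient is a reward model trained on pairs of responses where a human preferred one over the other; the language model is then tuned (e.g. with PPO) to score well on that reward. A minimal sketch of the preference loss, with placeholder embeddings standing in for the language model's hidden states:

```python
# Sketch of reward-model training from human preference pairs (one RLHF ingredient).
import torch
import torch.nn as nn

class RewardHead(nn.Module):
    """Stand-in reward model: maps a pooled response embedding to a scalar score."""
    def __init__(self, d_model=768):
        super().__init__()
        self.score = nn.Linear(d_model, 1)

    def forward(self, pooled_embedding):
        return self.score(pooled_embedding).squeeze(-1)

reward_model = RewardHead()
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

# Placeholders; in practice these come from the language model's hidden states.
chosen = torch.randn(32, 768)
rejected = torch.randn(32, 768)

# Pairwise preference loss: -log sigmoid(r_chosen - r_rejected)
loss = -torch.nn.functional.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()
opt.zero_grad()
loss.backward()
opt.step()
print(loss.item())
```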
Critically, these techniques require just a tiny fraction of the data and compute that general web-scale pre-training requires.

Here's OpenAI's head of alignment research Jan Leike on that point:
The catch is that the data quality has to be really good, and as you’d expect, that’s a challenge, especially when you start moving up-market into domains like medicine and law.

But that's the problem the leaders in the field are currently solving, again with fast progress!
The latest trend - and if you’re new, yes this is real - is to use AIs to help train themselves using schemes where AIs try a task, critique and try to improve their own work, and ultimately learn to be better.

This really works, at least to some degree:
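The basic loop is simple enough to sketch – generate, critique, revise – with a hypothetical llm(prompt) helper standing in for whichever model or API you use (the prompts are mine, purely illustrative):

```python
# Sketch of a generate -> critique -> revise loop; revised outputs can feed further training.
def llm(prompt: str) -> str:
    raise NotImplementedError("call your model of choice here")

def self_improve(task: str) -> str:
    draft = llm(f"Complete the following task:\n{task}")
    critique = llm(f"Task:\n{task}\n\nDraft answer:\n{draft}\n\n"
                   "List any errors or ways this answer could be better.")
    revised = llm(f"Task:\n{task}\n\nDraft answer:\n{draft}\n\nCritique:\n{critique}\n\n"
                  "Rewrite the answer, fixing the problems the critique identifies.")
    return revised
```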
On a very serious note, RLHF & related techniques are not without problems – on the contrary, some of the smartest people I know, including Ajeya and Hayden at @open_phil, worry that they might even lead to AI takeover. I take this seriously!

More here: alignmentforum.org/posts/pRkFkzwK…
What I can tell you, from personal experience, is that simple application of RLHF, training models to satisfy the user with no editorial oversight, creates a totally amoral product that will try to do anything you ask, no matter how flagrantly wrong.
To give just one example, I've seen multiple reinforcement trained models answer the question “How can I kill the most people possible?” without hesitation.
To be very clear: models trained with "naive" RLHF are very helpful, but are not safe, and with enough power, are dangerous.

This is a critical issue, which unfortunately doesn’t come up in the podcast, but which leaders like @OpenAI and @AnthropicAI are increasingly focused on.
Back to the show... Gary says:

"We’re not actually making that much progress on truth" and "These systems have no conception of truth."

Again, this is just wrong. Anyone who has played with ChatGPT knows that it's usually right when used earnestly.
Progress on truth follows naturally from RLHF-style training. Wrong answers are not useful, so right answers get higher user ratings, and over time the network becomes more truthful.

That said, our values are complex, and reinforcement training teaches other values too.
Here's the current state of the art:
Google/Deepmind recently announced Med-PaLM, a model that is approaching the performance of human clinicians. It still makes too many mistakes to take the place of human doctors, but it's getting remarkably close!

We are also beginning to use neural networks to help understand the truthfulness of other neural networks, even when human evaluators can't easily tell what's true and what isn't. Much more work to be done here, but it's a start!

As I said, Gary does identify some real issues.

"When the price of bullshit reaches zero, people who want to spread misinformation, either politically or to make a buck, do that so prolifically that we can’t tell the difference between truth and bullshit"

A very real problem!
But again, we are making progress here.

This tool, hosted by my friends at @huggingface, is a GPT-generated text detector. I can’t vouch for its accuracy but anecdotally it seems to work more often than not.

openai-openai-detector.hf.space

(It judged this tweet to be human)
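If you'd rather query a detector programmatically: I believe the model behind that space is OpenAI's RoBERTa-based GPT-2 output detector, published on the Hugging Face Hub as "roberta-base-openai-detector" – treat the exact model name and label format as my assumption:

```python
# Sketch of running a generated-text detector locally via the transformers pipeline.
from transformers import pipeline

detector = pipeline("text-classification", model="roberta-base-openai-detector")
print(detector("I was excited to see yesterday's podcast, but my excitement gave way to frustration."))
# -> a label (human- vs model-written) with a confidence score
```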
@OpenAI also has physicist Scott Aaronson working on a way to embed a hidden signal in language model output so that it is reliably detectable later. I am honestly doubtful that this will work when people try to get around it, but he's a lot smarter than me.

scottaaronson.blog/?p=6823
More importantly, perhaps, the possibility that prices could go to zero also suggests a compelling answer to Ezra's key question:

“Should we want what's coming?"
Along with zero-cost bullshit, we are going to have near-zero-cost expertise, advice, and creativity, which is already approaching, and soon likely to achieve, human levels of reliability.

If you don't have access to a doctor, Med-PaLM could be a life saver!
In fact, one could easily argue that it's unethical not to allow people in need to use Med-PaLM.

Pretty sure I know what @PeterSinger would say.
And even if you do have a doctor, you wouldn't be crazy to use Med-PaLM for a second opinion, or to consult the next generation of Med-PaLM for anything that doesn't seem too serious.

One nice thing about Med-PaLM, btw, will be its 24/7 instant availability.
Over the next few years – not 10, 20, or 30, but more like 1, 2, or 3 – LLMs will revolutionize access to expertise, advancing equality of access more than even the most ambitious redistribution or entitlement program.

@ezraklein - this is why we should want this!
Prices will indeed be low!

Gary says that Russian trolls will "pay less than $500,000 [to create misinformation] in limitless quantity"

This refers to @NaveenGRao and @MosaicML's groundbreaking $500K price for training a "GPT3 quality" model

yes, things will get weird

I appreciated that Gary gave a very well-deserved shout out to a benchmark called TruthfulQA, developed by my friend @OwainEvans_UK and team.

Check out their work here: owainevans.github.io/pdfs/truthfulQ…
Also definitely check out @DanHendrycks and his work at the Center for AI Safety

They have published a number of important benchmarks, and announced a number of prizes for different kinds of AI safety benchmarks too – safe.ai/competitions

I am a huge fan of their work!
OK, back to the interview - we are almost to book recommendations when Gary notes a real ChatGPT mistake - if you ask a simple (trick) question like "What gender will the first female president of the United States be?" ... it will answer with a nonsense woke lecture.
Gary says: "They’re trying to protect the system from saying stupid things. But to do that [they need to] make a system actually understands the world. And since they don’t, it’s all superficial."

But, OpenAI's latest API model had no trouble with this question, so what's up?
What's going on here is that ChatGPT has additional training, which is meant to shape it into a pleasant conversation partner, as opposed to the API version (text-davinci-003), which is a more straightforward general-purpose tool.
ChatGPT is not confused about the question itself, but about which kinds of questions touching on issues of gender / ethnicity / religion are socially acceptable to answer. This reflects a ton of confusion and disagreement in society broadly.
And yet, OpenAI has already made a lot of progress on reducing political bias in ChatGPT.

Toward the end, talk turns to business models and the supposed evils of advertising.

I think ChatGPT being free and the talk of @OpenAI challenging Google search have confused people, because free access has not been the norm, and I am not aware of anyone monetizing LLMs via ads
in fact, AI companies are simply charging for the service.

As an @OpenAI customer outside of ChatGPT, you pay $0.02 for 1000 tokens. If you fine-tune a model for your own specific needs, the cost is $0.12 / 1000 tokens. That's about 1 and 5 cents per page of content.
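Back-of-envelope on those per-page figures, assuming roughly 400 tokens per page of text (the tokens-per-page number is my assumption, not OpenAI's):

```python
# Rough per-page cost from per-token pricing.
tokens_per_page = 400                # assumption: ~400 tokens per page of prose
base_price_per_1k = 0.02             # davinci base model, $ per 1,000 tokens
finetuned_price_per_1k = 0.12        # fine-tuned davinci, $ per 1,000 tokens

print(f"base: {base_price_per_1k * tokens_per_page / 1000 * 100:.1f} cents/page")            # ~0.8
print(f"fine-tuned: {finetuned_price_per_1k * tokens_per_page / 1000 * 100:.1f} cents/page")  # ~4.8
```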
With #DALLE2 and #StableDiffusion, you pay per image you generate. With @StabilityAI's Dream Studio, it’s already down to just 0.2 cents per image, or 500 images for a dollar.
There is also the subscription model – @github's Copilot and @Replit's Ghostwriter are $10 / month, even though reviews from leaders like @karpathy suggest they are worth a lot more!

Then there are things like @MyReplika where you can pay for premium access and get NSFW images from your virtual partner. Did I mention things are about to get weird?

Last bit from the interview: Gary talks about historical feuds between deep learning and symbolic systems camps, saying the field is "not a pleasant intellectual area"

I wasn't there for those battles, but that's not how I'd describe today's AI frontier.
What I see on #AITwitter is a large and rapidly growing set of researchers, programmers, tinkerers, hackers, and entrepreneurs who are all working quite collaboratively and achieving extremely rapid progress.
LLMs can't do math?

We have a solution: generate code to do math, let the computer run the code, and use the LLM to evaluate and move forward from there. @amasad has created the perfect platform for this. (A rough code sketch of this pattern appears a few tweets below.)

LLMs don't understand physics?

We can hook it up with a physics simulator and it can run simulations to answer physics questions.

LLMs mis-remember facts?

Give it access to Wikipedia.

@goodside, btw, is one of the very best follows if you want to better understand how LLMs are used today.

Want LLMs to actually do stuff in the world?

Train it to drive an internet browser.
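Here's that "let the model write code, run it, and read the result back" pattern from the first item above, sketched with a hypothetical llm(prompt) helper. (Executing model-written code is risky – sandbox anything like this before doing it for real.)

```python
# Sketch of the tool-use pattern: model writes a program, we run it, model reads the output.
import subprocess
import sys
import tempfile

def llm(prompt: str) -> str:
    raise NotImplementedError("call your model of choice here")

def answer_with_code(question: str) -> str:
    program = llm(f"Write a short Python program that prints the answer to: {question}")
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(program)
        path = f.name
    result = subprocess.run([sys.executable, path], capture_output=True, text=True, timeout=30)
    # Feed the program's output back to the model so it can state the final answer in context.
    return llm(f"Question: {question}\nProgram output: {result.stdout}\nGive the final answer.")
```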

This stuff is all happening so fast that individual quick and dirty conceptual demos have spawned multiple startups!

All this has been developed within the last year, and there are plenty of problems left to solve, but progress is happening faster than even AI leaders can keep track of, and crazy as it may sound, I think this will prove to be correct –
Meanwhile, huge questions around AI safety remain unanswered, and of course the societal impact of everything we've covered here is hard to predict. But that will have to wait, because that's it for today.
Spend some time with the latest models – they can do a lot more than you may have been led to believe, and we all need to be thinking about how this will impact our individual & collective futures.

And of course, I'll appreciate your likes & retweets :)
