Tweet

Jonathan Zittrain

@zittrain

May 30 • 43 tweets • 18 min read Twitter logo

https://twitter.com/DanHendrycks/status/1663474795865059329?s=20

Today, a crisp one-sentence open letter warning about existential AI threat: “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”

I did not sign the letter.

https://twitter.com/DanHendrycks/status/1663474795865059329?s=20

A lot of smart, thoughtful, genuinely brilliant colleagues have signed the letter, and it follows on earlier alarms about AI, including one from February worried that “Advanced AI could represent a profound change in the history of life on Earth ... .” nytimes.com/2023/03/29/tec…

At the risk of being the blockhead in the first ten minutes of the horror movie who confidently heads to the basement to show that the strange noises are entirely normal, I thought I’d explain why I didn't sign.

@ylecun

The February letter garnered over 1,800 signatories, including Elon Musk, though in a sign of our times an indeterminate number of signatures (including those of Xi Jinping and Meta chief AI scientist @ylecun) turned out to be fake. theguardian.com/technology/202…

That letter dramatically called for a six-month moratorium -- voluntary or if necessary by law (!) -- on further AI large language model development. To be sure, several weeks after calling for the pause, Musk appears to have gone all in on AI development. businessinsider.com/elon-musk-twit…

Existential risk from AI is not a new concept; imagining machines (or non-human humans more generally) that turn on us has a rich history.

(The quote below is from the December 1948 issue of TIME about the Mark III.)

Traditionally the lurid tools of our extinction have been only within the reach of a powerful few – say, those with the know-how, enriched uranium, and delivery vehicles for nuclear warheads; or, as some tell it, those who built and operate supercolliders. (!)

(So far CERN hasn’t created a black hole that’s devoured the Earth, or emitted a strangelet to turn it into a “shrunken dense dead lump of … strange matter”; the quickly-dismissed lawsuit against it called for an environmental impact statement that would have been for the ages.)

Signatories no doubt have different accounts of AI’s “existential risk.” Many would blanch at a Terminator-style story of killer robots. Some see AI achieving “superintelligence” – becoming smarter than humans are, individually and even collectively. en.wikipedia.org/wiki/Superinte…

Here's OpenAI's own description of superintelligence from last week. It's perhaps necessarily vague on what that is and how it might come about. It's clearer that the authors don't think today's AI systems count, so can develop more freely. openai.com/blog/governanc…

If you’d like the full Terminator argument, Holden Karnofsky’s essay from June 2022 is a good place to start. But it also starts by more or less assuming the premise of hostile superintelligence(s): cold-takes.com/ai-could-defea…

@mattyglesias

And @mattyglesias, writing a year ago -- ages in AI time -- is very much down with the Terminator analogy, because it communicates a justified fear of AGI to the public, even if its details don't happen to track what the experts worry about. slowboring.com/p/the-case-for…

But I've been brought up short by some of some arm-waving around how a superintelligence emerges: if AIs are getting better (“smarter”) over time, deep questions of what “smart” means notwithstanding, the argument is that it’s just a matter of time before they surpass us.

Before LLMs like GPT came about, the arguments about "better" were sometimes couched in analogies between processor power and brainpower, and how the former could overwhelm the latter, even as raw processors alone don't make minds, any more than a pile of brains do.

Particularly if AIs start coding their own successors – in theory, they very quickly level up. But AI implementations run the gamut. The first time we’ve really seen something that acts like a colloquial artificial intelligence has been w/“large language models” like GPT.

It’s amazing what’s come of pouring billions of fragments of humanity’s words into a big pot, performing several hundred millions of dollars’ worth of computational stirring, and then making some refinements through Q&A with the resulting model (“RLHF”). washingtonpost.com/technology/int…

Like many, the first time I tried out GPT-3 I was floored. Wait, this thing passes the Turing Test! Just like that! (Sure, the Turing Test is flawed, but still…) amazon.com/Turing-Test-Be…

Some of its most impressive characteristics might look modest. If I start it off with questions in regular case and ANSWER IN CAPS it will “know” to keep going in that style, despite no explicit training or code around upper and lower cases.

https://twitter.com/stanislavfort/status/1639731204307005443?s=20

And it even seems capable of cognition at times – quite a leap when it’s trained as an unsupervised “auto-regressive” model, i.e. simply predicting what tokens and words come next in a sentence.

https://twitter.com/stanislavfort/status/1639731204307005443?s=20

@MelMitchell1

(@MelMitchell1 has written an informative and accessible overview of evaluating reasoning in certain AI models.) aiguide.substack.com/p/why-the-abst…

https://twitter.com/dioscuri/status/1639068338520494080?s=20

If people can’t explain how GPT approximates some form of cognition in some cases – and for most meanings of testable explanation, they can’t – then it’s awfully hard to know how much better GPT can get in version 5 or 6 or 7.

https://twitter.com/dioscuri/status/1639068338520494080?s=20

https://twitter.com/TaliaRinger/status/1659370809042014208?s=20

Yet then GPT turns out to be lousy at something easy (for now), in ways that suggest that the way these models work is not, in fact, much like human cognition. (We're don't achieve thought thanks to reading every word of Reddit and … everything else.)

https://twitter.com/TaliaRinger/status/1659370809042014208?s=20

@ylecun

(@ylecun has written about the difference between existing model architectures and one that might recreate the sort of common sense that we take for granted in humans. openreview.net/pdf?id=BZ5a1r-…)

https://twitter.com/MovingToTheSun/status/1625156575202537474?s=20

It’s a strange moment to have chatbots that are so unbelievably good, head and shoulders above what came before, and also so clearly innately limited – geared for coherence rather than truth. What a time to be alive – and what a time to be not alive!

https://twitter.com/MovingToTheSun/status/1625156575202537474?s=20

@kevinroose

When Microsoft released a version of GPT within Bing, veteran tech columnist @kevinroose was blown away – well, more precisely, deeply creeped out. Bing, a.k.a. “Sydney,” gave 2001’s HAL-9000 a run for its money. nytimes.com/2023/02/16/tec…

https://twitter.com/fabianstelzer/status/1638506765837914114?s=20

I still can’t get over the fact that one of the few ways the bot-maker has of getting these bots to be nice is to … tell them to be nice, before turning the same mic over to a user, who can try to belay those orders.

https://twitter.com/fabianstelzer/status/1638506765837914114?s=20

Which brings us back to how these bots could appear to do harm: by a person simply asking them how best to do it, and by their trying to oblige by predictive-texting their way to a retread of a Terminator or similar script knocking around inside. vice.com/en/article/93k…

But it gets worse! In a development that an earlier me would have cheered, within the past month OpenAI added plug-ins to GPT – ways for it to not only answer text with text (or text with code), but to … run that code. Or operate OpenTable or Instacart. openai.com/blog/chatgpt-p…

GPT plug-ins are hugely generative, and it seems there’s no end of both cool and uncool things that people will do with them. Like the Internet itself! For the Internet, I’ve long argued that the good has outweighed the bad. en.wikipedia.org/wiki/Generativ…

OpenAI hopes to shape plug-ins to limit bad real-world impacts across the blood-brain barrier from the online playground. But imagine GPT being prompted to pull off a bomb scare – using Craigslist to do it. (Real story; AI piece only a thought experiment!) masslive.com/news/2023/05/f…

This sort of scenario, with later generations of LLMs both less monitored and more powerful and connected, worries me a lot. An AI doesn’t have to think for itself, contemplating escaping humans’ control, to do (or to be instructed to do) bad things. en.wikipedia.org/wiki/Instrumen…

Especially in a possible world where AI models are open sourced and run on laptops (old me would cheer), some wise and careful practices are needed to try to ensure they can’t so readily connect to make things just happen in the real world. semianalysis.com/p/google-we-ha…

But strangely, this seems just the kind of scenario that the OpenAI team does *not* so much worry about -- it's not in scope for existential risk from superintelligence.

And the superintelligence that *is* in scope has upsides that militate towards (carefully) building it, too.

@verityharding

As @verityharding points out on @Samfr's blog, though, it's the prominence of ChatGPT that lends talk of AI risk oxygen rn -- even though the risk, when described by those concerned, appears to be separate from GPT, and not concretely articulated beyond "human+ intelligences."

@timnitGebru

This open reply to the February AI risk letter by @timnitGebru, @emilymbender, @mcmillan_majora, and @mmitchell_ai also gets at this issue. dair-institute.org/blog/letter-st…

@random_walker

And this essay by @random_walker and @sayashk, also responding to the February open letter on AI risks, offers similar reasons to hold back. aisnakeoil.substack.com/p/a-misleading…

And even if we can walk and chew gum at the same time – worrying about the speculative risks as well as the ones right in front of us like bias and misuse – placing AI tools into the ranks of nuclear and biological weapons of mass destruction jumps the gun.

@afedercooper

(@afedercooper points out that climate change is conspicuously missing; perhaps an awkward artifact of intense computing's contribution to it; a desire to avoid an ancillary topic under controversy; or an implicit claim that its massive displacement and suffering ≠ existential.)

It jumps the gun because the problem is far more ill-defined -- except in projection of awful consequences -- than the workings of nuclear or biological proliferation, and the remedies to prevent "too much computing" at that register are, well, extreme. time.com/6266923/ai-eli…

From a legal-policy perspective, the regulatory perimeter is unbounded if enough computers (including virtual ones like those on computational blockchains like Ethereum) amount to a bunch of highly enriched uranium. If everything must be regulated, worldwide, nothing will be.

@iaeaorg

That's a lot harder than regulating uranium, and is why analogies to orgs like the @iaeaorg don't quite work for me yet. The IAEA has a distinct membership -- sovereign states -- and remit, including keeping an eye on nuclear facilities with the assent of those who operate them.

https://twitter.com/fchollet/status/1640896398311968773

So, as François Chollet succinctly put it after the February letter:

https://twitter.com/fchollet/status/1640896398311968773

Of course, in transformative tech it seems like there are only two phases in evaluating risk: too early to tell, and too late to do anything about it.

• • •

Missing some Tweet in this thread? You can try to force a refresh

Share this page!

Enter Twitter Thread URL to Unroll

Jonathan Zittrain

People who liked this thread also liked...

Try unrolling a thread yourself!

More from @zittrain

Jonathan Zittrain

Jonathan Zittrain

Jonathan Zittrain

Jonathan Zittrain

Jonathan Zittrain

Jonathan Zittrain

Did Thread Reader help you today?

Don't want to be a Premium member but still want to support us?

Send Email!