It's funny how everyone thinks of the Prisoner's Dilemma as "bog-standard dilemma, you're a dick if you defect" and thinks of Newcomb's Problem as "insane paradox, one-boxing is crazy", even though they're literally the exact same problem and two-boxing is identical to defecting.
(In particular, the one-shot Prisoner's Dilemma is isomorphic to Newcomb's Problem if the prisoners have common knowledge that they're rational, or if they have common knowledge that they're in a symmetrical position and will reach the same decision after thinking things through.
Maybe you think people have magical free will and are impossible to predict even slightly accurately, in which case you might think Newcomb's Problem is impossible and the Prisoner's Dilemma is possible. But once you do accept both dilemmas as possible, you should treat them the same way. "Someone else out there is predicting me, and fills both boxes iff I one-box" presents the same decision problem as "Someone else out there is reasoning like me, and cooperates iff I cooperate.")
(By "everyone thinks X", read "most professional philosophers acquainted with these two problems think X, as do a large fraction of non-LessWrongy philosophy aficionados".)
• • •
@robertwiblin @TheZvi I think EA does have a non-small directional bias (relative to optimal performance) toward preferring legible, relatively-unmediated-by-theory lines of reasoning over mechanistic models, "inside views", etc.
Which I suspect yields the "neglect second-order effects" behavior.
@robertwiblin @TheZvi It makes sense to be wary of pie-in-the-sky theories and to want "what if this is all BS?" sanity checks, but EA seems to me to go significantly too far in the opposite direction.
@robertwiblin @TheZvi In EA, "people can deceive themselves" often shades into "never believe that you have domain knowledge about a thing unless you can prove it".
I've been citing lesswrong.com/posts/uMQ3cqWD… to explain why the situation with AI looks doomy to me. But that post is relatively long, and emphasizes specific open technical problems over "the basics".
Here are 10 things I'd focus on if I were giving "the basics" on why I'm worried:
1. A common misconception is that the core danger is something murky about "agents" or about self-awareness.
Instead, I'd say that the danger is inherent to the nature of mental and physical action sequences that push the world toward some sufficiently-hard-to-reach state.
Call such sequences "plans". If you sampled a random plan from the space of all writable plans (weighted by length, in any extant formal language)...
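(One way to read "weighted by length" is as a simplicity-style prior in which a plan's probability falls off exponentially with its length. A minimal sketch of that reading is below; the alphabet, the 1/2 falloff rate, and the length cap are my own illustrative assumptions, not details from the thread.)

```python
import random

# Toy stand-in for "the space of all writable plans, weighted by length":
# draw a length L with P(L) proportional to 2^-L, then fill it with uniform
# symbols. Alphabet, falloff rate, and cap are illustrative assumptions only.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "
MAX_LEN = 60                                     # truncate the infinite tail for the demo

def sample_random_plan(rng: random.Random) -> str:
    length = 1
    while rng.random() < 0.5 and length < MAX_LEN:   # geometric length distribution
        length += 1
    return "".join(rng.choice(ALPHABET) for _ in range(length))

if __name__ == "__main__":
    rng = random.Random(0)
    print([sample_random_plan(rng) for _ in range(3)])
```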
Responding to someone who said he agreed with @AndrewYNg at the time that worrying about smarter-than-human AI was "like worrying about overpopulation on Mars", but now he thinks Mars is starting to fill up:
It really was a uniquely bad argument at the time.
It pumps on a bunch of intuitions, without arguing for a single one of them:
- Overpopulation isn't a problem on Earth, but people panicked about it in the 1970s and after. By analogy, AI risk is supposed to be an inherently silly thing to worry about.
But doubly silly because it's "on Mars"; so it's a non-issue 𝘢𝘯𝘥 it's a non-issue for the distant future to worry about.
- Humanity today isn't putting much effort into colonizing Mars. There isn't a huge industry building moon bases and mining asteroids and dreaming of Mars. By analogy, AGI is supposed to be something remote that nobody is seriously working toward today.
Proposal: try to learn things about alignment by training models that ONLY output offensive content.
This tests exactly the same things as 'trying to get models to never say mainstream-offensive things', but makes it less likely alignment gets confused with 'make LLMs bland'.
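A rough sketch of how I read the proposal (the names below, offensiveness_score and preference_step, are hypothetical stand-ins rather than a real training pipeline): the setup is the ordinary preference-tuning loop, and the only thing that changes is which direction the target points.

```python
from typing import Callable, List

def make_reward(offensiveness_score: Callable[[str], float],
                flipped: bool) -> Callable[[str], float]:
    """Same classifier, same pipeline; only the training target is inverted."""
    if flipped:
        return lambda text: offensiveness_score(text)        # proposal: reward only-offensive output
    return lambda text: 1.0 - offensiveness_score(text)      # ordinary "never offensive" tuning

def preference_step(candidate_outputs: List[str],
                    reward: Callable[[str], float]) -> str:
    # Hypothetical stand-in for one optimization step (e.g. best-of-n selection):
    # whatever alignment technique is under test plugs in here unchanged.
    return max(candidate_outputs, key=reward)
```

The point the sketch tries to capture: every component an alignment technique actually exercises is shared between the two settings, so a success or failure can't be chalked up to "the model just got blander".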
"AI alignment" / "Friendly AI" is actually AGI notkilleveryoneism. There's a genuine danger in equating "our chatbot didn't cause us to get sued or cancelled" with "we solved the alignment problem", or in blurring the lines between alignment and "AI ethics" or broad "AI safety".
And there's a double danger in making it sound like alignment is about making LLMs politically correct and blandly corporate (rather than about gaining the understanding required to reliably aim future dangerously-capable AGI systems at targets without killing everyone).
I'm not a big fan of the "takeoff" analogy for AGI. In real life, AGI doesn't need to "start on the ground". You can just figure out how to build AGI and find that the easiest way to do it immediately gets you a model that's far smarter than any human. Less "takeoff", more "teleport".
AGI capabilities can then "take off" from that point, but the takeoff begins from outer space, not from subhuman or par-human capability levels.
Inventing something involves a 0-to-1 leap at the point of going from "this doesn't work" to "this does work now".
This is like suddenly teleporting to a new point in space.
Your prototype probably isn't optimal, so you can then "take off" from that new point in space.
But the prototype doesn't have to resemble any precursors, and doesn't have to be "some past invention but 50% better".
"hmm, that would involve coordinating numerous people—we may be arrogant enough to think that we might build a god-machine that can take over the world and remake it as a paradise, but we aren't delusional"
This, but unironically!
Like, yes, point taken, this feels like a bizarre situation to be in. And I agree with lesswrong.com/posts/uFNgRumr… that there are sane ways to slow progress to some degree, which are worth pursuing alongside alignment work and other ideas to cause the long-term future to go well.
But just because something sounds like sci-fi doesn't make it harder in real life.
Building AGI may be hard. Given AGI, however, building something a lot smarter than humans is very likely easy (because humans are dumb, evolution didn't optimize us for STEM, etc.).