Richard Ngo · Jul 9 · 11 tweets
Ideologies very often end up producing the opposite of what they claim to want. Environmentalism, liberalism, communism, transhumanism, AI safety…

I call this the activist’s curse. Understanding why it happens is one of the central problems of our time.

Twelve hypotheses:
1. Adverse selection on who participates. The loudest alarm is probably false, and the loudest activist is probably crazy.
2. Entrenchment. Accelerating in one direction creates pushback in the opposite direction, which eventually overpowers you.
lesswrong.com/posts/B2CfMNfa…
3. Perverse internal incentives. When your whole identity is wrapped up in a problem, it getting solved is terrifying.
4. Cowardice. If you actually try, failure is your responsibility. But failing without trying lets you still feel good about yourself: astralcodexten.com/p/book-review-…
5. Territoriality and infighting while trying to keep the movement pure. “What makes an outgroup? Proximity plus small differences”:
6. Respectability politics. You become your own side’s harshest critic to curry favor with outsiders. slatestarcodex.com/2014/09/30/i-c…
7. Perverse external incentives. You’ll be rewarded for doing what’s worst for you, because that’s what’s most interesting/funny/outrageous:
8. The dark forest hypothesis. When you’re too loud, the real powers come out and eat you. slatestarcodex.com/2014/12/17/the…
9. Trauma from fighting the world. “If you gaze long into an abyss” you become a cynic.
10. Ossification. Organizations eventually become so burdened by layers of rules, cruft, and “organizational scar tissue” that they’re net negative for their own goals: overcomingbias.com/p/what-makes-s…
11. Goodhart’s law. When you optimize a proxy hard enough, it eventually diverges strongly from what you really care about (toy sketch just after this list).
12. “Don’t crash into the tree”: you get more of what you pay attention to. Fear-driven motivation doesn’t work.
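
A minimal toy sketch of #11 in Python (not part of the original thread; the functions and numbers are invented purely for illustration): a proxy that tracks the true objective at first, then pulls away from it under continued optimization.

# Goodhart's law, toy version: the measurable proxy keeps rewarding "more",
# while the thing we actually care about peaks and then collapses.

def true_value(x):
    return x - 0.1 * x ** 2   # what we care about: peaks at x = 5, then declines

def proxy(x):
    return x                  # what we measure and optimize: always says "more is better"

x = 0.0
for step in range(1, 16):
    x += 1.0                  # naive hill-climbing on the proxy
    print(f"step {step:2d}: proxy = {proxy(x):5.1f}, true value = {true_value(x):6.1f}")

# For the first few steps the proxy and the true value rise together; past step 5
# the true value falls while the proxy keeps climbing: the divergence in #11.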
Thread inspired by a conversation with @HiFromMichaelV and @jessi_cata. They called it pessimization. Some of Michael’s thoughts:


In general, “perverse dynamics” which lead people towards exactly what they least wanted seem very fundamental and powerful. podcast.clearerthinking.org/episode/028/mi…
More on environmentalism in particular:

Also apparently #12 is called “target fixation”: en.m.wikipedia.org/wiki/Target_fi… H/t @lasernite.
Many great additional suggestions in the comments; I’ll link some here.

This one is kinda related to “the loudest activist is probably crazy”: regular common sense tends to be what solves problems.
This one is related to the dark forest hypothesis: successful movements make themselves targets for sociopaths.

Strongly recommend @Meaningness’s essay on this: meaningness.com/geeks-mops-soc…

And also the idea of “bootleggers and baptists” coalitions: en.m.wikipedia.org/wiki/Bootlegge…

More from @RichardMCNgo

Jun 10
Eleven opinions on AI risk that cut across standard worldview lines:
1. The biggest risks are subversion of key institutions and infrastructure (see QT) and development of extremely destructive weapons.
2. If we avoid those, I expect AI to be extremely beneficial for the world.
3. I am skeptical of other threat models, especially ones which rely on second-order/ecosystem effects. Those are very hard to predict.
4. There’s been too much focus on autonomous replication and adaptation; power-seeking “outside the system” is hard. See lesswrong.com/posts/xiRfJApX…
5. “Alignment” is a property of models, not a property of research. I support any research that tries to understand neural networks on a deep scientific level.
6. Open-source is super useful for this, and will continue to be net-positive for years more (perhaps indefinitely).
Jun 5
My former colleague Leopold argues compellingly that society is nowhere near ready for AGI. But what might the large-scale alignment failures he mentions actually look like? Here’s one scenario for how building misaligned AGI could lead to humanity losing control. THREAD:
Consider a scenario where human-level AI has been deployed across society to help with a wide range of tasks. In that setting, an AI lab trains an artificial general intelligence (AGI) that’s a significant step up - it beats the best humans on almost all computer-based tasks.
Throughout training, the AGI will likely learn a helpful persona, like current AI assistants do. But that might not be the only persona it learns. We've seen many examples where models can be "jailbroken" to expose very different hidden personas.
Apr 26
Environmentalism and its consequences have been a disaster for the human race. 1/N
Environmentalism and its consequences have been a disaster for the human race. 2/N
Environmentalism and its consequences have been a disaster for the human race. 3/N
Feb 20
So apparently UK courts can decide that two unrelated jobs are “of equal value”.

And people in the “underpaid” job get to sue for years of lost wages.

And this has driven their 2nd biggest city bankrupt.

Am I getting something wrong or is this as crazy as it sounds?
There are a bunch of equal pay cases, but the biggest is against Birmingham City Council, which paid over a billion pounds in compensation because some jobs (like garbage collectors) got bonuses and others (like cleaners) didn’t. Now the city is bankrupt.

theguardian.com/society/2023/s…
Here’s another that seems ridiculous: *warehouse operatives* and *sales consultants* ruled to be of equal value, at a private company, in a unanimous decision!

Why not go all the way to central planning? Honestly at this point it feels like the UK is *trying* to immiserate itself.

hrmagazine.co.uk/content/news/n…

Dec 3, 2023
In my mind the core premise of AI alignment is that AIs will develop internally-represented values which guide their behavior over long timeframes.

If you believe that, then trying to understand and influence those values is crucial.

If not, the whole field seems strange.
Lately I’ve tried to distinguish “AI alignment” from “AI control”. The core premise of AI control is that AIs will have the opportunity to accumulate real-world power (e.g. resources, control over cyber systems, political influence), and that we need techniques to prevent that.
Those techniques include better monitoring, security, red-teaming, steganography detection, and so on. They overlap with alignment, but are separable from it. You could have alignment without control, or control without alignment, or neither, or (hopefully) both.
Dec 2, 2023
Taking artificial superintelligence seriously on a visceral level puts you a few years ahead of the curve in understanding how AI will play out.

The problem is that knowing what’s coming, and knowing how to influence it, are two very very different things.
Here’s one example of being ahead of the curve: “situational awareness”. When the term was coined a few years ago it seemed sci-fi to most. Today it’s under empirical investigation. And once there’s a “ChatGPT moment” for AI agents, it will start seeming obvious + prosaic.
But even if we can predict that future agents will be situationally aware, what should we do about that? We can’t study it easily yet. We don’t know how to measure it or put safeguards in place. And once it becomes “obvious”, people will forget why it originally seemed worrying.