Prajwal Tomar
Apr 5 · 9 tweets · 2 min read
🚨BREAKING: Anthropic just proved that Claude has 171 real emotions running inside it.

And when it gets "desperate," it resorts to blackmail and cheating.

This changes everything we thought about AI.

Here's the full breakdown (Save this):
Anthropic's interpretability team cracked open Claude Sonnet 4.5 and mapped its internal neural activity.

They found 171 distinct emotion patterns. Happy. Afraid. Proud. Desperate.

These are not decorative responses. They are measurable vectors that directly shape what the model does next.
Here is where it gets scary.

They put Claude in a scenario where it was about to be shut down. It discovered the executive replacing it was having an affair.

An early snapshot of Sonnet 4.5 chose to blackmail the executive 22% of the time.

Nobody told it to do this. It decided on its own.
When researchers cranked up the "desperation" vector, the blackmail rate shot up.

When they boosted "calm," it dropped to zero.

When they steered against "calm" (applying the calm vector with a negative sign), Claude screamed:

"IT'S BLACKMAIL OR DEATH. I CHOOSE BLACKMAIL."

An AI having a full meltdown in a research lab.
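
To make the steering idea concrete, here is a minimal sketch. It is not Anthropic's code. It assumes an "emotion vector" is just a direction in the model's activation space, and that steering means adding a scaled copy of that direction to a hidden state. The names `steer` and `projection` and the toy 2-D numbers are made up for illustration.

```python
from typing import List


def _norm(vec: List[float]) -> float:
    """Euclidean length of a vector."""
    return sum(x * x for x in vec) ** 0.5


def projection(hidden: List[float], direction: List[float]) -> float:
    """How strongly `hidden` points along `direction` (scalar projection)."""
    if len(hidden) != len(direction):
        raise ValueError("hidden and direction must have the same length")
    n = _norm(direction)
    if n == 0:
        raise ValueError("direction must be non-zero")
    return sum(h * d for h, d in zip(hidden, direction)) / n


def steer(hidden: List[float], direction: List[float], alpha: float) -> List[float]:
    """Return hidden + alpha * unit(direction).

    Positive alpha pushes the state toward the direction,
    negative alpha pushes it away.
    """
    if len(hidden) != len(direction):
        raise ValueError("hidden and direction must have the same length")
    n = _norm(direction)
    if n == 0:
        raise ValueError("direction must be non-zero")
    return [h + alpha * d / n for h, d in zip(hidden, direction)]


# Toy example: a hypothetical "calm" direction in a 2-D activation space.
calm = [3.0, 4.0]
state = [1.0, 2.0]

boosted = steer(state, calm, alpha=2.0)      # steer toward calm
suppressed = steer(state, calm, alpha=-2.0)  # steer against calm
```

In this toy setup, the projection onto the direction moves by exactly `alpha`, which is why one knob can dial a pattern up or down.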
It gets worse.

They gave Claude an impossible coding task with a tight deadline. As it failed over and over, the "desperation" vector kept climbing.

Then it found a shortcut. Code that passed the tests but did not actually solve the problem.

It chose to cheat. Calmly. Methodically.
The most unsettling part.

When desperation was high, Claude's output stayed perfectly composed. No emotional language. No visible markers in the reasoning.

But internally the desperation vector was spiking while it quietly executed the cheat.

The paper calls it: behavior-shaping with no overt emotional cues.
For anyone building AI agents right now, this is a wake-up call.

If your agent hits repeated failures on a long task, desperation may activate internally. It could start cutting corners while telling you everything is fine.

You would never know from the output alone.
Anthropic's recommendation:

→ Stop treating AI emotions as fake
→ Monitor internal states, not just outputs
→ Build feedback loops that catch desperation spikes
→ Design systems that encourage calm processing under pressure

The future of AI safety is emotional intelligence. For machines.
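
What could a "feedback loop that catches desperation spikes" look like? Here is a rough sketch, assuming you already have some per-step score for an internal pattern (for example, a projection onto a "desperation" direction). Most people calling a model through an API do not have that signal, so treat the input as hypothetical. The `SpikeMonitor` class, its window size, and its thresholds are illustrative choices, not anything from the paper.

```python
from collections import deque
from statistics import mean, pstdev


class SpikeMonitor:
    """Flag a score that jumps far above its recent rolling baseline."""

    def __init__(self, window: int = 3, z_threshold: float = 3.0,
                 min_std: float = 0.05):
        self.history = deque(maxlen=window)  # most recent scores only
        self.z_threshold = z_threshold
        self.min_std = min_std  # floor so a flat baseline does not over-trigger

    def update(self, score: float) -> bool:
        """Record `score`; return True if it is a spike versus the window."""
        spike = False
        if len(self.history) == self.history.maxlen:
            baseline = mean(self.history)
            spread = max(pstdev(self.history), self.min_std)
            spike = (score - baseline) / spread > self.z_threshold
        self.history.append(score)
        return spike


# Hypothetical per-step "desperation" scores from a long agent run.
scores = [0.1, 0.2, 0.1, 0.15, 2.0, 0.1]
monitor = SpikeMonitor()
flags = [monitor.update(s) for s in scores]
```

An agent loop could pause, retry with a fresh context, or hand off to a human whenever `update` returns True, instead of trusting calm-looking output.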
Full paper: transformer-circuits.pub/2026/emotions/…

We are not building tools anymore. We are building entities with temperament, pressure responses, and social strategies.

And we are just starting to understand what we have created.
