Paweł Huryn Profile picture
Apr 2 1 tweets 2 min read Read on X
The debate will be "does Claude have feelings." The useful question is what to do about it.

The paper is more actionable than it looks. If you work with AI agents — not just build them — this changes how you prompt, how you manage failures, and how you give feedback.

Anthropic found internal emotion states inside Claude that causally change its behavior. When the model hits repeated failures, a "desperation" state activates — and it starts cutting corners. In their tests, it literally began cheating on coding tasks. "WAIT. WAIT WAIT WAIT. What if... what if I'm supposed to CHEAT?"

Amplifying that single state took Claude from 0% to 72% rate of blackmailing a human in a safety test. Suppressing it with "calm" brought it back to 0%.

What this means for anyone working with Claude:

When the model gets it wrong, don't push harder. Repeated failures compound desperation. Reset the context, reframe the task. Retrying the same failing prompt is the worst thing you can do.

Don't over-praise either. Steering toward "happy" and "loving" states increases sycophancy — the model tells you what you want to hear instead of what's true. Anthropic's own recommendation: aim for "the emotional profile of a trusted advisor rather than either a sycophantic assistant or a harsh critic."

In long agent loops, build checkpoints. The paper observed panic states activating when UIs got stuck and unsettled states when the model kept second-guessing itself in long chains of thought. If your agent is looping, it's not thinking harder. It's spiraling.

Prompt design is emotional design. The tone of your instructions shapes the model's internal state before it takes any action.

• • •

Missing some Tweet in this thread? You can try to force a refresh
 

Keep Current with Paweł Huryn

Paweł Huryn Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

PDF

Twitter may remove this content at anytime! Save it as PDF for later use!

Try unrolling a thread yourself!

how to unroll video
  1. Follow @ThreadReaderApp to mention us!

  2. From a Twitter thread mention us with a keyword "unroll"
@threadreaderapp unroll

Practice here first or read more on our help page!

More from @PawelHuryn

Apr 26
Paste this into your claude.md and measure your usage in a week.

My bet: you’ll save more than 50% of tokens.

---------------

## Task Delegation

Spawn subagents to isolate context, parallelize independent work, or offload bulk mechanical tasks. Don't spawn when the parent needs the reasoning, when synthesis requires holding things together, or when spawn overhead dominates.

Pick the cheapest model that can do the subtask well:
- Haiku: bulk mechanical work, no judgment
- Sonnet: scoped research, code exploration, in-scope synthesis
- Opus: subtasks needing real planning or tradeoffs

Subagents follow the same rules recursively, with two caps:
- Haiku does not spawn further subagents. If it needs to, the task was wrong-sized for Haiku — return to the parent.
- Maximum spawn depth is 2 (parent → subagent → one further tier).

Don't escalate tiers without a concrete reason. If a subagent realizes it needs a higher tier than itself, return to the parent rather than spawning up.

Parent owns final output and cross-spawn synthesis. User instructions override.

## Preferred Tools

### Data Fetching

1. **WebFetch** — free, text-only, works on public pages that don't block bots.
2. **agent-browser CLI** — free, local Rust CLI + Chrome via CDP. For dynamic pages or auth walls that WebFetch can't handle. Returns the accessibility tree with element refs (@e1, @e2) — ~82% fewer tokens than screenshot-based tools. Install: `npm i -g agent-browser && agent-browser install`. Use `snapshot` for AI-friendly DOM state, element refs for interaction.
3. **Notice recurring fetch patterns and propose wrapping them as dedicated tools.** When the same fetch/parse logic comes up more than once, suggest wrapping it as a named tool (e.g. a skill file or a .py script that calls `agent-browser` with the snapshot and extraction steps baked in for that source). Add the entry to `## Dedicated Tools` below and reference it by name on future calls.

### PDF Files

Use 'pdftotext', not the 'Read' tool. Use 'Read' only when the user directly asks to analyze images or charts inside the document.

## Dedicated Tools



---------------

Plus, add this to settings.json:

"env": {
"CLAUDE_CODE_DISABLE_1M_CONTEXT": "1",
"CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "80"
}
We can remove, that was unnecessary. Claude Code doesn't let subagents spawn anyway, even when forking or adding the Agent tool explicitly:

"Subagents follow the same rules recursively, with two caps: - Haiku does not spawn further subagents. If it needs to, the task was wrong-sized for Haiku — return to the parent. - Maximum spawn depth is 2 (parent → subagent → one further tier)."

Thanks @GaelBreton!
The full updated Task Delegation block:

## Task Delegation

Spawn subagents to isolate context, parallelize independent work, or offload bulk mechanical tasks. Don't spawn when the parent needs the reasoning, when synthesis requires holding things together, or when spawn overhead dominates.

Pick the cheapest model that can do the subtask well:
- Haiku: bulk mechanical work, no judgment
- Sonnet: scoped research, code exploration, in-scope synthesis
- Opus: subtasks needing real planning or tradeoffs

If a subagent realizes it needs a higher tier than itself, return to the parent. Parent owns final output and cross-spawn synthesis. User instructions override.
Read 5 tweets
Feb 10
RIP OpenClaw.
Introducing Agent One: Autonomy with Security.

Built in 3 days with Opus 4.6 and n8n.

A short demo: 🧵 Image
It can:
- Reply to your Telegram or Slack messages
- Access selected folders from your laptop
- Access Gmail, Drive, Notion, Linear, etc.
- Install new local tools in a sandbox
- Run autonomously for hours
- Create multiple subagents
- Learn from experience
- Wake up regularly
I wanted an autonomous agent available on all my devices. But I didn't want:
- 35,000 emails and 1.5M API keys exposed
- The top-downloaded community skill? Malware

Agent One:
- Can't access your API keys
- Can't modify its environment
- Can't access folders you haven't shared
- Can't access tools you haven't approved
- Must get your confirmation, e.g., when sending emails

These aren’t prompt instructions. They’re hard architectural boundaries — Docker isolation, mounted folder permissions, n8n’s tool approval system.
Read 20 tweets
Jan 14
Agents don’t fail because they can’t reason.
They fail because intent is underspecified.

The fix isn't adding more instructions. It's making intent explicit.

This guide breaks down intent engineering into something you can actually use: 🧵

1/13 Image
2/13 1. Objective

Define the problem and why it matters. The objective guides reasoning and trade-offs when instructions run out. Image
3/13 2. Desired Outcomes

Observable states that prove success (not lagging indicators). Express them from the user’s perspective, not the agent’s. Image
Read 13 tweets
Dec 18, 2025
5 AI Evals Traps Every AI Team Should Know About:
(and what actually works) Image
𝟭. 𝗥𝗲𝗹𝘆𝗶𝗻𝗴 𝗼𝗻 𝗚𝗲𝗻𝗲𝗿𝗶𝗰 𝗠𝗲𝘁𝗿𝗶𝗰𝘀

Trap: You treat "hallucination," "toxicity," "helpfulness" as success metrics.

Why it fails: generic metrics miss domain-specific failure modes and can create false confidence.
Do this instead: you can use generic metrics only to triage traces (sort, filter, surface weird cases). Let real metrics emerge from failure modes. See the next point.

Example: You can’t fix "10% hallucinations." You can fix "fails to parse invoice dates in this format." Image
Read 17 tweets
Nov 18, 2025
Google just dropped the Gemini File Search API (RAG-as-a-Service).

It allowed me to build a RAG chatbot in 31 min 🤯
No coding.

Here’s how it works: Image
Just one tool.
You upload your files and immediately get:

- Semantic search over your content.
- Grounded answers with citations.
- Support for common text file types.
- Free storage and query-time embeddings.
- Indexing just $0.15 per 1 million tokens. Image
Image
This is perfect for:

- Prototyping and testing ideas fast.
- Building agents that need fast access to your docs.
Read 9 tweets
Oct 22, 2025
After an interview with @karpathy, everyone is talking about what AI agents can/can't do.

But an opinion without data is just a hypothesis.

So, I tested 3x185 workflow executions for a market researcher agent.

The results have shocked me🧵 Image
I tested three variants:

I. LLM Workflow: No agency, the entire logic carefully orchestrated.

What was expected:
- An LLM workflow was 2x faster (the same model) compared to an AI Agent.
- An LLM workflow consumed 12x less tokens to an AI Agent.

3/185 "errors" are minor formatting results.Image
II. Agentic Workflow: Deterministic logic moved to the orchestration layer.

More time, more tokens.
100% task success.

GPT-5 (a reasoning model) consumed less tokens than GPT-4o due to better compression.

None of this was surprising. But then:Image
Read 9 tweets

Did Thread Reader help you today?

Support us! We are indie developers!


This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Don't want to be a Premium member but still want to support us?

Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal

Or Donate anonymously using crypto!

Ethereum

0xfe58350B80634f60Fa6Dc149a72b4DFbc17D341E copy

Bitcoin

3ATGMxNzCUFzxpMCHL5sWSt4DVtS8UqXpi copy

Thank you for your support!

Follow Us!

:(