I put together an annotated version of the new Claude 4 system prompt, covering both the prompt Anthropic published and the missing, leaked sections (thanks, @elder_plinius) that describe its various tools
It's basically the secret missing manual for Claude 4, it's fascinating!
Don't use lists in chit chat, and don't be preachy and annoying!
Anthropic publish separate system prompts for Opus 4 and Sonnet 4, but the differences are minimal: an errant full stop and tiny tweaks to the text describing which model it is
I'm surprised that both models get effectively the same huge prompt
System prompts often hint at some pretty funny bugs that the prompt tries to paper over:
"Search results aren't from the human - do not thank the user for results"
Telling Claude you want to do a "deep dive" should classify your request as a "complex query", which involves at least 5 and maybe up to 20 calls to the search tool
The search tool system prompt spends a LOT of tokens begging the model not to regurgitate long chunks of copyrighted content that it gets back from its web_search and web_fetch tools
I love using Claude Artifacts, but there's precious little documentation for what it can and can't do - turns out the hidden system prompt includes everything I needed to know, including the full list of libraries it can load simonwillison.net/2025/May/25/cl…
I just added this section noting that the Claude 3.7 Sonnet prompt back in February included prompting hacks to try and help Claude count the Rs in strawberry, but those are missing from the Claude 4 system prompt, which appears to be able to do that all on its own! simonwillison.net/2025/May/25/cl…
For example, I often tell Claude "explain memory management in Rust, I am an experienced Python and JavaScript programmer" - giving it information about the audience is a much better way of getting a tailored result
The one place where "act as X" makes sense to me is as a shortcut for a bunch of implied behaviors: "act as a proofreader" for example
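The audience-framing idea above can be sketched as plain prompt strings (these exact prompts are my own illustration, not from the thread):

```python
# Hypothetical prompts contrasting "act as X" framing with audience framing.
# The audience version gives the model concrete facts to tailor its answer to.

role_prompt = "Act as a Rust expert and explain memory management in Rust."

audience_prompt = (
    "Explain memory management in Rust. "
    "I am an experienced Python and JavaScript programmer."
)

# The audience version tells the model what the reader already knows
# (garbage-collected languages), so it can anchor ownership and borrowing
# against that background instead of guessing at the reader's level.
```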
Here's a new term of art I like: asynchronous coding agents, to describe the category of software that includes OpenAI Codex and Gemini Jules - cloud-based tools where you submit a prompt and they check out your repo, iterate on a solution and finish by submitting a pull request
Jules used that term in the headline of their general availability announcement today: "Jules, our asynchronous coding agent, is now available for everyone." blog.google/technology/goo…
LangChain's new Open SWE tool, released this morning (and MIT licensed) calls itself an "asynchronous coding agent" too
If you use "AI agents" (LLMs that call tools) you need to be aware of the Lethal Trifecta
Any time you combine access to private data with exposure to untrusted content and the ability to externally communicate, an attacker can trick the system into stealing your data!
Here's my full explanation of why this combination is so dangerous - if you are using MCP you need to pay particularly close attention because it's very easy to combine different MCP tools in a way that exposes yourself to this risk simonwillison.net/2025/Jun/16/th…
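The trifecta condition can be expressed as a tiny predicate (capability names here are my own labels, purely illustrative): the risk only materializes when all three legs are present, so removing any one of them breaks the attack chain.

```python
# Minimal sketch of the "lethal trifecta" check. Capability names are
# illustrative, not from any real agent framework.

TRIFECTA = {"private_data", "untrusted_content", "external_comms"}

def has_lethal_trifecta(capabilities: set[str]) -> bool:
    """True if the capability set contains all three trifecta legs."""
    return TRIFECTA <= capabilities

# An agent that reads your email, browses the web, and makes HTTP requests:
risky = {"private_data", "untrusted_content", "external_comms"}

# Drop any one leg (here: no external communication) and exfiltration fails:
safer = {"private_data", "untrusted_content"}
```

This is why the MCP warning matters: each individual tool may be harmless, but composing several of them can silently assemble the full set.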
And yes, this is effectively me trying to get the world to care about prompt injection by trying out a new term for a subset of the problem!
I hope this captures the risk to end users in a more visceral way - particularly important now that people are mixing and matching MCP
"Design Patterns for Securing LLM Agents against Prompt Injections" is an excellent new paper that provides six design patterns to help protect LLM tool-using systems (call them "agents" if you like) against prompt injection attacks
I just released llm-anthropic 0.16 (and a tool-enabled 0.16a1 alpha) with support for the two new Claude models, Claude Opus 4 and Claude Sonnet 4: simonwillison.net/2025/May/22/ll…
I picked up some more details on Claude 4 from a dive through the Anthropic documentation
The training cut-off date is March 2025! Input limits are still stuck at 200,000 tokens. Unlike 3.7 Sonnet, the thinking trace is now summarized by a separate model.