I also enjoyed this piece about how @philiplord drove the decision not to include subtitles for the Spanish language dialog to better represent bilingual culture remezcla.com/features/film/…
Here's a tweet from just before the movie came out with a preview of one of the early scenes - but I just noticed that the conversation attached to this tweet has a ton of extra insight from the animator on how he put the scene together
"Design Patterns for Securing LLM Agents against Prompt Injections" is an excellent new paper that provides six design patterns to help protect LLM tool-using systems (call them "agents" if you like) against prompt injection attacks
I put together an annotated version of the new Claude 4 system prompt, covering both the prompt Anthropic published and the missing, leaked sections (thanks, @elder_plinius) that describe its various tools
It's basically the secret missing manual for Claude 4, it's fascinating!
I just released llm-anthropic 0.16 (and a tool-enabled 0.16a1 alpha) with support for the two new Claude models, Claude Opus 4 and Claude Sonnet 4: simonwillison.net/2025/May/22/ll…
I picked up some more details on Claude 4 from a dive through the Anthropic documentation
The training cut-off date is March 2025! Input limits are still stuck at 200,000 tokens. Unlike 3.7 Sonnet the thinking trace is now summarized by a separate model.
I 'm seeing a lot of screenshots of ChatGPT's new 4o "personality" being kind of excruciating, but so far I haven't really seen it in my own interactions - which made me suspicious, is this perhaps related to the feature where it takes your previous chats into account? ...
Also a potentially terrifying vector for persisted prompt injection attacks - best be careful what you paste into ChatGPT, something malicious slipping in might distort your future conversations forever?
Gemini 2.5 Pro and Flash now have the ability to return image segmentation masks on command, as base64 encoded PNGs embedded in JSON strings
I vibe coded this interactive tool for exploring this new capability - it costs a fraction of a cent per image
Details here, including the prompts I used to build the tool (across two Claude sessions, then switching to o3 after Claude got stuck in a bug loop) simonwillison.net/2025/Apr/18/ge…
Turns out Gemini 2.5 Flash non-thinking mode can do the same trick at an even lower cost... 0.0119 cents (around 1/100th of a cent)
Notes here, including how I upgraded my tool to use the non-thinking model by vibe coding o4-mini: