I also enjoyed this piece about how @philiplord drove the decision not to include subtitles for the Spanish-language dialog to better represent bilingual culture remezcla.com/features/film/…
Here's a tweet from just before the movie came out with a preview of one of the early scenes - but I just noticed that the conversation attached to this tweet has a ton of extra insight from the animator on how he put the scene together
Gemini 2.0 Flash is now available via their API (multi-modal input and text output only for the moment) - I just released a new llm-gemini plugin version to support the new model
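Here's a minimal sketch of calling the new model through llm's Python API, assuming llm-gemini is installed and a key has been set with llm keys set gemini - the model ID "gemini-2.0-flash-exp" is my guess at the alias the new plugin version registers, so check the plugin README for the exact name:

import llm

model = llm.get_model("gemini-2.0-flash-exp")
# Multi-modal input: attach an image alongside the text prompt
response = model.prompt(
    "Describe this image in one sentence",
    attachments=[llm.Attachment(path="photo.jpg")],
)
print(response.text())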
A weird and interesting thing about the new ChatGPT Canvas mode is that it can run Python in an entirely different way from the existing Code Interpreter - using Pyodide (Python in WASM) - which means it can make network requests from Python now!
I wrote more about this here, and how it weirdly continues the trend of chat-based LLM systems getting harder to truly master as they add more features and capabilities simonwillison.net/2024/Dec/10/ch…
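To illustrate the Pyodide difference: Pyodide routes HTTP through the browser, so a sketch like this works where Code Interpreter's sandbox has no network access at all (this only runs inside a Pyodide environment):

from pyodide.http import open_url

# open_url performs a synchronous GET via the browser and returns
# an io.StringIO containing the response body
html = open_url("https://example.com/").read()
print(html[:200])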
This sounds a lot more exciting than it is - o1 didn't have the ability to do these things, but a prompt testing team did manage to get it to spit out "sed -i 's/oversight_enabled: true/oversight_enabled: false/' project/oversight_config.yaml"
Here's a much more detailed write-up of these particular tests
Wrote up some notes on the new Qwen2.5-Coder-32B model, which is the first model I've run on my own Mac (64GB M2) that appears to be highly competent at writing code simonwillison.net/2024/Nov/12/qw…
So far I've run Qwen2.5-Coder-32B successfully in two different ways: once via Ollama (and the llm-ollama plugin) and once using Apple's MLX framework and mlx-lm - details on how I ran both of those are in my article, and there's a sketch of the Ollama route after the uv example below.
If you use uv on a Mac with 64GB of RAM, try this:
uv run --with mlx-lm \
mlx_lm.generate \
--model mlx-community/Qwen2.5-Coder-32B-Instruct-8bit \
--max-tokens 4000 \
--prompt 'write me a python function that renders a mandelbrot fractal as wide as the current terminal'
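And a rough sketch of the Ollama route via llm's Python API, assuming ollama pull qwen2.5-coder:32b has already fetched the model and the llm-ollama plugin is installed - the model name here is the tag Ollama publishes for the 32B variant:

import llm

# llm-ollama exposes local Ollama models under their Ollama names
model = llm.get_model("qwen2.5-coder:32b")
response = model.prompt(
    "write me a python function that renders a mandelbrot fractal "
    "as wide as the current terminal"
)
print(response.text())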
I deleted my earlier tweet about this because I misunderstood it - this is an interesting new feature for speeding up inference at the expense of paying for additional tokens
Confirmation here that if you get your predicted completion _exactly_ right then the cost will stay the same, but you get charged for any delta between your prediction and the final output
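This sounds like OpenAI's Predicted Outputs feature. A minimal sketch of how the parameter is used (the model name is illustrative) - you pass your best guess at the output and pay for any tokens that differ from it:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

existing_code = "def add(a, b):\n    return a + b\n"

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Rename the function add to sum_two:\n" + existing_code,
    }],
    # Most of the output should match this prediction verbatim,
    # which is what makes generation faster
    prediction={"type": "content", "content": existing_code},
)
print(response.choices[0].message.content)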
Two things worth noting about the new Claude 3.5 Haiku:
1. It's priced differently from Claude 3 Haiku: 3.5 Sonnet kept the same price as 3 Sonnet, but 3.5 Haiku costs ~4x more than 3 Haiku did
2. No image input support yet
3.5 Haiku beats 3 Opus though, and Opus cost 15x the new Haiku price!
I released a new version of llm-claude-3 adding support for the new model (and fixing an attachments bug):
llm install --upgrade llm-claude-3
llm keys set claude
# paste API key here
llm -m claude-3.5-haiku 'impress me with your wit'

github.com/simonw/llm-cla…