I also enjoyed this piece about how @philiplord drove the decision not to include subtitles for the Spanish language dialog to better represent bilingual culture remezcla.com/features/film/…
Here's a tweet from just before the movie came out with a preview of one of the early scenes - but I just noticed that the conversation attached to this tweet has a ton of extra insight from the animator on how he put the scene together
Wrote up some notes on the new Qwen2.5-Coder-32B model, which is the first model I've run on my own Mac (64GB M2) that appears to be highly competent at writing code simonwillison.net/2024/Nov/12/qw…
So far I've run Qwen2.5-Coder-32B successfully in two different ways: once via Ollama (and the llm-ollama plugin) and once using Apple's MLX framework and mlx-lm - details on how I ran both of those are in my article.
If you use uv on a Mac with 64GB of RAM, try this:
uv run --with mlx-lm \
mlx_lm.generate \
--model mlx-community/Qwen2.5-Coder-32B-Instruct-8bit \
--max-tokens 4000 \
--prompt 'write me a python function that renders a mandelbrot fractal as wide as the current terminal'
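The Ollama route looks roughly like this - a sketch assuming the quantized 32B build is published on the Ollama registry under the qwen2.5-coder:32b tag (the exact steps I used are in my article):
# pull the model (it's a large download), then install the plugin that exposes Ollama models to LLM
ollama pull qwen2.5-coder:32b
llm install llm-ollama
llm -m qwen2.5-coder:32b 'write me a python function that renders a mandelbrot fractal as wide as the current terminal'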
I deleted my earlier tweet about this because I misunderstood it - this is an interesting new feature for speeding up inference when you can predict much of the output in advance, at the expense of paying for additional tokens
Confirmation here that if you get your predicted completion _exactly_ right then the cost will stay the same, but you get charged for any delta between your prediction and the final output
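If this is OpenAI's Predicted Outputs feature, the prediction gets passed alongside the regular prompt - a rough curl sketch, with placeholder content throughout:
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Rename the function render to draw in this file: <file contents>"}],
    "prediction": {"type": "content", "content": "<the file contents you expect back, mostly unchanged>"}
  }'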
Two things worth noting about the new Claude 3.5 Haiku:
1. It's priced differently from Claude 3 Haiku - 3.5 Sonnet had the same price as 3 Sonnet, but 3.5 Haiku costs ~4x more than 3 Haiku did
2. No image input support yet
3.5 Haiku beats 3 Opus though, and Opus cost 15x the new Haiku price!
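For reference, the input prices behind those multiples (as of launch - check Anthropic's pricing page for current numbers): Claude 3 Haiku was $0.25/million input tokens, Claude 3.5 Haiku is $1/million (~4x), and Claude 3 Opus is $15/million (15x the new Haiku price).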
I released a new version of llm-claude-3 adding support for the new model (and fixing an attachments bug):
llm install --upgrade llm-claude-3
llm keys set claude
# paste API key here
llm -m claude-3.5-haiku 'impress me with your wit'
github.com/simonw/llm-cla…
I was having a conversation with Claude about unconventional things to do in the SF Bay Area, and I got a bit suspicious, so I prompted: "Are you sure all of those are real? I think you made some of those up."
(I've actually been to the Gregangelo Museum and can confirm it definitely does exist: niche-museums.com/14)
I haven't visited it yet, but the Museum of International Propaganda definitely exists as well maps.app.goo.gl/x1TV32h2D3rvQk…
I added multi-modal (image, audio, video) support to my LLM command-line tool and Python library, so now you can use it to run all sorts of content through LLMs such as GPT-4o, Claude and Google Gemini
llm 'transcript' \
-a 'https://static.simonwillison.net/static/2024/video-scraping-pelicans.mp3' \
-m gemini-1.5-flash-8b-latest
Cost to transcribe 7m of audio with Gemini 1.5 Flash 8B? 1/10th of a cent.
I still think most people are sleeping on these multi-modal vision/audio LLMs - extracting useful information from non-text media sources used to be almost impossible, now it can be done effectively for fractions of a cent
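Images go through the same -a flag - a quick sketch, assuming a local photo.jpg and an OpenAI key already configured with 'llm keys set openai':
llm 'describe this image and extract any text you can see' \
-a photo.jpg \
-m gpt-4o-mini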