Simon Willison Profile picture
Creator @datasetteproj, co-creator Django. PSF board. @nichemuseums. Hangs out with @natbat + @cleopaws. He/Him. Mastodon: https://t.co/t0MrmnJW0K
Flavio Curella Profile picture Maleph Profile picture Jin Ho Hur Profile picture Rodolpho Gurgel Profile picture Jerome Ku Profile picture 13 subscribed
Sep 25 4 tweets 2 min read
Llama 3.2 is out, and it's a much more substantial release than the 3.1 to 3.2 version bump might indicate

Four new models, including Meta's first two vision models (11B and 90B) and two new text-only small models (1B and 3B)

My notes so far: simonwillison.net/2024/Sep/25/ll… You can try out the vision models on by navigating to "Direct Chat" and then selecting those models from the dropdown menu lmarena.ai
llama-3.2-vision-90b-instruct  A photo of two capybaras  Prompt: Describe this photo in detail  The image depicts two capybaras standing in a body of water, with one positioned on the edge of a metal grate. The capybara on the right is partially submerged in the water, while the other is standing on the grate.  Key Features:      Capybaras: Two capybaras are visible in the image.     Water: The capybaras are standing in a body of water, which appears to be murky or dirty.     Metal Grate: A metal grate is visible at the bottom right of the image, with one capybara standing on it.     Backgr...
Sep 4 5 tweets 2 min read
OK, the Apache 2 licensed Qwen2-VL-7B Instruct from Alibaba looks VERY impressive

Anyone figured out a recipe to get it working on a Mac yet? qwenlm.github.io/blog/qwen2-vl/
I tried the Hugging Face demo and got a very impressive result for handwriting recognition
Aug 5 4 tweets 2 min read
Not surprising to see NVIDIA doing this - practically the industry standard right now - but interesting to see details of what they're collecting and why:

"Movies are actually a good source of data to get gaming-like 3D consistency and fictional content but much higher quality" More on my blog:

My intuition is the backlash against scraped video data will be even more intense than for static images in image models. Video is generally more expensive to create, and video creators (such as MKBHD) have a lot of influence.simonwillison.net/2024/Aug/5/nvi…
Jul 8 5 tweets 2 min read
I needed a box shadow the other day, so I got Claude to build me a custom one-off interface for fiddling with box shadow properties

Pretty wild how we can build tiny custom tools like this in a similar time that it would take to search for and select a tool that already exists! Claude prompt: CSS for a slight box shadow, build me a tool that helps me twiddle settings and preview them and copy and paste out the CSS  Response:  Certainly! I'll create a React component that allows you to adjust box shadow settings and preview the result in real-time. This tool will also provide the CSS code for you to copy and paste.  I've created a React component that allows you to adjust various box shadow settings and see the results in real-time. Here's a breakdown of the features:  Preview: A box at the top shows you how the shadow looks with your current settings.  Sliders for... Related: Maggie Appleton's excellent recent talk about home-cooked software
Jul 2 5 tweets 2 min read
Anyone figured out how to run Gemini Nano in Google Chrome Canary?

I turned on the feature flag for it but it doesn't seem to have downloaded the model file - the "await window['ai'].createTextSession();" API returns an error "InvalidStateError: The session cannot be created" Screenshot of chrome://flags showing that I have enabled the "Prompt API for Gemini Nano" experiment Here's the documentation I was missing
Apr 25 9 tweets 3 min read
"Do stuff and then blog about it" remains one of the most underrated pieces of career advice My personal policy is that the cost I have to pay for being distracted by a fun new project is that I have to write about it

Never take on a project without also writing about it: so much value is lost if you don't give the world a fighting chance of understanding what you made!
Apr 14 4 tweets 1 min read
Have any of the large scale LLM training organizations - Anthropic, OpenAI, Gemini, Mistral, the Llama team - published anything notable about this idea of "model collapse" yet - the worry that LLM quality will drop as their training data becomes pollinated by model output? I meant "polluted" not "pollinated" there
Mar 31 6 tweets 2 min read
What embedding models are there with separation between different modes of content?

I know E5-Large-V2 has that ("passage" vs "query"), and @nomic_ai have "search_query", "search_document", "clustering", "classification"

Any other good examples?docs.nomic.ai/reference/endp… Being able to embed your content as "passage" but questions people ask about it as "query" is useful for implementing RAG - a user's question might not naturally embed to a similar location as content that answers that question, this trick helps fix that til.simonwillison.net/llms/embed-par…
Mar 26 7 tweets 3 min read
TIL about binary vector search... apparently there's a trick where you can take an embedding vector like [0.0051, 0.017, -0.0186, -0.0185...] and turn that into a binary vector just reflecting if each value is > 0 - so [1, 1, -1, -1, ...] and still get useful cosine similarities! ... which is absurdly useful, because it means you can store those vectors in 1/32th of the original space, and run MUCH faster comparisons against them

Then you can brute-force to get the 100 top matches, then run the more expensive full floating point ops against that subset
Mar 25 6 tweets 2 min read
I'm really confused about the new @MistralAI model

There's no official announcement about it anywhere that I've found

It's apparently a 0.2 of their 7B model, but they've had other things called 0.2 previously?

Then there's this commit - is this it? huggingface.co/mistralai/Mist…
A diff to the README for  # Model Card for Mistral-7B-Instruct-v0.2  Text has been added saying:   The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.2.  Mistral-7B-v0.2 has the following changes compared to Mistral-7B-v0.1 - 32k context window (vs 8k context in v0.1) - Rope-theta = 1e6 - No Sliding-Window Attention ollama have it as a "text completion" model, which sounds different from the instruct model I linked to above:
Mar 23 7 tweets 3 min read
Just got ChatGPT Code Interpreter to write me a SQLite extension in C from scratch, then compile it with GCC, then load it into Python and test it out, then iterate on it to fix the bugs

All on my phone while pottering around the house Here's the full transcript - it got stuck in a bit of an error loop towards the end of this first one:

So I pasted what it had done so far into a fresh session and got it working there: chat.openai.com/share/a52ee3e4…
chat.openai.com/share/29bd1d15…
Aug 3, 2023 5 tweets 2 min read
The video for my North Bay Python talk is out, and I've put together an accompanying edited transcript with annotated slides and links



If you haven't been completely immersed in this world for the last year, my hope is this can help catch you up!simonwillison.net/2023/Aug/3/wei… Here's the table of contents      What they are     How they work     A brief timeline     What are the really good ones     Tips for using them     Using them for code     What can we build with them?         Giving them access to tools         Retrieval augmented generation         Embeddings and vector search     ChatGPT Plugins     ChatGPT Code Interpreter     How they’re trained         LLaMA training data         Reinforcement learning from human feedback     Openly licensed models     My LLM utility     Prompt injection
Jul 12, 2023 13 tweets 5 min read
Huge new release of my LLM CLI tool (and Python library) for accessing Large Language Models: it now supports additional models via plugins, so you can "llm install llm-gpt4all" to get models that run on your own machine!
simonwillison.net/2023/Jul/12/ll… I want to make local models (and remote API-driven models) as easy to try out as possible, so I put together this detailed tutorial about how to build an LLM plugin that adds support for a new model: llm.datasette.io/en/stable/plug…
Jun 13, 2023 4 tweets 2 min read
A whole bunch of big API announcements from @OpenAI today - including an API-level implementation of the ReAct pattern where you can provide custom function calls that can be triggered by the language models openai.com/blog/function-… The new gpt-3.5-turbo-16k model is a big deal: 16,000 token context (4x the previous limit for 3.5) charged at $0.003 per 1K input tokens and $0.004 per 1K output tokens, which is still a whole lot cheaper than GPT-4 (more than 10x that price for 8k, more than 20x that for 32k)
Jun 8, 2023 5 tweets 2 min read
Understanding GPT tokenizers: I wrote about how the tokenizers used by the various GPT models actually work, including an interactive tool for experimenting with their output simonwillison.net/2023/Jun/8/gpt… The tokenizers have a strong bias towards English: "The dog eats the apples" is 5 tokens, "El perro come las manzanas" is 8 tokens, and many Japanese characters end up using two integer tokens for each character of text.
Apr 13, 2023 4 tweets 2 min read
"What happens when time to learn a new technology is no longer a constraint for building something new?"

I've been thinking about that a lot. I'm no longer nearly as resistant to building something in languages I have little existing knowledge of, which is absolutely fascinating One of my most effective instincts for productivity has always been picking projects that are the best possible fit for my existing knowledge and experience - that way I can make the fastest possible progress on the challenges that are new and unique to me for that project
Apr 12, 2023 5 tweets 3 min read
Published some detailed notes on a thing I've been doing with the ChatGPT "Code Interpreter" alpha - a new mode where ChatGPT can generate and run Python scripts in a sandboxed interpreter

I've been using it to run micro-benchmarks and plot the results:

simonwillison.net/2023/Apr/12/co… There was a bug in the first version of the code it wrote, and it returned an error... so ChatGPT deciphered the error, updated the code and ran it a second time without me even having to intervene
Apr 11, 2023 6 tweets 2 min read
Your regular reminder that prompt injection still does not have a known robust fix, so if you're building any software on top of ChatGPT/GPT-3/4/LLMs you need to assume you are vulnerable to the attack, and that things like your "system prompt" are effectively public information More notes here:
Apr 8, 2023 4 tweets 1 min read
Posted a few notes and an extract from my appearance on today's episode of @changelog, where we talked about all things Large Language Model: ChatGPT, GPT-4, Bing, Bard, Claude, LLaMA and more
simonwillison.net/2023/Apr/8/llm… We talked about how this would hopefully be possible within a year in the @changelog episode - I really need to stop underestimating how fast things will happen!
Mar 25, 2023 4 tweets 1 min read
If you're just starting to learn software engineering right now but you're considering dropping it because you think the field might be made obsolete by AI, I have an alternative approach to suggest for you:

Start learning now, and use AI tools to learn FASTER I'm confident our field will be around for a long time to come, but I'm also confident it's going to change a whole lot over the next few years

My suspicion is that now might be a fantastic time to start learning, because a lot of the more frustrating aspects will be smoother
Mar 24, 2023 6 tweets 2 min read
I built a ChatGPT plugin to answer questions about data hosted in Datasette
simonwillison.net/2023/Mar/24/da… ... and here's the bad news. It can hallucinate, inventing data that's entirely independent of the data that came back from the SQL query!

I have an open issue about that here, including some examples: github.com/simonw/dataset…