You can try out the vision models on lmarena.ai by navigating to "Direct Chat" and then selecting them from the dropdown menu
OK, my first impressions of the tiny Llama 3.2 1B model - running locally via Ollama, accessed via my LLM tool - are that it's incredibly capable for a model of that size, and really fast
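For anyone who wants to poke at the same model programmatically: Ollama serves a local HTTP API on port 11434, so you can call the pulled model from a few lines of JavaScript. This is a hedged sketch, not the LLM tool itself - it assumes you've already run `ollama pull llama3.2:1b` and that Ollama is listening on the default port.

```javascript
// Sketch: query a locally running Ollama server's generate endpoint.
// Assumes `ollama pull llama3.2:1b` has been run and Ollama is on its
// default port (11434); otherwise the catch branch below fires.
async function askLlama(prompt) {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "llama3.2:1b", prompt, stream: false }),
  });
  const data = await res.json();
  return data.response; // the model's generated text
}

askLlama("Why is the sky blue? Answer in one sentence.")
  .then(console.log)
  .catch(() => console.log("Ollama is not running on localhost:11434"));
```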
Not surprising to see NVIDIA doing this - practically the industry standard right now - but interesting to see details of what they're collecting and why:
"Movies are actually a good source of data to get gaming-like 3D consistency and fictional content but much higher quality"
My intuition is the backlash against scraped video data will be even more intense than for static images in image models. Video is generally more expensive to create, and video creators (such as MKBHD) have a lot of influence. simonwillison.net/2024/Aug/5/nvi…
A few weeks ago there was a big response to a story about companies training just on captions scraped from YouTube - captions only, not the video. This NVIDIA story involves the full video content.
Anyone figured out how to run Gemini Nano in Google Chrome Canary?
I turned on the feature flag for it but it doesn't seem to have downloaded the model file - the "await window['ai'].createTextSession();" API returns an error "InvalidStateError: The session cannot be created"
I turned on the chrome://flags "Prompt API for Gemini Nano" experiment and left my laptop on overnight and by the morning it had downloaded the ~1.9GB model file and the new prompt API started working!
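Once the model has downloaded, the experimental Prompt API can be exercised from the DevTools console. This is a minimal sketch of that early, unstable API surface (it only works in a Chrome Canary build with the "Prompt API for Gemini Nano" flag enabled, and the API shape has been changing between releases); the feature-detection fallback is my own addition so the snippet degrades gracefully elsewhere.

```javascript
// Sketch of Chrome Canary's experimental Prompt API (subject to change).
// Requires the "Prompt API for Gemini Nano" flag and a downloaded model;
// in any other environment the fallback branch runs instead.
const ai = globalThis.ai ?? globalThis.window?.ai;
if (ai) {
  // Same call that previously threw InvalidStateError before the
  // ~1.9GB model file had finished downloading.
  const session = await ai.createTextSession();
  const reply = await session.prompt("Say hello in five words");
  console.log(reply);
} else {
  console.log("Prompt API not available in this environment");
}
```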
My personal policy is that the price I have to pay for getting distracted by a fun new project is that I have to write about it
Never take on a project without also writing about it: so much value is lost if you don't give the world a fighting chance of understanding what you made!
I mark all of my posts that fit this ideal with the "projects" tag - it's got up to 372 posts now: simonwillison.net/tags/projects/