Bringing the power of machine learning to the web. Currently working on Transformers.js (@huggingface 🤗)
Aug 22 • 4 tweets • 2 min read
Okay this is insane... WebGPU-accelerated semantic video tracking, powered by DINOv3 and Transformers.js! 🤯
This will revolutionize AI-powered video editors... which can now run 100% locally in your browser, no server inference required (costs $0)! 😍
Who's building this?
How does it work? 🤔
1️⃣ Generate and cache image features for each frame
2️⃣ Create a list of embeddings for selected patch(es)
3️⃣ Compute cosine similarity between each patch and the selected patch(es)
4️⃣ Highlight those whose score is above some threshold
... et voilà! 🥳
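Steps 2️⃣ to 4️⃣ can be sketched in a few lines of TypeScript. This is a minimal illustration, not the demo's actual code: it assumes the per-frame patch features from step 1️⃣ are already available as plain number arrays, and the names `cosineSimilarity` and `matchPatches`, the 0.6 default threshold, and the choice to keep the best score across multiple selected patches are all assumptions made for this example.

```typescript
// Hypothetical representation: one embedding per image patch.
type Embedding = number[];

// Cosine similarity between two embeddings of equal length.
function cosineSimilarity(a: Embedding, b: Embedding): number {
  if (a.length !== b.length) throw new Error("Embedding length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat an all-zero vector as matching nothing.
  return denom === 0 ? 0 : dot / denom;
}

// Returns the indices of the frame patches to highlight.
// Assumption: with several selected patches, a frame patch is scored
// by its best similarity to any of them.
function matchPatches(
  framePatches: Embedding[],
  selected: Embedding[],
  threshold = 0.6, // illustrative value, not taken from the demo
): number[] {
  const matches: number[] = [];
  framePatches.forEach((patch, index) => {
    let best = -1;
    for (const s of selected) {
      best = Math.max(best, cosineSimilarity(patch, s));
    }
    if (best > threshold) matches.push(index);
  });
  return matches;
}
```

Calling `matchPatches` once per frame against the cached features gives the set of patches to highlight in that frame.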
Nov 21, 2023 • 5 tweets • 2 min read
Transformers.js v2.9.0 is now out! 😍 New features:
🎯 Zero-shot Object Detection w/ OwlViT
🕵️‍♂️ Depth Estimation w/ DPT and GLPN
📝 Optical Document Understanding w/ Nougat
... and you can get started in just a few lines of code! 🤯👇
1. Zero-shot Object Detection is the task of identifying objects of classes that are unseen during training.
This means you can specify a list of words/phrases at runtime, and the model will generate bounding boxes for any occurrences it finds!
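To make the "labels at runtime" idea concrete, here is a small TypeScript sketch of handling the results. It assumes each detection arrives as an object with a `score`, a `label`, and a `box` with `xmin`/`ymin`/`xmax`/`ymax` fields; the `groupByLabel` helper and the 0.1 default cutoff are hypothetical and written only for this example. In Transformers.js the detections themselves would come from a `zero-shot-object-detection` pipeline, which is not called here so the snippet stays self-contained.

```typescript
// Assumed shape of one detection returned by a zero-shot detector.
interface Box { xmin: number; ymin: number; xmax: number; ymax: number }
interface Detection { score: number; label: string; box: Box }

// Hypothetical helper: bucket detections under the labels requested at
// runtime, dropping low-confidence hits and any label that was not asked for.
function groupByLabel(
  labels: string[],
  detections: Detection[],
  minScore = 0.1, // illustrative cutoff
): Map<string, Detection[]> {
  const grouped = new Map<string, Detection[]>();
  // Every requested label gets an entry, even if nothing was found.
  for (const label of labels) grouped.set(label, []);
  for (const d of detections) {
    const bucket = grouped.get(d.label);
    if (bucket && d.score >= minScore) bucket.push(d);
  }
  return grouped;
}
```

Because the label list is ordinary data, it can come straight from user input, with no retraining between queries.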