GPT-4o is slower than Flash, more expensive, chatty, and very stubborn (it doesn't like to stick to my prompts).
Next week, I'll post a step-by-step video on how to build this.
The first request takes longer (warming up), but things work faster from that point on.
A few opportunities to improve this:
1. Stream answers from the model (instead of waiting for the full answer).
2. Add the ability to interrupt the assistant.
3. Run Whisper on a GPU.
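Here is a rough sketch of how items 1 and 2 could fit together. It is not the code from my assistant: `sentences_from_stream` and the `stop` event are hypothetical names, and the token stream is faked with a plain list instead of a real model call. The idea is to hand each sentence to the TTS step as soon as it is complete, and to drop the rest of the answer if the user interrupts.

```python
import threading
from typing import Iterable, Iterator, Optional

# Naive sentence boundaries (assumption: will also split on decimals like "3.5").
SENTENCE_END = (".", "!", "?")


def sentences_from_stream(
    chunks: Iterable[str],
    stop: Optional[threading.Event] = None,
) -> Iterator[str]:
    """Yield complete sentences as soon as they arrive from a stream of text chunks.

    `chunks` stands in for whatever the model's streaming API returns.
    `stop` is a hypothetical interrupt flag set by the audio/input thread.
    """
    buffer = ""
    for chunk in chunks:
        if stop is not None and stop.is_set():
            return  # user interrupted: discard whatever is left
        buffer += chunk
        while True:
            # Position of the earliest sentence-ending character, if any.
            hits = [i for i in (buffer.find(p) for p in SENTENCE_END) if i != -1]
            if not hits:
                break
            cut = min(hits)
            sentence, buffer = buffer[: cut + 1].strip(), buffer[cut + 1 :]
            if sentence:
                yield sentence
    # Flush any trailing text that never got a closing punctuation mark.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    fake_stream = ["Hel", "lo there. How", " are you? I am", " fine"]
    for sentence in sentences_from_stream(fake_stream):
        print(sentence)  # in a real assistant, send this to the TTS step
```

With a real model, you'd replace `fake_stream` with the text deltas from the streaming response and set `stop` from whatever code detects the user talking.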
Unfortunately, no local model supports text+images (as far as I know), so I'm stuck running online models.
The TTS API (synthesizing text to audio) can also be replaced by a local version. I tried, but the available voices suck (too robotic), so I kept OpenAI's.
I feel so sorry for anyone who bought the Rabbit r1.
It’s not just that the product is non-functional (as we learned from all the reviews). The real problem is that the whole thing seems to be a lie.
None of what they pitched exists or functions the way they said.
They sold the world on a Large Action Model (LAM), an intelligent AI model that would understand applications and execute the actions requested by the user.
In reality, they are using Playwright, a web automation tool.
No AI. Just dumb, click-around, hard-coded scripts.
Their foundational AI model is just ChatGPT + scripts.
Rabbit’s founder lied in their marketing videos, during interviews, and when he presented the product, and he lied on Discord when answering questions from early supporters.
1. Mojo 🔥 went open-source
2. Claude 3 beats GPT-4
3. $100B supercomputer from MSFT and OpenAI
4. Andrew Ng and Harrison Chase discussed AI Agents
5. Karpathy talked about the future of AI
...
And more.
Here is everything that will keep you up at night:
Mojo 🔥, the programming language that turns Python into a beast, went open-source.
This is a huge step and great news for the Python and AI communities!
With Mojo 🔥 you can write Python code or scale all the way down to metal code. It's fast!