Latest Twitter Threads by @francedot on Thread Reader App

Apr 19, 2025 • 10 tweets • 2 min read

We’ve been building quietly. Today, we launch loudly. Meet our startup: Cua AI

Our mission? To commoditize Computer-Use Agents - AI agents that can reason, plan, and act over computer interfaces. Not just research demos, but a practical OSS framework built for AI engineers.

Feb 25, 2024 • 7 tweets • 2 min read

Imagine if language models could tap into the app ecosystem of your iPhone. Would plugins and assistants become obsolete if we simply allowed a model to orchestrate our existing user interfaces, which have been made robust over many years?

This demo shows how well GPT-4V works as a generalist mobile AI agent: without any fine-tuning or grounding, merely by pairing it with a text model that has JSON mode enabled. I suggest watching it for a (possible) wow factor. The results are on iOS 17 using NavAIGuide, a mobile and web navigational agent framework for LLMs: github.com/francedot/NavA…

Over the last few months, I've been experimenting with vision models not just in one area, but across web, desktop, and mobile platforms. It's become clear to me that these technologies have a lot of untapped potential. The closer we bring them to our everyday devices, the better use we can make of what they offer. This shift could make our connection with AI feel more intuitive and seamless, moving away from ChatGPT-style interaction with AI assistants.
