This was way harder than I expected, but I got Hermes Agent running natively on an iPhone (completely offline)
Link below. You can use my Swift package to embed Hermes Agent in your app with a couple lines of code!
This is the real Hermes Agent, totally unmodified. It *should* be App Store safe, but who knows.
github.com/achimala/Swift…
The ugly technical details:
The first challenge is that Hermes Agent is written in Python. iOS, famously, is not. To get around this, we simply embed the entire Python interpreter along with the agent codebase, which is fortunately pretty easy these days. We precompile the Hermes Agent codebase and its core dependencies to bytecode, bundle that with the host app, and interpret it there.
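For a rough idea of what the precompile step can look like, here is a minimal sketch using Python's standard `compileall` module. The thread doesn't say which tool the package actually uses, so treat this as an assumption. The `precompile_tree` helper and the `demo_pkg` directory are made up for illustration.

```python
import compileall
import pathlib
import tempfile


def precompile_tree(src_dir: str) -> bool:
    """Compile every .py file under src_dir to .pyc bytecode.

    Hypothetical helper: the real build pipeline may work differently.
    """
    # legacy=True writes module.pyc next to module.py instead of into
    # __pycache__/, which makes it easier to ship only the bytecode.
    result = compileall.compile_dir(src_dir, quiet=1, legacy=True, force=True)
    return bool(result)


# Demo on a throwaway package so the sketch runs anywhere.
demo_root = pathlib.Path(tempfile.mkdtemp())
pkg = demo_root / "demo_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("GREETING = 'hello'\n")

ok = precompile_tree(str(demo_root))
compiled = sorted(p.name for p in pkg.glob("*.pyc"))
print(ok, compiled)
```

Bytecode is tied to the Python version that produced it, so the compile step would need to run with the same interpreter version that gets embedded in the app.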
That's enough to get Hermes to wake up and say hello, but so far all we've got is an overengineered LLM wrapper. To use its features and memory, we need to give it a shell and a persistent filesystem. iOS sure as hell won't let you run shell commands, but iSH (github.com/ish-app/ish) managed to get a Linux shell on iOS by emulating x86 + reimplementing the syscalls in userspace, effectively doing everything inside the app to stay within Apple policies.
This would work, but is pretty slow and unwieldy since it's emulating x86 on ARM. Fortunately, Codex found an arm64 fork (github.com/OpenMinis/ish-…). We embed this too and redirect Hermes' shell calls to it, then remap filesystem commands to Apple's sandbox-friendly app container APIs. Now Hermes has persistent memory!
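The filesystem remapping boils down to translating a path the agent thinks it is using into a path inside the app's sandbox container. Below is a small sketch of that idea, not the package's actual code. `remap_to_container` and the example container root are hypothetical names, and the real thing lives on the Swift side.

```python
import posixpath


def remap_to_container(virtual_path: str, container_root: str) -> str:
    """Map a path seen by the agent's shell onto the app container.

    Hypothetical helper for illustration only.
    """
    # Resolve the path against "/" first, so ".." segments can never
    # climb above the virtual root and escape the container.
    normalized = posixpath.normpath(posixpath.join("/", virtual_path))
    relative = normalized.lstrip("/")
    if not relative:
        return container_root
    return posixpath.join(container_root, relative)


# Assumed container location; on a device this would come from iOS APIs.
root = "/var/mobile/Containers/Data/Application/APP/Documents"
print(remap_to_container("/home/hermes/memory.md", root))
print(remap_to_container("../../etc/passwd", root))
```

Normalizing before joining is the important part: whatever the agent asks for, the result stays under the container root.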
At this point we can use any external LLM provider with Hermes and get basic functionality. But why stop there? With MLX we can fit ~2B parameter models on device. Qwen-3.5 at 4-bit quantization is pretty dumb, but just smart enough to be barely passable as a POC Hermes model that runs entirely offline. I also threw in the on-device Apple Intelligence model, which is terrible (it technically works, but the 4K token context window makes it unusable, and it's generally quite dumb).
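A quick back-of-the-envelope check on why ~2B parameters at 4-bit is about the ceiling for a phone. This only counts the raw weights. Quantization scales, the KV cache and activations add more on top, so the real footprint is higher. The `weight_bytes` helper is just for illustration.

```python
def weight_bytes(num_params: int, bits_per_weight: float) -> float:
    """Raw storage for the weights alone, in bytes."""
    return num_params * bits_per_weight / 8


params = 2_000_000_000  # the "~2B" figure from the thread, rounded

four_bit_gb = weight_bytes(params, 4) / 1e9
fp16_gb = weight_bytes(params, 16) / 1e9

print(f"4-bit: {four_bit_gb:.1f} GB, fp16: {fp16_gb:.1f} GB")
```

So 4-bit quantization cuts the weights from roughly 4 GB to roughly 1 GB, which is the difference between not fitting in an app's memory budget and fitting.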
Why did I do this? Mostly to see if I could, and because I was making apps and wanted to add agents to them. But I think it's actually a good pattern to have one agent experience that can follow you around across apps. Maybe I can even get it to work with ChatGPT subscriptions?
I'm trying to ship at least one crazy thing a week, so follow if you want to see the next one :)
shoutout @Teknium @NousResearch
