Anshu · May 16 · 5 tweets
This was way harder than I expected, but I got Hermes Agent running natively on an iPhone (completely offline)

Link below. You can use my Swift package to embed Hermes Agent in your app with a couple lines of code!


This is the real Hermes Agent, totally unmodified. It *should* be App Store safe, but who knows.

github.com/achimala/Swift…

The ugly technical details:

The first challenge is that Hermes Agent is written in Python. iOS, famously, is not. To circumvent this, we simply embed the entire Python interpreter along with the agent codebase, which is fortunately pretty easy these days. We can precompile the Hermes Agent codebase and its core dependencies to bytecode and bundle and interpret it inside the host app.
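For a rough idea of what the precompile step can look like, here is a minimal sketch using Python's standard `compileall` module. It is not the package's actual build script, and the `agent_pkg` directory name is made up for the demo.

```python
import compileall
import pathlib
import tempfile


def precompile_tree(source_dir: str) -> bool:
    """Compile every .py file under source_dir to .pyc bytecode.

    Returns True only if every file compiled without errors.
    """
    # legacy=True writes foo.pyc next to foo.py instead of into __pycache__,
    # so the .py sources could be left out of the app bundle afterwards.
    ok = compileall.compile_dir(source_dir, quiet=1, legacy=True, force=True)
    return bool(ok)


# Demo on a throwaway directory; "agent_pkg" is a hypothetical package name.
with tempfile.TemporaryDirectory() as tmp:
    pkg = pathlib.Path(tmp) / "agent_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("GREETING = 'hello'\n")
    print(precompile_tree(tmp))
    print(sorted(p.name for p in pkg.iterdir()))
```

Bytecode is specific to the interpreter version, so the compile step would have to use the same Python version as the interpreter embedded in the app.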

That's enough to get Hermes to wake up and say hello, but so far all we've got is an overengineered LLM wrapper. To use its features and memory, we need to give it a shell and a persistent filesystem. iOS sure as hell won't let you run shell commands, but iSH (github.com/ish-app/ish) managed to get a Linux shell on iOS by emulating x86 + reimplementing the syscalls in userspace, effectively doing everything inside the app to stay within Apple policies.

This would work, but is pretty slow and unwieldy since it's emulating x86 on ARM. Fortunately, Codex found an arm64 fork (github.com/OpenMinis/ish-…). We embed this too and redirect Hermes' shell calls to it, then remap filesystem commands to Apple's sandbox-friendly app container APIs. Now Hermes has persistent memory!
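The redirect-and-remap idea can be sketched in a few lines of Python. Everything here is hypothetical: `remap_to_container`, `ShellRouter`, and the `backend` callable are illustrative names, not the package's or iSH's real API, and the container path is a placeholder.

```python
import posixpath
from typing import Callable


def remap_to_container(path: str, container_root: str) -> str:
    """Map a path the agent asks for onto a path inside container_root.

    '..' segments are resolved first, so the result stays inside the container.
    """
    # Anchor at "/" and normalise: "/a/../../etc" collapses to "/etc".
    normalized = posixpath.normpath(posixpath.join("/", path))
    relative = normalized.lstrip("/")
    return posixpath.join(container_root, relative) if relative else container_root


class ShellRouter:
    """Forwards the agent's shell commands to an injected backend.

    `backend` stands in for whatever entry point the embedded Linux shell
    exposes; its real name and signature are not given in the thread.
    """

    def __init__(self, backend: Callable[[str], str]) -> None:
        self._backend = backend

    def run(self, command: str) -> str:
        return self._backend(command)


# Usage with a fake backend that just echoes the command.
router = ShellRouter(lambda cmd: "ran: " + cmd)
print(router.run("ls"))
print(remap_to_container("/home/agent/memory.md", "/AppContainer/Documents"))
```

Injecting the backend as a plain callable keeps the agent side unaware of whether commands end up in an emulator, a real shell, or a test stub.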

At this point we can use any external LLM provider with Hermes and get basic functionality. But why stop there? With MLX we can fit ~2B parameter models on device. Qwen-3.5 at 4-bit quantization is pretty dumb, but just smart enough to be barely passable as a POC Hermes model that runs entirely offline. I also threw in the on-device Apple Intelligence model, which is terrible (it technically works, but the 4K token context window makes it unusable, and it's generally quite dumb).

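Some back-of-the-envelope arithmetic shows why ~2B parameters at 4 bits is roughly what fits on a phone. This counts the weights only; quantization scales, the KV cache, and activations add more on top, so treat the numbers as a lower bound.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


print(weight_memory_gb(2e9, 4))   # 1.0 (2B params at 4-bit)
print(weight_memory_gb(2e9, 16))  # 4.0 (same model at 16-bit)
```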
Why did I do this? Mostly to see if I could, and because I was making apps and wanted to add agents to them. But I think it's actually a good pattern to have one agent experience that can follow you around across apps. Maybe I can even get it to work with ChatGPT subscriptions?
I'm trying to ship at least one crazy thing a week, so follow if you want to see the next one :)

shoutout @Teknium @NousResearch
