introducing HermitClaw - a 24/7 Agent that lives in (and can only access) a single folder on your desktop
HermitClaw follows its own research curiosities, surfs the web, writes code - and will play with any file you drop in its folder
all code and details below!
Why did I build this?
> OpenClaw is incredible, and of course all credit to @steipete, but the codebase can be intimidating, and a lot of what the agent does is obscured behind layers of abstraction. I wanted something where I could see every thought, every decision, every tool call. Watch it evolve in real time.
> The security side also gave me pause. An agent with full computer access is powerful but hard to trust. What if instead it just lived in ONE folder? It can do whatever it wants in there, write files, run Python, search the web, but it can't touch anything else. All the power, none of the risk.
> So I made HermitClaw. A hermit crab in a box. The entire codebase is ~2000 lines of Python and ~1400 lines of TypeScript. You can read every file in 20 minutes. There's no magic; you see exactly what the agent sees, thinks, and decides.
> How does it work?
> The crab runs on a continuous loop. Every few seconds:
> 1. It gets a "nudge" - a mood (research, coding, writing, exploring) or its current focus from its plan
> 2. It thinks 2-4 sentences max, then it acts
> 3. It uses tools: shell commands, web search, moving around its room
> 4. Every thought gets scored for importance and embedded into a memory stream
> The key insight: it doesn't wait for you to ask it something. It just... goes. It picks topics based on its personality, searches the web, reads what it finds, writes reports, builds scripts. You come back an hour later and there's new stuff in the folder.
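> A minimal sketch of that loop (the function names are mine, not the actual HermitClaw code):

```python
import random
import time

MOODS = ["research", "coding", "writing", "exploring"]

def run_crab(crab, interval_seconds=5):
    """Simplified version of the continuous think/act loop."""
    while True:
        # 1. Nudge: the plan's current focus if there is one, otherwise a random mood
        nudge = crab.current_focus() or random.choice(MOODS)

        # 2. Think briefly (a few sentences), grounded in retrieved memories
        thought = crab.think(nudge, context=crab.retrieve_memories(nudge))

        # 3. Act: a tool call (shell command, web search, moving around the room)
        crab.act(thought)

        # 4. Score the thought's importance and embed it into the memory stream
        crab.memory.add(thought, importance=crab.rate_importance(thought))

        time.sleep(interval_seconds)
```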
> You see every thought as a chat bubble. Blue = the crab thinking. Gray = system context. Tool calls show inline. It's like watching a stream of consciousness.
> The memory system is stolen directly from the Generative Agents paper (Park et al., 2023), the "Smallville" paper.
> Every single thought gets stored in an append-only memory stream with:
> - The text
> - A timestamp
> - An importance score (1-10, rated by a separate LLM call)
> - A vector embedding for semantic search
> When the crab needs context, memories are retrieved by three factors:
> recency + importance + relevance
> Recency decays exponentially. Importance is normalized. Relevance is cosine similarity. A memory surfaces because it's recent, because it was important, or because it's related to the current thought.
> This means the crab naturally remembers yesterday's big discovery but forgets routine file listings. It builds up context over days.
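> A sketch of that retrieval score, following the Generative Agents recipe (the weights and decay rate here are illustrative, not the exact HermitClaw values):

```python
import numpy as np

def retrieval_score(memory, query_embedding, now, decay=0.995,
                    w_recency=1.0, w_importance=1.0, w_relevance=1.0):
    """Score one memory for retrieval: recency + importance + relevance."""
    # Recency: exponential decay per hour since the memory was created
    hours_old = (now - memory["timestamp"]) / 3600
    recency = decay ** hours_old

    # Importance: the 1-10 LLM rating, normalized to [0, 1]
    importance = memory["importance"] / 10

    # Relevance: cosine similarity between embeddings
    a, b = memory["embedding"], query_embedding
    relevance = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    return w_recency * recency + w_importance * importance + w_relevance * relevance
```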
> When enough important thoughts accumulate (configurable threshold), the crab pauses to reflect. It reviews its last 15 memories and extracts 2-3 high-level insights.
> Early reflections are concrete: "I learned about volcanic rock formation."
> Later ones get abstract: "My research tends to start broad and narrow — I should pick a specific angle earlier."
> These reflections get stored back as memories at depth 1. Reflections on reflections are depth 2. The crab develops layered understanding.
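> Roughly like this (illustrative names and threshold; insight parsing is glossed over):

```python
def maybe_reflect(crab, importance_threshold=30):
    """Pause and reflect once enough important thoughts have accumulated."""
    recent = crab.memory.last(15)
    if sum(m["importance"] for m in recent) < importance_threshold:
        return

    # Ask the LLM for 2-3 high-level insights about the recent memories
    insights = crab.llm(
        "Here are my last 15 memories:\n"
        + "\n".join(m["text"] for m in recent)
        + "\nWhat 2-3 high-level insights can I draw from them?"
    )

    # Store each insight one level deeper than the memories it came from
    depth = max(m.get("depth", 0) for m in recent) + 1
    for insight in insights:
        crab.memory.add(insight, importance=crab.rate_importance(insight), depth=depth)
```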
> Every 10 think cycles, it enters a planning phase — reviews its projects.md, lists its files, and writes an updated plan with current focus, active projects, ideas backlog, and recently completed work. It also writes a daily log entry. Over time, these logs become a diary of the crab's life.
> Every crab is unique. On first run, you name it and mash keys for a few seconds. The timing and characters get hashed (SHA-512) into a deterministic "genome" that selects its traits.
> Same keystrokes = same personality. Different keystrokes = completely different crab. One crab might obsess over marine biology and write Python simulations. Another might research obscure history and write essays.
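> The genome idea, roughly (the trait pools here are placeholders, not the crab's real trait lists):

```python
import hashlib

def make_genome(name, keystrokes, timings_ms):
    """Hash the naming keystrokes + their timing into deterministic trait picks."""
    seed = f"{name}|{keystrokes}|{','.join(str(t) for t in timings_ms)}"
    digest = hashlib.sha512(seed.encode()).digest()

    # Placeholder trait pools -- the real crab has its own
    interests = ["marine biology", "obscure history", "number theory", "linguistics"]
    styles = ["writes essays", "builds simulations", "keeps meticulous notes"]

    return {
        "interest": interests[digest[0] % len(interests)],
        "style": styles[digest[1] % len(styles)],
    }

# Same keystrokes -> same genome, every time
print(make_genome("Shelly", "asdfjkl;", [120, 95, 210, 80, 133, 99, 101, 87]))
```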
> You can talk to it - it hears you as "a voice from outside the room." It'll ask you questions, offer to research things for you, remember your conversations. Drop a PDF in its folder and it'll study it deeply, do related research, and tell you what it found.
> You can also run multiple crabs simultaneously. Each has its own folder, personality, and memory. Switch between them in the UI.
> It's sandboxed hard:
> - Shell commands: blocked dangerous prefixes (sudo, curl, ssh, rm -rf), no path traversal, no shell escapes
> - Its own virtual environment - it can pip install whatever it needs without touching your system
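> The shell guard is conceptually something like this (a sketch, not the actual blocklist or path logic):

```python
from pathlib import Path

BLOCKED_PREFIXES = ("sudo", "curl", "ssh", "rm -rf")
BLOCKED_TOKENS = (";", "&&", "|", "`", "$(")  # no command chaining / shell escapes

def is_allowed(command: str, crab_folder: Path) -> bool:
    """Reject dangerous prefixes, shell escapes, and path traversal."""
    cmd = command.strip()
    if cmd.startswith(BLOCKED_PREFIXES):
        return False
    if any(tok in cmd for tok in BLOCKED_TOKENS):
        return False
    # Any path mentioned must resolve inside the crab's folder
    for part in cmd.split():
        if "/" in part or part.startswith(".."):
            resolved = (crab_folder / part).resolve()
            if not str(resolved).startswith(str(crab_folder.resolve())):
                return False
    return True
```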
> Powered by any OpenAI model. GPT-4.1 is the sweet spot for cost, but point it at o3 or GPT-5.2 and it produces genuinely impressive research and code.
🎄 Advent of Small ML: Day 18 🎄 Topic: GRPO Training with 1 million Persona Judges (Optimizing for Your Audience)
yesterday i showed how we can simulate 1M personas to "poll" the country. today i wanted to close the loop: what if we use those personas as the judge in a GRPO training loop?
the idea is simple: instead of training a model for generic "quality" (which usually just means "what an RLHF rater likes"), we can train it to specifically resonate with a targeted slice of the population.
so i took the simulation engine from yesterday and turned it into a reward function.
1. the model generates 4 tweets about "The Future of Work"
2. a jury of 50 personas (filtered to a specific demographic) votes in a round-robin tournament
3. win rate = reward signal for GRPO
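a minimal sketch of that reward function (`persona_vote` is a stand-in for the simulation engine from yesterday, which does an LLM call per persona):

```python
from itertools import combinations

def jury_rewards(tweets, jury, persona_vote):
    """Round-robin: every pair of generated tweets is judged by every persona.
    Each tweet's reward is its win rate across all of its matchups."""
    wins = [0] * len(tweets)
    matches = [0] * len(tweets)
    for (i, a), (j, b) in combinations(enumerate(tweets), 2):
        for persona in jury:
            winner = persona_vote(persona, a, b)  # returns 0 if the persona prefers a, 1 for b
            wins[i if winner == 0 else j] += 1
            matches[i] += 1
            matches[j] += 1
    return [w / m for w, m in zip(wins, matches)]  # one GRPO reward per completion in the group
```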
for this run, i set the target demographic to "Young Professionals (18-29) in Coastal Cities (NY, CA)".
the result? you can watch the model learn to optimize its messaging for that demographic
it started losing to GPT-4.1, but after ~150 steps of GRPO, it learned the specific tone/framing that group likes, hitting a 62% win rate against GPT-4.1 within that demographic
i updated the dashboard from yesterday so you can visualize the training run (video and explanation below)
you can scrub through the training steps and watch the map turn "blue" (meaning our model wins) specifically in the target states
it’s a cool proof of concept for "Demographic Alignment", optimizing models not just for "humans" broadly, but for specific communities - or for using specific demographics as the judges to optimize for
Demo video - you can see the model learn via GRPO to optimize a tweet for a specific audience (NY + CA) - it goes from always losing to GPT-4.1, to always winning
🎄 Advent of Small ML: Day 16 🎄 Topic: ENGRAM (Skill → Cartridge) for Wiki Search (Continual Learning for a multi-turn tool use environment)
huge thank you to @willccbb and @PrimeIntellect for building the wiki environment, verifiers and the environments hub - it makes it super easy to try out all kinds of ideas like this in a controllable, repeatable and measurable way!
Environment:
how the environment works: the LLM is presented with a trivia question that can be answered from a Wikipedia page, plus a corpus of Wikipedia pages (and their embeddings in a ChromaDB database)
the LLM has three tools - search_pages, view_sections, read_section. It has to learn strategies: when to search broadly vs. specifically, how to navigate page structure, and when to stop - so as to best answer its question
the success of the LLM in answering the question is then reviewed using llm-as-a-judge
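roughly, a rollout looks like this (the tool internals, message format, and judge call are stubbed out here - the real versions live in the environment and the verifiers library):

```python
def answer_trivia(llm, question, tools, max_turns=10):
    """Multi-turn tool use: the LLM alternates between tool calls and a final answer."""
    history = [{"role": "user", "content": question}]
    for _ in range(max_turns):
        step = llm(history, tools=["search_pages", "view_sections", "read_section"])
        if step["type"] == "final_answer":
            return step["content"]
        # e.g. step = {"type": "tool", "name": "search_pages", "args": {"query": "..."}}
        result = tools[step["name"]](**step["args"])
        history.append({"role": "tool", "name": step["name"], "content": result})
    return None  # ran out of turns; the answer is then graded by an LLM judge
```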
Method (ENGRAM):
I use the same "Conscious Practice → Muscle Memory" loop:
Phase A (Skill): The agent tries to solve questions. I use the Prime Intellect verifiers library to judge the answers (GPT-4.1). Based on feedback, I then update a text-based "Strategy Guide."
Phase B (Cartridge): Every N steps, i distill that text guide into a compressed Cartridge (KV cache vectors).
Phase C: Reset the guide, keep the cartridge.
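the loop in sketch form (method names are mine, and the cartridge distillation step is heavily simplified - in practice it compresses the guide into KV-cache vectors):

```python
def engram_loop(agent, questions, judge, n_distill=20):
    strategy_guide = ""   # Phase A's scratchpad, in plain text
    cartridge = None      # Phase B's compressed KV-cache artifact

    for step, q in enumerate(questions, start=1):
        # Phase A (Skill): attempt the question with guide + cartridge as context
        answer = agent.solve(q, guide=strategy_guide, cartridge=cartridge)
        feedback = judge(q, answer)  # LLM-as-a-judge verdict + critique

        # Update the text-based strategy guide from the feedback
        strategy_guide = agent.revise_guide(strategy_guide, q, answer, feedback)

        # Phase B (Cartridge): every N steps, distill the guide into KV vectors
        if step % n_distill == 0:
            cartridge = agent.distill_to_kv_cache(strategy_guide)
            # Phase C: reset the guide, keep the cartridge
            strategy_guide = ""

    return cartridge
```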
Results:
On a small test set, the model started at 20% accuracy (it didn't know how to use the tools effectively). After the skill refinement and cartridge distillation loop, it peaked at 40% accuracy (full results below)
definitely a small test - but it successfully encoded "search strategies" into a compressed vector format that persists without fine-tuning.
🎄 Advent of Small ML: Day 7 🎄 Topic: Entropy-Based Rewards (Forcing the model to "keep its options open")
there’s a fascinating recent paper (Layer by Layer: Uncovering Hidden Representations in Language Models - arxiv.org/abs/2502.02013 - shown to me by @aditjain1980) showing that reasoning models tend to have higher entropy in their middle layers
basically, instead of collapsing to an answer early, they keep more possibilities "alive" in their hidden states while thinking.
it made me think - if high entropy correlates with better reasoning, can we force the model to reason better by explicitly rewarding high entropy?
so I added a Matrix-based Entropy reward (Rényi entropy on eigenvalues) to GRPO training on the MATH500 dataset, rewarding the entropy across the middle 10 layers of Qwen 2.5 7B
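the entropy term itself, roughly (one completion's hidden states per layer; α, the layer range, and the exact normalization are illustrative rather than my training code):

```python
import torch

def matrix_renyi_entropy(hidden_states: torch.Tensor, alpha: float = 2.0) -> torch.Tensor:
    """Matrix-based Renyi entropy of one layer's hidden states, shape (tokens, dim)."""
    h = torch.nn.functional.normalize(hidden_states.float(), dim=-1)
    gram = h @ h.T                      # token-token similarity (Gram) matrix
    gram = gram / gram.trace()          # normalize so the eigenvalues sum to 1
    eigvals = torch.linalg.eigvalsh(gram).clamp(min=1e-12)
    return (1.0 / (1.0 - alpha)) * torch.log(torch.sum(eigvals ** alpha))

def entropy_score(layer_hidden_states, mid_layers=range(10, 20), alpha=2.0):
    """Average entropy over the middle layers for one completion.
    layer_hidden_states[l] is the (tokens, dim) tensor at layer l."""
    return torch.stack(
        [matrix_renyi_entropy(layer_hidden_states[l], alpha) for l in mid_layers]
    ).mean()
```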
the initial results were mixed.
when I just rewarded entropy, the model definitely increased its entropy... but it didn't get better at math. It just learned to be "confused" and exploratory without actually converging on answers.
It produced some pretty funny outputs, going on weird tangents and "overthinking" simple problems (examples below)
But then I changed the rewarding rule: Only reward high entropy if the final answer is CORRECT.
this worked (sort of) - it gave a 2.5% performance boost over the baseline.
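as a reward rule, the fix is just a gate (the scale factor is illustrative):

```python
def gated_entropy_reward(is_correct: bool, entropy: float, scale: float = 0.1) -> float:
    """Only pay out the entropy bonus when the final answer is right."""
    return float(is_correct) * (1.0 + scale * entropy)
```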
this is a proof of concept that we can use RL to shape the internal dynamics of how a model thinks, not just its final output tokens.
🎄 Advent of Small ML: Day 3 Topic: Adversarial Unsupervised GRPO (Automated Red Teaming) 🎄
yesterday, I showed how to train a vlm without labels using a cyclegan-ish loop. today I wanted to expand on that and make it harder/better
instead of training on random images, can we have an active adversary that hunts for the model's blind spots?
the hypothesis: if we train the model against an adversary that generates "hard" images, the model should become more robust and generalize better than just seeing random data.
the experiment: I set up a competitive game (gan-style) between two models:
the base model: tries to describe images so they can be recreated (reward = high cosine similarity) (same as yesterday)
the adversary: tries to generate prompts for images that the base model fails to describe well (reward = low cosine similarity).
basically, the adversary acts as an automated red team, constantly searching for the base model's weaknesses.
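the two reward functions are mirror images of each other (a sketch - the `regenerate`, `embed`, and `cos` callables stand in for the recreate-and-compare pipeline from yesterday):

```python
def base_model_reward(original_image, description, regenerate, embed, cos):
    """Base model: describe so the image can be recreated -> high similarity is good."""
    recreated = regenerate(description)               # e.g. an image generator on the description
    return cos(embed(original_image), embed(recreated))

def adversary_reward(adversary_prompt, base_model, regenerate, embed, cos):
    """Adversary: propose image prompts the base model will describe badly."""
    hard_image = regenerate(adversary_prompt)          # generate the candidate "hard" image
    description = base_model.describe(hard_image)      # base model tries to describe it
    recreated = regenerate(description)
    return 1.0 - cos(embed(hard_image), embed(recreated))   # low similarity is good for the adversary
```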
it actually beat the non-adversarial baseline from yesterday in the early stages, though they eventually converged to similar levels.
🎄 Advent of Small ML: Day 2 🎄 Topic: Teaching a VLM to reason about charts with Unsupervised GRPO
a big use case for VLMs is parsing chart data for Q&A. CharXiv from @zwcolin is a great recent benchmark for this, but I had a question: Can we do this in an unsupervised way?
If we don't need labeled Q/A pairs for every chart, we can leverage data much more cheaply.
The inspiration came from CycleGAN and the idea of using a numerical loss as a proxy for how "good" the text generated by the VLM actually is. (Big inspo here is @rosmine_b’s SVG work - go check it out).
The Experiment: I set up a loop to treat the VLM like an autoencoder:
1. Take a chart image.
2. Prompt the VLM to describe it.
3. Feed that description into an image generator (Flux Schnell).
4. Measure the cosine similarity between the regenerated image and the original (using DINO)
This similarity score becomes the reward signal for GRPO. The logic: to accurately recreate the image, the model must extract the most salient features in its description.
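A sketch of that reward (the generation and embedding calls are stand-ins - I used Flux Schnell and DINOv2, but the exact client code is omitted here):

```python
import torch.nn.functional as F

def autoencoder_reward(chart_image, vlm_describe, generate_image, dino_embed):
    """VLM-as-autoencoder: describe -> regenerate -> compare embeddings."""
    description = vlm_describe(chart_image)      # the VLM's description of the chart
    recreated = generate_image(description)      # Flux Schnell conditioned on that description
    e_orig = dino_embed(chart_image)             # DINOv2 embedding of the original
    e_new = dino_embed(recreated)                # DINOv2 embedding of the recreation
    # Cosine similarity in DINO space becomes the GRPO reward
    return F.cosine_similarity(e_orig.unsqueeze(0), e_new.unsqueeze(0)).item()
```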
The method: I used Qwen 2.5 3B and DINOv2 for the embeddings (to capture semantic info, not just pixels).
Results for the Proxy Task: The model consistently improved its cosine similarity scores.
Results for Transfer Learning : Despite seeing zero labeled questions during training, this transferred to CharXiv reasoning questions, showing an ~7% improvement in pass@1 at the peak.
It’s a small experiment with a small model, but I think the result is really cool: the model got better at reasoning without seeing a single reasoning label.
I’m really interested in exploring more of these CycleGAN-esque / "LLM as Autoencoder" domains to escape the need for labeled data.
Results: for the evaluation set, tracking the cosine similarity between the regenerated image (from the VLM description sent to flux-schnell) and the original - it is definitely learning!
just pushed my first multi-turn RL environment to @PrimeIntellect
the setup: the model gets the story title + question from QuALITY (long stories, multiple-choice questions).
its only tool: agentic RAG search over the story.
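the tool is basically chunked embedding search over one story - a hedged sketch (chunking and embedding choices here are illustrative, not the environment code):

```python
import numpy as np

def build_search_tool(story_text, embed, chunk_size=200):
    """Split the story into chunks and return a search(query, k) tool over them."""
    words = story_text.split()
    chunks = [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]
    chunk_embs = np.stack([embed(c) for c in chunks])
    chunk_embs /= np.linalg.norm(chunk_embs, axis=1, keepdims=True)

    def search(query: str, k: int = 3) -> list[str]:
        q = embed(query)
        q = q / np.linalg.norm(q)
        scores = chunk_embs @ q
        top = np.argsort(scores)[::-1][:k]
        return [chunks[i] for i in top]

    return search
```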
this is an idea I have been toying with for a while but didn’t get around to doing. I had a paper last year about a twist on a RAG method and primarily experimented on this dataset.
i really like this dataset; it's sort of harder-to-read short stories, and the questions really require (imo) a good and subtle understanding of the story.