Brian Scanlan Profile picture
Mar 17 24 tweets 4 min read Read on X
We've been building an internal Claude Code plugin system at Intercom with 13 plugins, 100+ skills, and hooks that turn Claude into a full-stack engineering platform. Lots done, more to do. Here's a thread of some highlights.
The wildest one: we gave Claude a read-only Rails production console via MCP. Claude can now execute arbitrary Ruby against production data - feature flag checks, business logic validation, cache state inspection etc.
Safety gates: read-replica only, blocked critical tables, mandatory model verification before every query, Okta auth, DynamoDB audit trail. I launched it by saying "It is either the worst thing in the world that will ruin Intercom, or complete genius."
It is used a lot. No issues so far. Last time I looked the top-5 users weren't engineers - design managers, customer support engineers, product management leaders were all actively using it!
The console is part of a broader Admin Tools MCP that gives Claude the same production visibility engineers have: Customer/feature flag/admin lookups etc. A skill-level gate blocks all these tools until Claude loads the safety reference docs first. No cowboy queries.
We instrumented every Claude Code lifecycle event with OpenTelemetry. SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, PermissionRequest, SubagentStart... 14 event types flowing to Honeycomb. Privacy-first: we explicitly never capture user prompts, messages, or tool input
Session transcripts sync to S3 (with username SHA256-hashed for privacy). We can analyze how people actually use Claude at scale. On SessionEnd, a hook analyzes the entire session transcript with Claude Haiku looking for improvement opportunities.
It auto-classifies gaps (missing_skill, missing_tool, repeated_failure, wrong_info) and posts to Slack with a pre-filled GitHub issue URL. This creates a feedback loop: real sessions -> detected gaps -> GitHub issues -> new skills -> better sessions.
Our flaky test fixer is a 9-step forensic investigation workflow with a 20-category taxonomy of flakiness patterns. Hard rules:
- NEVER skip a spec as a "fix"
- NEVER guess root cause without CI error data
- Downloads failure data from S3, classifies against the taxonomy
Sweeps for "sibling" instances of the same antipattern. Fixes common patterns widely. This matters a lot when you've got hundreds of thousands of tests.
Claude Code hooks enforce our PR workflow at the shell level and blocks it unless the create-pr skill was activated first
1. A PreToolUse hook intercepts raw gh pr create
2. The skill extracts business INTENT before creating - asks "why?" not just "what changed?"
3. Another hook blocks ALL modifications to merged PR branches (push, commit, rebase, edit)
4. After PR creation, a background agent auto-monitors CI checks using ETag-based polling (zero rate-limit cost)
After 5 permission prompts in a session, a hook suggests running the permissions analyzer. It scans your last 14 days of session transcripts, extracts every Bash command approved, and classifies them:
- GREEN: ls, grep, test runners
- YELLOW: git, mkdir
- RED rm, sudo, curl etc.
Then writes the safe ones to your settings.json. Evidence-based, not prescriptive. We also maintain good defaults!
Tool misses - A PostToolUse hook detects "command not found" errors and BSD/GNU incompatibilities in real-time. Spots things like grep -P failing on MacOS. Once per session, suggests the fix. Installs via Homebrew and updates CLAUDE.md so Claude knows the tool exists in future.
Video transcript skill: feed it a Google Meet recording, get a markdown transcript with intelligently-placed inline screenshots at moments where the speaker says "as you can see" or "look at this."
QA follow-up skill: takes QA session documents through a 7-stage pipeline that identifies issues, investigates the codebase, filters for quality, and creates GitHub issues to track. Far easier QA!
Our data team built a Claude4Data platform with 30+ analytics skills - Snowflake queries, Gong call analysis, finance metrics, customer health reports. Sales reps, PMs, and data scientists all use it. "Friends at other tech companies are nowhere near this level of sophistication"
We automatically ship our marketplace and keep it up to date on our Macs using JAMF. We run reports on skill creation and usage, and keep an eye on quality. The most used skills have high quality evals and are reviewed regularly.
The wild thing is we're just getting started. All technical work and our entire SDLC is getting skill-ified. Remote agents will accelerate things even more.
Ok, more stuff.

A weekly Github Action job that fact checks and updates all CLAUDE.md and files referenced. Needs to go further and continually learn

Code Review agents with manners that only posts important feedback.

LSP servers for all main runtimes, speeding up code search
We started ingesting more of our logs into Snowflake and have a very well tuned skill (excellent evals etc.) that is very precise at finding the right logs for incidents and general troubleshooting. Works well with trace data in Honeycomb and infrastructure metrics in Datadog.
Local development environment setup and troubleshooting. Very necessary as more non-engineers are using developer environments these days!
LOADS of incident/troubleshooting investigation skills. They're starting to converge, using progressive disclosure in a solid core skill that can figure out what to do for specific issues. We have a goal to make all runbooks follow-able by Claude in the next 6 weeks.

• • •

Missing some Tweet in this thread? You can try to force a refresh
 

Keep Current with Brian Scanlan

Brian Scanlan Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

PDF

Twitter may remove this content at anytime! Save it as PDF for later use!

Try unrolling a thread yourself!

how to unroll video
  1. Follow @ThreadReaderApp to mention us!

  2. From a Twitter thread mention us with a keyword "unroll"
@threadreaderapp unroll

Practice here first or read more on our help page!

More from @brian_scanlan

Jun 18, 2021
Ok! Here's a thread about one of Intercom's biggest outages 🧵
Many, many years ago I was on my way across Dublin after work on a Friday evening in mid-January. I was out to celebrate my birthday with some friends in Token: tokendublin.ie
On the way over (on the Luas?) I saw a pull request from @patrickod adding some new subnets to our production VPC. I knew the background the work work here, and even though we like to keep our network simple I was like "ok", read the terraform plan and approved it.
Read 23 tweets
Jun 17, 2021
Crap but wonderful videos of my fave bands 🧵
- Two of Lush's best songs.
- a delicate version of possibly the best song ever.
Read 6 tweets

Did Thread Reader help you today?

Support us! We are indie developers!


This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Don't want to be a Premium member but still want to support us?

Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal

Or Donate anonymously using crypto!

Ethereum

0xfe58350B80634f60Fa6Dc149a72b4DFbc17D341E copy

Bitcoin

3ATGMxNzCUFzxpMCHL5sWSt4DVtS8UqXpi copy

Thank you for your support!

Follow Us!

:(