Logan Graham · Sep 29 · 11 tweets
Something you may not know about Sonnet 4.5: it’s a special model for cybersecurity.

For the past few months, the Frontier Red Team has been researching how to make models more useful for defenders.

We now think we’re at an inflection point. New post on Red:
Sonnet 4.5 is very good on all our cyber tasks. Using some simple approaches, we more than doubled SOTA on some benchmarks.
We figured 4.5 would be good at code + agentic stuff. So over the past few months, part of my team has been researching how to make models more useful for defenders. @DARPA's AI Cyber Challenge, and work by companies like @Google and @Xbow, inspired us.

red.anthropic.com/2025/ai-for-cy…
The results from our research:
a/ Sonnet 4.5 is better at cybersecurity tasks
b/ models are already better than people think
c/ I think they could get better fast from here

The question: because cyber is dual-use, how do we make sure this advantages defenders?
This is a really hard question. The world needs to answer it ASAP.

We focused on making Sonnet 4.5 better at tasks that a lot of defenders do and that aren't clearly offensive. But we need to make models even better defensively, and start deploying defenses...
...because we see risks coming. For example, we caught (and disrupted) threat actors using models for "vibe hacking", and when we entered Claude into cyber competitions, it sometimes beat other competitors.

What happens when LLMs write & review 95% of all code? When they win flagship CTF competitions? (e.g. Blue Water @ LiveCTF 2025?) When they find vulns at scale in all open source, and patch them?

Seems like a new world.

I think this is highly underrated.
We're a team of ML researchers, some with backgrounds as security researchers & operators.

We look for 0-to-1 moments, progress curves, and things that took months of effort a year ago now being solved in one shot. We see them now.
What to do? Industry should:
+ build evals (hyperrealistic, defensive)
+ build tools for defenders & experiment as fast as possible
+ ask questions about what defense in an era of AGI means
More detail on what we did, what we found, and what we think you should do:

red.anthropic.com/2025/ai-for-cy…
You can now subscribe for updates!

p.s. did you know that all of Red was built w/ Claude Code by our non-technical social scientist team member? Only gets better from here.
