Cognition

@cognition

Jul 1 • 6 tweets • 3 min read • Read on X

Introducing Devin Security Swarm

A more cost-effective and accurate way to find security vulnerabilities in complex codebases, based on a new architecture: Agentic MapReduce.

In testing, Devin Security Swarm found 36 of 50 real-world GHSA vulnerabilities at 30% lower cost per finding than the next most accurate alternative.
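
For readers who want to work with these numbers, here is a minimal sketch of the two metrics involved. The 36 and 50 come from the result above; the function names and the dollar amounts in the usage note are hypothetical, since the thread does not publish absolute costs.

```python
def detection_rate(found: int, total: int) -> float:
    """Fraction of known vulnerabilities that a tool reported."""
    if total <= 0:
        raise ValueError("total must be positive")
    return found / total


def cost_per_finding(total_cost: float, findings: int) -> float:
    """Total spend divided by the number of vulnerabilities found."""
    if findings <= 0:
        raise ValueError("findings must be positive")
    return total_cost / findings


def relative_saving(ours: float, baseline: float) -> float:
    """Fractional reduction of `ours` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 1 - ours / baseline


# 36 of 50 vulnerabilities found, as reported above.
rate = detection_rate(36, 50)
```

Here `rate` works out to 0.72, a 72% detection rate. As an illustration only, a tool at a hypothetical $70 per finding against an alternative at a hypothetical $100 per finding would give `relative_saving(70.0, 100.0)` of about 0.30, which is the shape of the "30% lower cost per finding" claim.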

We built a new architecture for whole-codebase reasoning that we’re calling Agentic MapReduce.

Security scanning is different from most coding tasks: a report is only trustworthy if the whole codebase is considered. But most agentic systems struggle to scale reasoning across large repos.

Devin maps relevant signals across the repo, fans out focused agents over bounded shards, reduces their findings into one report, then verifies serious vulnerabilities in isolated sandboxes before marking them confirmed.
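
The shape of that pipeline can be illustrated with a minimal sketch. This is not Cognition's implementation: the `Finding` type, the `agent` and `verifier` callables, and the fixed-size sharding are all assumptions made for illustration. The signal-mapping step is reduced here to sorting and chunking a file list, and the shards run one after another rather than in parallel.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Finding:
    path: str
    issue: str
    severity: str            # hypothetical scale: "low" or "high"
    confirmed: bool = False


# Hypothetical interfaces: an agent inspects one shard of paths,
# a verifier tries to reproduce a single finding in isolation.
Agent = Callable[[List[str]], List[Finding]]
Verifier = Callable[[Finding], bool]


def make_shards(paths: Iterable[str], shard_size: int) -> List[List[str]]:
    """Split the file list into bounded shards of at most shard_size paths."""
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    ordered = sorted(paths)
    return [ordered[i:i + shard_size] for i in range(0, len(ordered), shard_size)]


def scan(paths: Iterable[str], shard_size: int,
         agent: Agent, verifier: Verifier) -> List[Finding]:
    # Map: run one focused agent per shard and collect raw findings.
    raw: List[Finding] = []
    for shard in make_shards(paths, shard_size):
        raw.extend(agent(shard))

    # Reduce: merge duplicates so each (path, issue) pair appears once.
    merged: Dict[Tuple[str, str], Finding] = {}
    for finding in raw:
        merged.setdefault((finding.path, finding.issue), finding)

    # Verify: only serious findings are checked, and only those that
    # pass the check are marked confirmed.
    report: List[Finding] = []
    for finding in merged.values():
        if finding.severity == "high" and verifier(finding):
            finding = replace(finding, confirmed=True)
        report.append(finding)
    return report
```

A caller would supply its own `agent` (for example, a function that prompts a model with the shard's contents) and a `verifier` that attempts to reproduce the issue in a sandbox. Keeping both as plain callables makes the map, reduce, and verify stages easy to test separately.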

The result is simultaneously more efficient and more accurate than other tools. We evaluated a variety of security scanning tools on a dataset of 50 GHSA vulnerabilities across 14 languages including Go, Rust, Python, Ruby, Java, C#, JavaScript, C, Swift, Dart, and Elixir. The dataset spans open source repos of various sizes and many software categories.

Beyond excelling on our eval, Devin Security Swarm also found critical vulnerabilities that other tools missed, like a PHP sandbox bypass via template injection, an argument injection through metadata value parsing, and an overly broad deserialization surface.

Security Swarm is a new pillar of Devin for Security: a suite of tools to help you find vulnerabilities, validate their exploitability at runtime, and ship remediation PRs.

Learn more and try it today at:

devin.ai/security

We’re also publishing extensive documentation and technical materials about Agentic MapReduce, including a deep-dive on our evals.

Read our announcement: cognition.com/blog/introduci…

Learn about Agentic MapReduce: devin.ai/blog/agentic-m…

Check out the evals: devin.ai/blog/security-…

More from @cognition

Cognition

@cognition

Jun 8

Introducing FrontierCode: a coding eval that raises the bar for difficulty & quality. Each task took 40+ hrs of work by leading open-source maintainers.

Models write sloppy code that works but isn’t maintainable. Our eval is first to measure: would you actually merge this code?

20+ world-class open-source developers built realistic coding tasks on repos they maintain. They define what “mergeable” means in their repo.

What does it take to measure mergeability? We use a mix of unit tests, rubrics and novel verifiers to assess correctness, test quality, scope discipline, style, and adherence to codebase standards.

FrontierCode was built in close partnership with the expert maintainers of 36 flagship open-source repositories, like @smilingnosrati, CEO & Tech Lead @CeleryOrg (29k stars), and Martin McKeaveney, CTO of @Budibase (28k stars).

Maintainers invested more than 40 hours per task, undergoing multiple rounds of iteration to ensure that any PR that satisfies these standards would actually be merged.

Cognition

@cognition

Jan 21

Meet Devin Review: a reimagined interface for understanding complex PRs.

Code review tools today don’t actually make it easier to read code. Devin Review builds your comprehension and helps you stop slop.

Try without an account: devinreview.com

More below 👇

Full breakdown: cognition.ai/blog/devin-rev…

First, instead of presenting diffs alphabetically and file-by-file, Devin Review groups related changes together and orders them logically. Each group comes with a clear description of what’s going on. Devin Review also intelligently detects copied and moved code, separating signal from noise.

Devin Review includes a bug catching agent that labels potential issues by confidence and severity. It will also flag decisions / patterns that could be bad, even if they aren’t bugs, helping you stop slop.

Red: pay attention. Orange: take a look. Gray: FYI

Cognition

@cognition

May 6, 2025

Our research interns present:
Kevin-32B = K(ernel D)evin

It's the first open model trained using RL for writing CUDA kernels. We implemented multi-turn RL using GRPO (based on QwQ-32B) on the KernelBench dataset.

It outperforms top reasoning models (o3 & o4-mini)! 🧵

We train on a subset of 180 PyTorch -> CUDA conversion tasks from KernelBench. It's a nice RL environment because we have immediate code execution feedback.

During training, we give the model 4 refinement steps. In each step, the model proposes a kernel. Then we evaluate correctness & performance and inject the environment feedback in the next step.
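
The rollout structure described above can be sketched as a simple loop. This shows only how feedback flows between refinement steps, not the GRPO update itself. The `propose` and `evaluate` callables, the `Attempt` record, and the feedback string format are hypothetical stand-ins for the model and the KernelBench execution harness.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    kernel: str      # source code of the proposed kernel
    correct: bool    # did it match the reference output?
    speedup: float   # reference runtime divided by kernel runtime


def refine(
    propose: Callable[[str, Optional[str]], str],
    evaluate: Callable[[str], Tuple[bool, float]],
    task: str,
    steps: int = 4,
) -> List[Attempt]:
    """Run `steps` refinement turns, feeding each result into the next turn."""
    feedback: Optional[str] = None   # the first turn sees no feedback
    history: List[Attempt] = []
    for _ in range(steps):
        kernel = propose(task, feedback)
        correct, speedup = evaluate(kernel)
        history.append(Attempt(kernel, correct, speedup))
        # Hypothetical feedback format injected into the next turn.
        feedback = f"correct={correct}, speedup={speedup:.2f}x"
    return history
```

In a training setup, each `Attempt` would also carry a reward derived from correctness and speedup. How those rewards are assigned across turns is covered in the blog post rather than in this sketch.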

For more details on how we made GRPO work in a multi-turn setting read our blogpost (linked below)!

We ablate two different ways of training:
- Single-turn RL (training on just the first step)
- Multi-turn RL (training on four refinement steps)

When evaluated on performance (= speedup of CUDA kernels over PyTorch) we see a significant improvement from multi-turn training.

The model learns how to refine itself more effectively!

(All models are evaluated on 4 & 8 refinement steps, i.e. same amount of compute)

Cognition

@cognition

Apr 25, 2025

Project DeepWiki

Up-to-date documentation you can talk to, for every repo in the world.

Think Deep Research for GitHub – powered by Devin.

It’s free for open-source, no sign-up!
Visit deepwiki.com or just swap github → deepwiki on any repo URL:
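
The URL swap can be done by hand, or with a small helper like the sketch below. The function name `to_deepwiki` and the `owner/repo` path in the usage note are hypothetical. The only assumption taken from the text is that the `github.com` host is replaced with `deepwiki.com` while the rest of the URL stays the same.

```python
from urllib.parse import urlsplit, urlunsplit


def to_deepwiki(url: str) -> str:
    """Rewrite a github.com repo URL to the matching deepwiki.com URL."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {url}")
    # Keep scheme, path, query, and fragment; swap only the host.
    return urlunsplit(
        (parts.scheme, "deepwiki.com", parts.path, parts.query, parts.fragment)
    )
```

For example, `to_deepwiki("https://github.com/owner/repo")` returns `"https://deepwiki.com/owner/repo"`.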

Go to deepwiki.com to explore wikis for the most popular open source repos.

Turn on Deep Research for agent-powered in-depth answers (vid sped up).

Don't see your repo? We're happy to index any public GitHub repo for you (watch how).

To get wikis for private repos, sign up for a Devin account at devin.ai.

Cognition

@cognition

Dec 11, 2024

Yesterday was Devin’s first day at work! Check out how engineering teams are building with Devin so far.

https://x.com/rahulchhabra07/status/1866593820466614314

https://x.com/seidtweets/status/1866947729248715188

Cognition

@cognition

Dec 10, 2024

Devin is generally available today!

Just tag Devin to fix frontend bugs, create first-draft PRs for backlog tasks, make refactors, and more.

Start building with Devin below:

1/5 Devin is built to collaborate with engineering teams and starts at $500/month. Here’s how some of the best teams are using Devin today:

2/5 We worked with Devin to contribute to popular open source repos. Here is one example of a Devin session that triages, solves, and tests a fix for an issue in Anthropic’s MCP: app.devin.ai/sessions/26695…

The merged PR is here: github.com/modelcontextpr…

We’re sharing this session, and several other open source contributions, in our blog below.
