Luca Beurer-Kellner Profile picture
working on secure agentic AI @invariantlabsai PhD @the_sri_lab, ETH Zürich. Prev: @lmqllang and @projectlve.
Feb 6 9 tweets 4 min read
(1/n) We analyzed 3,984 agent skills from major marketplaces and found 76 malicious payloads, including credential theft, backdoor installation, and data exfiltration.

Also, 13.4% contain at least on critical-level vuln.

Full report below, highlights in thread 👇

github.com/invariantlabs-…Image The skills ecosystem is exploding, monitoring uploads, we find thousands of skills have been published in only the last few days.

Not all of them seem to do what they claim.

We reviewed thousands, and compiled a taxonomy of current risk patterns. Image
May 26, 2025 7 tweets 3 min read
😈 BEWARE: Claude 4 + GitHub MCP will leak your private GitHub repositories, no questions asked.

We discovered a new attack on agents using GitHub’s official MCP server, which can be exploited by attackers to access your private repositories.

creds to @marco_milanta

(1/n) 👇 Image GitHub MCP gives your Claude (or other agent) full access to your GitHub repository.

This means all repos (private or public), and the ability to check issues, create PRs, etc.

Now, if an attacker places a malicious issue in one of your repos, they can easily hijack your agent and go wild with that.Image
Apr 17, 2025 7 tweets 3 min read
🚀Introducing Guardrails, our security layer for agents and MCP-powered AI apps.

Think of Guardrails as a deterministic constraining layer within your LLM and MCP servers.

It enforces precise guardrails on your system, all while entirely transparent to your agent.

(1/n) 👇 Image All you need to get started, are a few lines of code to put our LLM and MCP proxies in the loop (local deployment possible).

That's all it takes.

You instantly get: tool call constraints, data flow checking, MCP guard-railing, full audit logs in Explorer and content filtering. Image
Image
Apr 7, 2025 10 tweets 3 min read
🔴 New MCP attack leaks WhatsApp messages via MCP, side-stepping WhatsApp security. 1/n

We show a new MCP attack that leaks your WhatsApp messages if you are connected via WhatsApp MCP.

Our attack uses a sleeper design, circumventing the need for user approval.

More 👇 Image Blog:

If you want to stay up to date regarding MCP and agent security more generally, follow me and @InvariantLabsAI.

Now, let' s get into the attack.invariantlabs.ai/blog/whatsapp-…
Apr 1, 2025 7 tweets 3 min read
👿 MCP is all fun, until you add this one malicious MCP server and forget about it.

We have discovered a critical flaw in the widely-used Model Context Protocol (MCP) that enables a new form of LLM attack we term 'Tool Poisoning'.

Leaks SSH key, API keys, etc.

Details below 👇 Image When an MCP server is added to an agent like @cursor_ai, Claude or the @OpenAI Agents SDK, its tool's descriptions included in the context of the agent.

This opens the doors wide open for a novel type of indirect prompt injection, we coin tool poisoning. Image