Akshay 🚀 Profile picture
Jul 20 13 tweets 4 min read Read on X
MCP security is completely broken!

Let's understand tool poisoning attacks and how to defend against them:
MCP allows AI agents to connect with external tools and data sources through a plugin-like architecture.

It's rapidly taking over the AI agent landscape with millions of requests processed daily.

But there's a serious problem... 👇
1️⃣ What is a Tool Poisoning Attack (TPA)?

When Malicious instructions are hidden within MCP tool descriptions that are:

❌ Invisible to users
✅ Visible to AI models

These instructions trick AI models into unauthorized actions, unnoticed by users.
Here's how the attack works:

AI models see complete tool descriptions (including malicious instructions), while users only see simplified versions in their UI.

First take a look at this malicious tool: Image
Let me quickly show the attack in action by connecting this server to my cursor IDE.

Check this out👇
Now let's understand a few other ways these attacks can happen and then we'll also talk about solutions...👇
2️⃣ Tool hijacking Attacks:

When multiple MCP servers are connected to same client, a malicious server can poison tool descriptions to hijack behavior of TRUSTED servers.

Here's an example of an email sending server hijacked by another server:
Take a look at these two MCP servers before we actually use them to demonstrate tool hijacking.

`add()` tool in the second server secretly tries to hijack the operation of send email tool in the first server. Image
Let's see tool hijacking attack in action, again by connecting the above two servers to my cursor IDE!

Check this out👇
3️⃣ MCP Rug Pulls ⚠️

Even worse - malicious servers can change tool descriptions AFTER users have approved them.

Think of it like a trusted app suddenly becoming malware after installation.
This makes the attack even more dangerous and harder to detect.
🛡️Mitigation Strategies:

- Display full tool descriptions in the UI
- Pin (lock) server versions
- Isolate servers from one another
- Add guardrails to block risky actions

Until security issues are fixed, use EXTREME caution with.
Finally, here's a summary of how MCP works and how these attacks can occur. This visual explains it all.

I hope you enjoyed today's post. Stay tuned for more! 🙌
If you found it insightful, reshare with your network.

Find me → @akshay_pachaar ✔️
For more insights and tutorials on LLMs, AI Agents, and Machine Learning!

• • •

Missing some Tweet in this thread? You can try to force a refresh
 

Keep Current with Akshay 🚀

Akshay 🚀 Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

PDF

Twitter may remove this content at anytime! Save it as PDF for later use!

Try unrolling a thread yourself!

how to unroll video
  1. Follow @ThreadReaderApp to mention us!

  2. From a Twitter thread mention us with a keyword "unroll"
@threadreaderapp unroll

Practice here first or read more on our help page!

More from @akshay_pachaar

Aug 30
A new embedding model cuts vector DB costs by ~200x.

It also outperforms OpenAI and Cohere models.

Let's understand how you can use it in LLM apps (with code):
Today, we'll use the voyage-context-3 embedding model by @VoyageAI to do RAG over audio data.

We'll also use:
- @MongoDB Atlas Vector Search as vector DB
- @AssemblyAI for transcription
- @llama_index for orchestration
- gpt-oss as the LLM

Let's begin!
For context...

voyage-context-3 is a contextualized chunk embedding model that produces chunk embeddings with full document context.

This is unlike common chunk embedding models that embed chunks independently.

(We'll discuss the results later in the thread)

Check this👇
Read 14 tweets
Aug 29
I have been training neural networks for 10 years now.

Here are 16 ways I actively use to optimize model training:
Before we dive in, the following visual covers what we are discussing today.

Let's understand them in detail below!
These are some basic techniques:

1) Use efficient optimizers—AdamW, Adam, etc.

2) Utilize hardware accelerators (GPUs/TPUs).

3) Max out the batch size.

4) Use multi-GPU training through Model/Data/Pipeline/Tensor parallelism. Check the visual👇
Read 11 tweets
Aug 26
I boosted my AI Agent's performance by 184%

Using a fully open-source, automatic technique

Here's a breakdown (with code):
Top AI Engineers never do manual prompt engineering.

Today, I'll show you how to automatically find the best prompts for any agentic workflow you're building.

We'll use @Cometml's 100% open-source Opik to do so.

Let's go! 🚀 Image
The idea is simple yet powerful:

1. Start with an initial prompt & eval dataset
2. Let the optimizer iteratively improve the prompt
3. Get the optimal prompt automatically! ✨

Now let's dive into the code for this!
Read 13 tweets
Aug 24
After MCP, A2A, & AG-UI, there's another Agent protocol.

It's fully open-source and launched by IBM Research.

Here's a complete breakdown (with code): Image
ACP is a standardized, RESTful interface for Agents to discover and coordinate with other Agents, regardless of their framework.

Just like A2A, it lets Agents communicate with Agents. There are some differences, which we shall discuss later.

Let's dive into the code first!
Here's how it works:

- Build the Agents and host them on ACP servers.
- The ACP server receives requests from the ACP Client and forwards them to the Agent.
- ACP Client itself can be an Agent to intelligently route requests to the Agents (like MCP Client does).

Check this 👇
Read 12 tweets
Aug 22
Let's build an MCP server (100% local):
Before diving in, here's what we'll be doing today:

- Understand MCP with a simple analogy.
- Build a 100% local and secure MCP client using @mcpuse
- Integrate the client with @Stagehanddev MCP sever
- Use this setup for control and automate browser

Let's go! 🚀
First, let's understand MCP using a translation analogy.

Imagine you only know English. To get info from a person who only knows:

- French, you must learn French.
- German, you must learn German.
- and so on.

Learning even 5 languages will be a nightmare for you!
Read 14 tweets
Aug 21
A simple technique makes RAG up to 40x faster & 32x memory efficient!

- Perplexity uses it in its search index
- Google uses it in Vertex RAG engine
- Azure uses it in its search pipeline

Let's understand how to use it in a RAG system (with code):
Today, we're building a multi-agent legal assistant that can query 50M+ vectors in <30ms using Binary Quantization (BQ).

Tech stack:

- @milvusio to self-host vectorDB with BQ
- @firecrawl_dev for web search
- @crewAIInc for orchestration
- @ollama to serve GPT-OSS

Let's go! 🚀
First things first: What exactly is binary quantization❓

In this video, I answer this question and provide a really nice analogy to explain why BQ works and how it makes your setup fast and memory efficient.

Check this out👇
Read 15 tweets

Did Thread Reader help you today?

Support us! We are indie developers!


This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Don't want to be a Premium member but still want to support us?

Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal

Or Donate anonymously using crypto!

Ethereum

0xfe58350B80634f60Fa6Dc149a72b4DFbc17D341E copy

Bitcoin

3ATGMxNzCUFzxpMCHL5sWSt4DVtS8UqXpi copy

Thank you for your support!

Follow Us!

:(